10th November 2013

PRESERVING HISTORY FOR THE FUTURE

Brian Grainger

brianATgrainger1.freeserve.co.uk


 

What is the second most important site on the World Wide Web?

I think we can all agree the most important web site is your search engine – otherwise we are limited to only finding those things that web site owners want us to. I suspect what is considered second in the hierarchy will depend on who you ask. In this short article I want to put forward my personal choice and why.

You may think that I would rate the ICPUG web site as second in importance. However, I am not that egocentric. It only serves a specialist section of the community. My choice serves the whole world, or at least the English-speaking parts of it, although we may not use it that often. It certainly isn't one of those sites, such as YouTube, Facebook or Twitter, whose perceived importance comes from high usage numbers or from how much money they can draw when shares are offered. No, my choice provides a function that I consider vital, even if many people may not consider it important.

Have you researched your family tree? Do you like historical dramas on TV? Do you believe we can learn from the mistakes of the past? Personally, I can answer 2 of those questions in the affirmative. If we want to do any of those things we need to have a record of what happened in the past. Genealogists need records of births, marriages and deaths. Historical drama, if it is to have realism, must be based on events that actually happened, and those events must be recorded for dramatists to use in their screenplays. Unless we have records, the mistakes of the past, especially those made by politicians, will be erased from memory. The key, then, is recorded information.

Most of the recorded information that exists today is in the form of writing on paper or its equivalent. In the earliest times the paper may have been cave walls and the writing may have taken the form of drawings, but the principle is the same. We have built edifices of various sorts: libraries for books; churches and registries for birth, marriage and death records; natural history museums for biological records; science museums for scientific endeavour. These all serve as archives for material of a specific kind, to which others in the future can return to research their history.

As we are urged more and more to dispense with paper, what will become of archives for the future? Before the internet, in order to save the space necessary for paper records, we started to use microfilm. Genealogists will be very aware of this when they research records and old newspaper cuttings. Microfilm depends on reader machines to be viewed, and this is a problem. Will the machines be available in the future? When the BBC decided to create an archive for opening at the millennium (year 2000), they used laser disc technology. Unfortunately, they neglected to preserve a reading machine and the technology was long gone by the year 2000.

Today, to preserve anything, it is usually digitised and stored on the currently fashionable medium. Now that the internet is here, much new writing is created digitally and stored on servers throughout the world. However, these servers are not forever. What happens when a company owning some servers ceases to exist? The servers are taken off line and the stored data ceases to exist. One of the early internet communities, where many people wrote stuff, was GeoCities. After a number of years GeoCities was closed down and much of that writing was lost. The same could happen to the ICPUG web site. All the stuff we write here will, someday, be lost. The stuff written in the (paper) magazines will remain as long as someone has copies of those magazines. Here on the internet it can all be lost in a single day.

Now I am not saying that the stuff written here is a great work of art, but who is to say what is important or not and what should be preserved? On November 8th 2013, on the TV programme QI (K series), I heard Stephen Fry telling the viewers that a kilobyte was 1000 bytes, all because the IEC said so. Indeed, he even deducted 10 points from the audience because they said it was 1024 bytes. Readers of this site, in particular of the article 'I've Got a Bigger Gigabyte Than You', will know that the audience was right. A kilobyte was 1024 bytes and the IEC just redefined it. That article has been referenced elsewhere on the web. If the ICPUG servers disappear we might all start believing the IEC and Stephen Fry! For that reason alone I think some of this stuff is worth preserving. I jest, but I hope you get my point.
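For anyone who wants to see how far apart the two kilobytes really are, here is a small sketch in Python. The constant and function names are my own invention for illustration; the only facts used are the two definitions mentioned above, 1000 bytes and 1024 bytes.

```python
# Two readings of "kilo" when counting bytes.
DECIMAL_KILOBYTE = 1000   # the decimal reading: 10**3 bytes
BINARY_KILOBYTE = 1024    # the traditional computing reading: 2**10 bytes


def shortfall_percent(power):
    """Percentage by which the decimal unit is smaller than the binary one.

    power=1 compares kilobytes, power=2 megabytes, power=3 gigabytes.
    """
    decimal_size = DECIMAL_KILOBYTE ** power
    binary_size = BINARY_KILOBYTE ** power
    return (1 - decimal_size / binary_size) * 100


if __name__ == "__main__":
    for name, power in (("kilobyte", 1), ("megabyte", 2), ("gigabyte", 3)):
        print(f"{name}: decimal unit is {shortfall_percent(power):.2f}% smaller")
```

The gap is a little over 2% for a kilobyte and grows with each step up, which is why the difference becomes noticeable at gigabyte sizes.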

My choice of second most important web site is the one that has been set up to preserve the information of today, on the World Wide Web and elsewhere. Where is that web site? It is at http://archive.org.

I first heard, quite a while ago, of web pages being preserved on the 'Wayback' machine. My first thoughts were of scepticism. The web was growing rapidly at millions of pages per year. How could you possibly back up not just one copy but multiple copies of the web so that all page changes are preserved? I'm not sceptical now - I rely on the Wayback machine.

My employer, in the name of security, has restricted access to the web so that virtually no category of web site can be viewed from our desktop PCs unless it has been cleared by the IM department. This means that when I have a problem with Windows, Word or Excel I can no longer view the major sources of information to help solve it. When I wish to seek out some information about certain components, or perhaps techniques of analysis, I am blocked. The most galling message that keeps coming up is 'this site has been blocked categorisation: education'. My employer does not want me to be educated! After a particular Word problem arose, which I really needed solving, and Google suggested a page on a blocked site that might help, I suddenly remembered the Wayback machine. I checked and http://archive.org was not blocked! I checked to see if the Wayback machine had any archives of the page I wanted to view. It did and I viewed them! Problem solved.
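I did my checking by hand in the browser, but the same question, "is there an archived copy of this page?", can also be asked from a program. What follows is only a sketch in Python, assuming the Internet Archive's public availability service at https://archive.org/wayback/available and the JSON reply shape as I understand it. The function names are my own and are not part of any official library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Wayback machine's availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(page_url, timestamp=None):
    """Return the query URL that asks about archived copies of page_url.

    timestamp is an optional YYYYMMDD string; the service is then
    expected to report the capture closest to that date.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Pull the archived page address out of the JSON reply.

    Returns None when the reply lists no available capture.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def find_archived_copy(page_url, timestamp=None):
    """Ask the service over the network for an archived copy of page_url."""
    with urlopen(build_query(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(reply.read().decode("utf-8"))
```

Calling find_archived_copy("example.com") needs a working connection to archive.org and returns either the address of an archived copy or None. The other two functions work without any network access, so they can be tried out even from behind a restrictive filter.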

The technique described above would also work for pages that have disappeared off the web, for whatever reason. If a politician wants to say something one year and then deny it the next and the Wayback machine has got a copy of the first speech, the politician can be exposed as a charlatan.

For those pages that we do not know are important until further down the line, like that article on the kilobyte, we now know that a record may be preserved on the Wayback machine.

Of course, not EVERYTHING on the web is backed up by the Wayback machine. It has to obey the law and it will only preserve pages that allow robots to crawl their contents. Crawls tend to be done at irregular intervals and can be up to 6 months apart so if information appeared and left within the crawl gap it would be lost. Nevertheless, it is amazing what can be found on the Wayback machine.
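A site tells robots what they may crawl through a robots.txt file. As a rough illustration of the rule described above, this Python sketch uses the standard library's robot parser to test whether a given crawler is allowed to fetch a page. The example rules, the example.com addresses and the crawler name "ia_archiver" are assumptions of mine for illustration; I am not claiming this is how the archive's own software makes the decision.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is kept out of /private/,
# every other robot may crawl everything.
EXAMPLE_ROBOTS = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow:
"""


def allows_crawler(robots_txt, user_agent, page_url):
    """Return True if robots_txt lets user_agent fetch page_url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


if __name__ == "__main__":
    for page in ("http://example.com/articles/kilobyte.html",
                 "http://example.com/private/notes.html"):
        print(page, allows_crawler(EXAMPLE_ROBOTS, "ia_archiver", page))
```

With these example rules the article page may be crawled by the named robot but the page under /private/ may not, so only the first would be a candidate for preservation.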

The Wayback machine is run by a charitable organisation operating out of San Francisco. If you want to find out about its aims, and some background on how it backs up the web, I suggest you read its web pages.

In recent times the organisation has extended its remit to preserve e-books, video and most recently classic software!

The book section is the next largest area after the web. It stores out-of-copyright books and various collections the organisation has received. As I see my local library modernise and move further away from storing books, I wonder how today's children are to be inspired, as I was, by the books I found there. I love mathematics and part of that love grew out of the mathematical puzzle books I found in the library. I learned new things while puzzling. Today, if I went into the library I wouldn't find any of those books. However, I recently went on the Internet Archive's book section and found some of them to download as pdf files. This is an amazing resource and, together with the links to Project Gutenberg, which has been digitising books for some time now, there is quite a collection of literature that is preserved.

The specialist section of the community that reads this web site may be interested to know that there are technical manuals on Commodore machines and peripherals in the Internet Archive of books!

The software section has only just started but it is looking interesting. It seeks to preserve those classic programs of yesteryear. It does this in two ways. For some programs it has packaged the program together with an emulator so that it can run on a PC. In this way classics from the Atari, Commodore and Apple era can be preserved. VisiCalc, the first spreadsheet program for the Apple computer, has been preserved this way. You will also find some classic games of the time. The software section also holds classic PC programs, unchanged, as they do not need an emulator.

Another branch of the software section is to preserve copies of operating systems. Linux distros form a major portion of this section and at the moment Puppy Linux is represented in a big way. Not sure why – it seems to be overly represented to me and I am a Puppy Linux fan! Clearly, this section has to grow and clarify its purpose. If some diversity appears in the future this section of the archive can become very important.

I think you can see that the archive is performing a very useful function and perhaps could become a target for those groups that do not want to see records being preserved. Purely by coincidence, on the eve of my writing this article the organisation suffered an extensive fire at its San Francisco HQ. The data on the servers has not been lost but digitising hardware worth over half a million dollars has been destroyed. The tin foil hat brigade have already suggested it is the work of the American security service, the NSA! Whatever the reason, this fire is going to have an impact on the ability of the Internet Archive to do its work and the organisation is currently looking for funds to re-equip its premises. I have found the archive very useful. I consider the purpose of the archive to be essential. I hope it soon gets back up to speed and I look forward to its future ideas for preservation of our legacy.


 

 

 

 

