Archival Milestones

Below are just a few of the milestones I have reached over the years with regard to archiving. There are many more which are not yet listed. As time goes on, I will be adding to this page considerably with more past, present, and future archival efforts.

My archives will eventually be made public, but not at this time due to size and storage issues. They contain millions of files, and the logistics of hosting such data online are quite complex. For the moment I am merely concerned that the data has been salvaged. Furthermore, not all of the information I archive is intended to be re-distributed in its raw form. It is often the archivist's role to reformat preserved materials in order to re-contextualize them in new ways.


Independent PC Games (Dec 2009, Jan 2011)

With large corporate-centric publishers shunning risky projects, and home console makers demanding flashy big-budget games, projects with creative art direction and experimental ideas have had an uneasy transition. Around 2005, the Independent Games industry started picking up traction. More tools were becoming available to the average user, and coding competitions helped new communities centered around game development to blossom. As of 2011, the indie game space has matured to the point where a single individual or a small team can output high quality games. As a result, the Independent PC games community has given rise to a wave of artistically driven titles.

I started compiling an archive of games which include:
  • Freeware Games (Those which previously were sold at retail, but were later made free by Publishers)
  • Doujin Shooters (Typically Japanese developed Amateur PC shooters. Some are commercial and some are free.)
  • Digitally Re-released Classic PC Games (such as those from gog.com)
  • Embedded Browser Games (flash/java games that run in the browser)
The number 1 requirement is that all content be "DRM Free".

Piracy has always been the biggest conflict on the PC, and companies have desperately struggled to maintain an iron grip on their content while still leeching money from the consumer. Online Marketplaces such as Steam and PSN allow the user to download a game and play it, for a price of course. In most cases, though, the user cannot back up the content, and the games are tethered to the company via an authentication service in order to even boot to the start menu. The user obtains a vague feeling of content ownership, but it is more like an extended rental program.

I still play games from 20 years ago, and I would also like to play games released today 20 years from now. Therefore, I only strive to archive games where long-term preservation is possible.

Superplays (Jan 2011)

I spent much of this time archiving roughly 150GB of video, displaying extreme examples of mastery in video games. I am constantly researching the limits of human interaction with digital media. Going beyond the mere endurance runs of the 80's, Arcade games emerging from the 1990's push the limits of human dexterity and focus, challenging hand-eye coordination like never before. Here, the player's goal is not merely to finish the game, but rather to do so with style, finishing with a 1CC (1 Credit Clear) or on just 1-Life, constantly striving to top almost unthinkable hi-scores.

The 90's brought forth such landmark titles as: Battle Garegga (Raizing), Rayforce (Taito), Virtua Fighter 2 (Sega), Metal Slug (SNK), DoDonPachi (CAVE), Ridge Racer (Namco), and Street Fighter II Turbo (Capcom).

These are just a few of the hundreds of titles released during the 90's, each containing unrivaled art direction, entirely original soundtracks, and a high degree of challenge. The player receives the maximum result possible in visual, audible, and physical feedback.

Homebrew Archiving (Nov 2010 to Dec 2010)

What started out as a simple excursion into bypassing security protection evolved into a massive salvage operation. In November I started looking into how to upgrade a specific game console to a higher firmware revision without upgrading through the company's official service. What I found was a fragmented and dispersed documentation structure, scattered across the internet, with seemingly simple tutorials and media files crammed into digital storage lockers, message boards, blog posts, and video services like YouTube. Parsing through this information for answers was incredibly time consuming, so much so that the notion of a central repository of information and files seemed more like a mirage. Long story short, due to the current generation's inability to grasp simple concepts like HTML and hosting a personal website, intellectual data is being wedged into a myriad of CMS systems, resulting in a confusing web of distributed information.

Thus, I started compiling a local repository, bringing order to chaos by stitching it together from each forum thread or download link I came across. For the greater part of December, I focused on the Playstation Portable and its homebrew community. Working for 10 days straight, I only managed to cut through maybe a third of the total content available, which is staggering. I underestimated the breadth of applications and intellectual material that the "scene" has managed to produce, and it has motivated me to apply these same archival methods to other console platforms as a result. Over the course of this time period I archived over 80 Applications, 50 Games, and 58 Emulators, while documenting the author names, dates of release, development history, and the files themselves. This includes everything from TI-92 Calculator Emulators to Seismographs, Audio Recording/Mixing software, Japanese Dictionaries, and GPS Applications, as well as many apps which handle technical tasks that we normally perform on computers, like FTP, IRC, Email, and SSH.

Because source code exists for many of these releases, and community-coded developer tools are available, startling opportunities open up for anyone (artists, musicians, game creators, etc) to use this technology in new and creative ways.

Link Harvesting (Sept 2010 to Oct 2010)

Due to the limitations of modern search engines, any given search is restricted to 1,000 viewable results, regardless of the total result count it displays.

Breaking free from the limitations of internet search, I processed roughly 170 websites I had already archived and harvested external links from them, resulting in a list of over 65,000 URLs. This was a rather involved process comprised of many steps, including an initial 404 check, duplicate removal, and more.
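
The core of such a harvest can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling actually used here: the names `LinkCollector` and `harvest_external_links` are hypothetical, the input is assumed to be a mapping from each archived page's URL to its HTML text, and the 404 check is left out because it requires live network access.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def harvest_external_links(pages):
    """Return a sorted list of unique external http(s) URLs.

    `pages` (an assumed input format) maps an archived page's URL
    to that page's HTML text.
    """
    found = set()  # a set removes duplicates as links are added
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        source_host = urlsplit(page_url).netloc.lower()
        for href in collector.hrefs:
            # Resolve relative links, then drop any "#fragment" suffix.
            absolute, _fragment = urldefrag(urljoin(page_url, href))
            parts = urlsplit(absolute)
            if parts.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, ftp:, etc.
            if parts.netloc.lower() != source_host:
                found.add(absolute)
    return sorted(found)
```

Comparing hosts this way treats "www.example.com" and "example.com" as different sites, so a real run would likely need an extra normalization step before the list is trusted.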

This new method of traversing the internet allows for the discovery of relevant media that might never be located otherwise. Websites tend to link to material that is topically related. Instead of performing a search, going to a link, and returning to search, I devised a new method of exploring the internet, by traveling from pathway to pathway. This isn't necessarily a new concept. I am essentially "surfing" the internet. However, this process is rarely used as the primary means of exploring the web.

From the new sites I archive in this batch, I will harvest even more links, and so on, and so on down the line.

Video Archiving (July 2010)

Up until this point I hadn't archived much video content. I saved some material that I came across over the years, but I hadn't sat down to specifically target online video repositories. So I had a bookmark list of about 20 videos from YouTube, and decided to archive those first. Well, after I performed a few searches, I began running out of queries. At this point I came up with a new idea.

Bypassing search, I decided to start at 1 YouTube video, and examine all of that user's uploaded videos, as well as the related video list. If I found anything worthy of saving, I would append it to my list... and I procedurally worked my way down the list checking these related resources on each video. The list grew and grew. Analyzing 10 individual videos in such a fashion yielded 200 more added to the list.
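
Expressed as code, working down a list that grows while it is being read is a breadth-first walk. The sketch below is a hypothetical model of that idea, not a real downloader: `candidates_for` stands in for looking up a video's uploader channel and related list, and `worth_saving` stands in for the manual judgment call. Both are assumptions supplied by the caller.

```python
def build_video_list(seed, candidates_for, worth_saving, limit=200):
    """Grow a download list outward from a single seed video.

    candidates_for(video) -> iterable of videos reachable from `video`
                             (hypothetical: uploads plus related list)
    worth_saving(video)   -> True if the video should be archived
                             (hypothetical stand-in for manual review)
    """
    videos = [seed]   # the list doubles as the work queue
    seen = {seed}     # every video already judged, kept or not
    position = 0
    while position < len(videos) and len(videos) < limit:
        current = videos[position]
        for candidate in candidates_for(current):
            if candidate in seen:
                continue  # never judge the same video twice
            seen.add(candidate)
            if worth_saving(candidate) and len(videos) < limit:
                videos.append(candidate)
        position += 1
    return videos
```

Only videos that were kept get expanded in turn, so a rejected video acts as a dead end. That keeps the list on topic, at the cost of missing anything reachable solely through rejected videos.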

This process has been leading me to a wide breadth of intellectual material. Archiving a game video might lead me to demo scene videos, and then to videos on space, then data visualization, then video art, then educational lectures, then mathematics and physics demos, and then robotics. Each topical leap-off point thrusts me into a new pool of related material.

It takes me 1-2 days to build a list of about 200 videos, and another day to run an automated batch download job.

Page Builders aka Free Subdomain sites (June 2010 - present)

Having lost a battle with Geocities in Fall of 2009, I took a hard look at the archival projects I had undertaken thus far. Many were initiated only by a pending termination notice. In the last 3 months I have been working towards a more preemptive approach, where I assess the value of a site regardless of its vulnerability, and archive it appropriately.

The first project to fall under this new methodology is to salvage sites from Free Hosting services which rose in popularity in the late 90's. I will be hitting a brick wall of over 57 million websites from such notables as Tripod, Angelfire, and Homestead. These resources are some of the best locations to track down websites over a decade old.

Something I have yet to state on this page: I do not practice what I refer to as 'blanket' archiving, where one would try to salvage 'everything'. You can't just go into Montana, select a section of land and say "move this to the lab, all of it". While you do end up saving every available resource, it can be a complete mess with no distinction of quality. With these free sites especially, there is a mountain of garbage involved. Many free site services were exploited to generate "link farms" in order to increase a specific site's Google search ranking. Others are guestbook pages that have been systematically injected with spam links over the past 10 years.

A paleontologist would have to be far more selective. As such, I practice 'selective' archiving where I determine the value of any given site before saving it.

Scans at Retromags (June 2010)

Many websites vanish in the blink of an eye because of "life conflicts". The admin simply cannot continue maintaining the site due to either time constraints, lack of interest, or sometimes the psychological breakdown that accompanies moderating community forums. A fantastic website dedicated to scanning and distributing old Video Game magazines was on the verge of being shut down recently. While someone else has stepped up to take over the site, such situations can make one such as myself very nervous.

So I finally sat down to archive the material compiled by Retromags.com. They provide over 700 magazine scans at roughly 60GB in total. I used a few tricks to speed up the overall download process. I spent June 5-6th building a file link list, and then my home machine downloaded the files for a solid week straight.

Key reasons why this material is important to me: (1) I am a graphic designer, so studying print designs tailored around technology from the last few decades really interests me. Everything from typefaces, to hand drawn art, to early 3D renderings can be found by flipping through the pages. (2) Furthermore, you can get an interesting insight into perspectives of the time by reading the "mailbag" sections. (3) Finally, I am always on the hunt for interviews. Hard facts coming 'straight from the horse's mouth' are more credible than opinion pieces or the canned marketing garble that fills the majority of such periodicals.

At first glance you may not comprehend the significance of this pooling resource. Have you ever tried scanning an entire magazine before? This can take quite a long time depending on the speed of your scanner. To scan in 700 magazines would take thousands of man hours. So being able to knock out this much material in under 9 days of work (most of which was automated downloading) has been a very efficient little process, in large part due to the efforts of those at the RetroMags site who scanned in each of these issues.

Parsing the GIA, and watermark tracing (April 2010)

The Gaming Intelligence Agency was a website which ran from 1998 to 2002. Luckily, this website was archived by another individual and currently resides at the following URL:

http://www.psy-q.ch/mirrors/thegia/sites/www.thegia.com/

At first glance there isn't anything particularly special about this website. It reviewed games, posted news, as well as screenshots and video clips. I believe this website was trying to turn itself into an actual cash driven enterprise, but it did not generate much of its own original content. Instead it would syndicate news from around the web, as well as harvest screenshots and videos from other websites of the time period. Right here is where this site becomes significant. The current archive of the GIA is more or less an accumulation of content from other e-zines at the time, whose content is no longer available.

I initially archived the entire site, but after the routine had been running for almost 2 days, I realized the site's worth did not justify its number of files and overall space consumption. Taking a glance at the material, I only needed a portion of the site content. So I picked through the files, copying high-quality art assets, screenshots, and worthwhile videos to my media archive. I manually sifted through about 150,000 files.

During this process I documented "watermarks", the logo overlays that websites place on game related images to mark their virtual territory.

The notion of a generic video game screenshot with an overlaid logo in the corner isn't all that interesting in and of itself. However, when you place it into a proper context, the idea of tracing these logo imprints to their websites of origin becomes really exciting. Here it is much less about the content I am salvaging, and more about the pathways which branch off from its study and analysis. I was able to match over 35 image watermarks to their originating domain names, half of which were sites I had no prior knowledge of.

This method of research is yet another way to locate extinct websites which obviously would not appear in modern day search engines.

Playstation 2 Linux (Dec 2009)

This is a complete archive of the Playstation 2 Linux US community site, including 7 years of Subversion files, code samples, binaries, demos, and the entire 50,000 post message board. It is a huge resource for technology history. Playstation 2 Linux was more than just an operating system. It served as a PS2 development kit which allowed amateur programmers to become familiar with coding for the console. Many people acquired jobs in the industry from their experience with this. My reason for salvaging this body of work is that it has value as a creative tool. If over 7 years of built-up research, demos, and updated system files were lost, those looking to experiment with PS2 Linux would be starting off at square one.

GameSpy Public Sites (June 2009 - Sept 2009)

In 1996, id Software released "Quake", the first major online game ever. From this a new community sprouted called Planet Quake. The site creators at the time were tired of seeing community sites go offline or disappear. So they allowed the public to host their Quake sites at Planet Quake, free of charge. While this was facilitated by revenue from simple banner ads in the header of each site, the seeds for preservation had been planted. Fast forward 13 years, and Planet Quake is now a media conglomerate, known as GameSpy.

A callous decision was made in 2009 to terminate ALL publicly hosted websites. So I took it upon myself to archive this material. What few realize is that this was essentially 13 years of PC gaming history, with loads of documentation on modifying game engines, custom coded tools, and historical records of the evolution of PC gaming. GameSpy was shutting down the largest central repository of classic computer game sites on the Internet. Some of these sites literally haven't been touched since 1997. This project took me 3 months of work to complete. I managed to salvage carbon copy complete archives of over 850 websites, with over 1,350 documented URLs for further back tracking to the web archive.

Google 2001 Search Index (Oct 2008)

For 1 month, Google put up a carbon copy of their oldest, most complete search index. This allowed me to search the Internet like it was 2001 again, with all entries linked to their data in the web archive. Currently, there is no way to search archive.org's web archive. So this came as close to traveling back in time as you could get. Needless to say, I took full advantage of the opportunity while it was available, and crunched the entire month archiving content I would have otherwise been unable to discover.

Unfortunately, this capability was only available for a limited period of 30 days.

Front Mission Online (Mar 2008 - May 2008)

Traditionally a turn-based strategy series, Front Mission makes its first foray into the world of real-time Mech combat with Front Mission Online. (Mech Games or 'Mecha' is a genre of games which involve large mechanized vehicles, typically bipedal in nature, which are controlled by a human pilot.) A Massively Multiplayer Online game, FMO was released exclusively in Japan for the Playstation 2 and PC in May of 2005. Three years later, the game was shut down on May 31st, 2008.

Having only 3 months of lead time, my focus went towards Documentation:

The official website had to be archived manually due to its use of root-relative linking and JavaScript includes. The site would not archive properly with archiving software such as wget. Therefore, I had to save the 1,600 file website by hand, producing what is now the only known complete archive of this site in existence. (Though propagating such material will one day change this.)
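
To illustrate why root-relative links are awkward in a local copy: a link such as "/img/logo.gif" resolves against the site root on a web server, but against the filesystem root when the saved page is opened from disk. One generic remedy is to rewrite each such link relative to the page that contains it. The sketch below shows that rewrite only. It is not the manual process described above, `make_relative` is a hypothetical helper name, and the example paths are invented.

```python
import posixpath


def make_relative(link, page_path):
    """Rewrite a root-relative link so it works from a local copy.

    link      -- a link found in the page, e.g. "/img/logo.gif"
    page_path -- the page's own path within the site,
                 e.g. "/news/2005/index.html"
    Links that are not root-relative are returned unchanged.
    """
    if not link.startswith("/") or link.startswith("//"):
        return link  # relative, absolute, or protocol-relative link
    page_dir = posixpath.dirname(page_path)
    return posixpath.relpath(link, start=page_dir)


print(make_relative("/img/logo.gif", "/news/2005/index.html"))
# prints: ../../img/logo.gif
```

This only handles links that appear literally in the markup. Links assembled at runtime by script includes are not visible to a rewrite like this, which is part of what makes such sites hard to mirror automatically.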

The official website had a 'community links' section. I used this as a starting point, and charted my way across the many communities and personal blogs relating to the game. In total, by the time I was done, I managed to salvage over 5,000 images and 15GB of video footage pertaining to active online play. With the game now gone, this serves as a visual record of human interaction within the world of FMO. The game can never be played online again, and as such, these records are all that remain.

Playstation BB Network (Dec 2007)

In a service exclusive to Japan, owners of the PS2 Hard Drive and BB Navigator Software could connect to "channels" for various game publishers. These were sort of like websites, but they actually took advantage of the PS2's 3D rendering engine. I pre-emptively documented these before knowing when they would be terminated. Less than a month later, the channels were no longer available. I would have liked to have logged network packets to try and archive the files themselves; however, the service was pulled offline before I had the chance. In December of 2007 I managed to at least capture video documentation of navigating the interfaces and content for each BB Channel.

Server Preservation & Restoration for Console games (2003 - Present)

Tying into my website at OnlineConsoles.com, I have been working to preserve and restore online functionality for console games since 2003. My role is that of a technical researcher, often providing network data and project frameworks for programmers I collaborate with, who then attempt to rebuild network infrastructures for since-terminated console games.

While it is more common for PC games to have their network functions restored, only four cases of true reverse engineering are known to exist for home console software. I have been involved with two of them: Starlancer (Dreamcast), and Tribes Aerial Assault (Playstation 2).