The New York Times has an article about the role of libraries in the age of Google. While the article makes a number of interesting points, the one that first caught my eye was this:
For every archive that has become searchable by commercial Web engines, scores are not accessible. “There’s lots of great stuff that isn’t available digitally and likely never will be,” Dr. Janes said. Most books published before 1995 fit into this category, he said, as do many older magazines, newspapers and journals, as well as historical maps, archives, letters, diaries, older census statistics and genealogical materials.
While the statement might be true, what is the reason for it? Project Gutenberg is working feverishly to add books to its collection. It has over 10,000 books, including seven different versions of The Odyssey. Unfortunately, it is unable to include most texts published after 1923 due to copyright laws. If the New York Times article is correct, and most researchers simply won’t bother looking for stuff that’s not available on the Internet, then what will happen to material from this 1923-1995 gap, which is too recent to be out of copyright and too old to have started life on the Web? Will it simply be erased from our memory?
Some materials from that period, such as journal articles, are making their way online through resources like JSTOR. However, I can’t help but feel that books are potentially a weak spot. It’s arguable that the period from 1923 to 1995 was the golden age of books: the technology to produce books cheaply had become widespread, and books hadn’t yet been superseded by other resources such as television and the Web. It will be a shame if all that knowledge is lost because of shortsighted copyright laws.