Genealogy Blog by Heritage Creations: "500 Billion Web Pages Unsearched by Google!!??
Leland Meitzler (email) -- 2:52 pm Wed, Jun 23, 2004
An article by Katie Hafner for the New York Times again got me thinking about online research as compared to traditional (brick and mortar) library research. While the web is a wonderful place to research, search engine limitations seriously restrict our ability to access data, even digitized data. Don't count on Google to find it all. The following excerpts are just a taste of a very interesting article. You should read the whole June 21, 2004 New York Times story.
For the last few years, librarians have increasingly seen people use online search sites not to supplement research libraries but to replace them. Yet only recently have librarians stopped lamenting the trend and started working to close the gap between traditional scholarly research and the incomplete, often random results of a Google search.
The biggest problem is that search engines such as Google skim only the thinnest layer of information that has been digitized. Most have no access to the so-called "deep Web," where information is contained in isolated databases like online library catalogs.
Search engines seek so-called static Web pages, which generally do not have search functions of their own. Information on the deep Web, on the other hand, comes to the surface only as the result of a database query from within a particular site.
"Google searches an index at the first layers of any Web site it goes to, and as you delve beneath the surface, it starts to miss stuff," said Duguid, the Berkeley researcher and co-author of "The Social Life of Information." "When you go deeper, the number of pages just becomes absolutely mind-boggling."
Some estimates put the number of Web pag"