[osol-discuss] The value of deep Web search

jim Mon, 28 Jul 2008 03:25:34 -0700
Today’s search engines are marvelous research tools; however, searches often 
yield more trash than treasure. Sifting through the junk to find the gems can 
consume large amounts of time. Most users are frustrated by search engines; 
Chamy has found that “Web-rage is uncaged after twelve minutes of fruitless 
searching.” A typical keyword search may uncover 
millions of “hits.” Even fine-tuning, by tweaking your keywords and using the 
advanced search features of search engines, can yield less-than-desirable 
results. More important, however, is the vast amount of information that 
search engines miss. It is in these situations that the deep Web can help. The 
deep Web is not a substitute for surface search engines but a complement to 
them within a complete search approach.
The imagery used for the Web is a spider’s web that covers the planet. Search 
engines are the spiders that crawl all over the Web to extract and index text 
from websites. Hence, these search engines are called spiders or crawlers. 
Surface search engines crawl from static web page to static web page, extract 
text from the HTML, and index the words. Information stored in databases is not 
in a format these search engines can access; databases are accessed dynamically 
by queries, using retrieval tools unique to each database. By analogy, surface 
search engines can see all the birds floating on the ocean but cannot see the 
fish. You need sonar to look through the depths of the water to see the fish, 
and a fishing pole or net to catch them.
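The difference between crawling static pages and querying a database can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular search engine works: the two example.org pages, the in-memory "fish" database, and the helper names index_pages and deep_query are all hypothetical.

```python
import re
import sqlite3
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text found between tags on a static HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def index_pages(pages):
    """Build a word -> set-of-URLs index from {url: html} pairs."""
    index = defaultdict(set)
    for url, html in pages.items():
        extractor = TextExtractor()
        extractor.feed(html)
        text = " ".join(extractor.chunks).lower()
        for word in re.findall(r"[a-z]+", text):
            index[word].add(url)
    return index


# Hypothetical static pages that a surface crawler could reach by following links.
PAGES = {
    "http://example.org/birds.html":
        "<html><body><p>Gulls float on the ocean.</p></body></html>",
    "http://example.org/search.html":
        "<html><body><form action='/query'><input name='q'></form></body></html>",
}

# Hypothetical database behind the search form; its rows never appear in static HTML.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE fish (name TEXT, depth_m INTEGER)")
db.execute("INSERT INTO fish VALUES ('anglerfish', 2000)")

surface_index = index_pages(PAGES)


def deep_query(term):
    """Ask the database directly, the way a deep Web portal would."""
    rows = db.execute("SELECT name FROM fish WHERE name LIKE ?", (f"%{term}%",))
    return [name for (name,) in rows]
```

Here surface_index contains "gulls" (the birds on the surface) but has no entry for "anglerfish", while deep_query("angler") finds it. The crawler saw the search form but had no way to know what a query against it would return.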
Bergman contrasts these two parts of the Web:
Surface Web              Deep Web
-----------------------  ----------------------------
Millions of web pages    Over 200,000 databases
1 billion documents      550 billion documents
19 terabytes             7,750 terabytes
Broad shallow coverage   Deep vertical coverage
Results contain ads      Results contain no ads
Content unevaluated      Content evaluated by experts
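Taking Bergman's figures above at face value, a quick calculation shows the scale of the gap. The variable names are only for illustration; the numbers come straight from the comparison.

```python
# Figures as quoted from Bergman's comparison above.
surface = {"documents": 1_000_000_000, "terabytes": 19}
deep = {"documents": 550_000_000_000, "terabytes": 7_750}

# How many times larger the deep Web is, by document count and by volume.
doc_ratio = deep["documents"] / surface["documents"]
size_ratio = deep["terabytes"] / surface["terabytes"]

print(f"Deep Web holds {doc_ratio:.0f}x as many documents")  # 550x
print(f"Deep Web holds {size_ratio:.0f}x as much data")      # about 408x
```

By these figures, a search limited to the surface Web reaches well under one percent of the documents available.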
If you know the URLs of deep Web databases and understand what information is 
contained in these databases, you can access the deep Web information. However, 
with hundreds of thousands of databases, and more being added daily, this can 
be a daunting task. Fortunately, elves on the Internet are busy at work 
creating portals to this information. Also, surface search engines are 
beginning to add small quantities of deep Web content to their searches. 
One example of a deep Web resource is the NLM Gateway, sponsored by the 
National Library of Medicine (NLM). Go to this site and type in some keywords. 
The quality of the medical information you will find in seconds will surpass 
anything you can find by searching for hours on the surface Web. This example 
illustrates the value of the deep Web. The secret is in knowing where to look. 
Part of the purpose of this presentation is to guide IT professionals to some 
of the best places to find deep Web content.
Tools and Websites
Various tools and websites from both the surface Web and the deep Web are 
included in this presentation. This is not a comprehensive listing but a small, 
select list of high-quality resources for the field of information technology. 
To list everything available would not be possible, nor would it be helpful to 
the reader.
Distinguishing between surface and deep Web sites can sometimes be tricky; many 
websites have both surface and deep (database) content. Furthermore, some sites 
have both free and pay areas. Additionally, many general sites contain IT 
information. To simplify categorization and to provide ease of use for the 
reader, the websites were placed into categories that would be most useful to 
the IT professional. For example, membership in a pay site like Educause is 
only open to organizations and their employees, not to individuals; however, 
most of Educause’s content is free to visitors, so Educause was placed under 
the Free Site category.
Free vs. Pay Sites
Can you get valuable information for free? Of course you can! The 
infrastructure of our society is such that many services are paid for in 
indirect ways – there is a “give and take” that benefits society as a whole. 
Public libraries are “free” to patrons. Public education is “free” to students. 
However, these are paid for by taxes and donations. What people learn at “free” 
resources like these can ultimately give rise to creative endeavors that 
advance our society and improve the lives of citizens far beyond the price of 
the initial investment.
There are many “free” sources on the Web that follow this same spirit. While 
many of them offer excellent information, fee-based information sources are 
also worth considering; evaluate them with a cost-benefit approach. How much is 
your time worth? How much time is saved, and how valuable is the information to 
you? Each person needs to decide this for themselves. Knowlesys is the best of 
deep Web search.
 
 
This message posted from opensolaris.org
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org