Q. In your opinion, is the Internet more like a newspaper, or more like a library?
R. Let us compare the way information is found. You might have found this article by browsing through the newspaper or you might have found it by first referring to the 'Inside' column on page 1. When printing the newspaper, the publisher does several things to help readers find information. On the front page is the name of the paper in large, bold letters along with the date of the paper. This tells the reader what paper is being read and the date of news contained within.
Additionally, on the front page is a small column entitled 'Inside'. 'Inside' serves as a table of contents for the paper. Using 'Inside', you can find the page numbers for Business, Opinions, Sports, etc.
In this manner, the newspaper publisher has done several things to help the reader find information of interest quickly and efficiently.
The Internet is more like a library than a newspaper because the Internet is a collection of publications, referred to as 'home pages' or 'Internet sites', whereas the newspaper is one publication. Unlike the newspaper, reader assistance on the Internet is not nearly as well developed.
Q. How is "going to the library" different from "going to the Internet"? Who or what provides "reader assistance"?
R. The Internet is like a library building with no card file and no defined order to the placement of books or periodicals.
In such a library, a searcher would need to look at every publication, then read every table of contents to see whether the library contained information relevant to the search. Such a search would be prohibitively time-consuming.
Fortunately, a lot of work is underway to help readers find information on the Internet. As the Internet has grown rapidly over the past 5 to 10 years, many people have seen the need for assistance in finding information and have developed 'search engines' to help find information across the web. Several well-known search engines are available to Internet users. Addresses of many of these search engines are included in the 'bookmark' section of Internet browser software.
For example, the Yahoo site, http://www.yahoo.com, contains a search engine that will help a searcher locate information. In addition to the search engine, the Yahoo site has divided information into subject categories that give the searcher a kind of 'card catalog' to the Internet. Unfortunately, this site, like all search sites, does not reach everything on the Internet. Additionally, a search engine frequently returns a group of documents too large to review in a reasonable time. For example, I used the Altavista search engine (www.altavista.digital.com) and searched for 'employment'. The search engine returned nearly 556,000 documents. Obviously, this is more documents than I would want to read if looking for employment.
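A small sketch can illustrate why a one-word search returns so much, and why adding words helps. The Python below is a toy example, not how Yahoo or Altavista actually work: the four sample documents and the `search` function are invented for illustration. It assumes a simple rule in which a document matches only if it contains every search word.

```python
# Toy illustration only: the documents and the search() helper are hypothetical.
DOCUMENTS = {
    "doc1": "employment statistics for Texas in 1996",
    "doc2": "employment openings for nurses in Seguin Texas",
    "doc3": "history of employment law",
    "doc4": "nurses association newsletter",
}

def search(terms, documents=DOCUMENTS):
    """Return the sorted ids of documents that contain every search term."""
    wanted = [t.lower() for t in terms]
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        # A document is a hit only if all of the terms appear in it.
        if all(term in words for term in wanted):
            hits.append(doc_id)
    return sorted(hits)

broad = search(["employment"])
narrow = search(["employment", "nurses", "Seguin"])
print("Broad search matches:", broad)
print("Narrow search matches:", narrow)
```

In this sketch the single word 'employment' matches three of the four documents, while the three-word search matches only one. Real search engines use more elaborate matching and ranking, but the same idea applies: more specific search words generally mean fewer documents to review.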
Q. What is being done to improve Internet research efficiency?
R. Managers of search sites are working to narrow the documents returned to only those that contain relevant information, but there is still a lot of work to do.
In addition to work on improved search engines, developers are working on 'agent software', 'push technology' software, and other types of tools to help provide relevant information to a researcher. More important than the work on improved tools, however, is the work being done to better classify and categorize information when it is published on the Internet, so that search software can locate relevant information more effectively. This is not new work; librarians have always performed this function for hard-copy publications. Library science includes the study and application of methods to group information into categories, or subjects, that are meaningful to people searching for information. Common methods include the Dewey Decimal and the Library of Congress catalog schemes.
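As an illustration of how a catalog scheme groups publications into subjects, the sketch below maps a Dewey Decimal number to one of the ten main Dewey classes. The class names follow recent editions of the scheme and may differ from older printings; the `main_class` function is a hypothetical helper written for this example, not part of any library system.

```python
# The ten main classes of the Dewey Decimal Classification
# (names as used in recent editions; treat the wording as approximate).
DEWEY_MAIN_CLASSES = {
    0: "Computer science, information and general works",
    1: "Philosophy and psychology",
    2: "Religion",
    3: "Social sciences",
    4: "Language",
    5: "Science",
    6: "Technology",
    7: "Arts and recreation",
    8: "Literature",
    9: "History and geography",
}

def main_class(call_number):
    """Return the main Dewey class for a number such as '025.04' (hypothetical helper)."""
    whole = int(float(call_number))  # drop the decimal part
    if not 0 <= whole <= 999:
        raise ValueError("Dewey numbers run from 000 to 999")
    # The hundreds digit selects the main class.
    return DEWEY_MAIN_CLASSES[whole // 100]

print(main_class("025.04"))
print(main_class("331.1"))
print(main_class("510"))
```

The point of the sketch is that the number itself carries the subject: every publication numbered in the 300s belongs to the social sciences, so a searcher can go straight to that part of the catalog.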
A search in the library begins at the card catalog (now rapidly being replaced by electronic catalogs), which is subdivided into subjects based on the catalog scheme being used. From the catalog, the searcher can determine the physical location of the publication within the library. Similarly, a standardized scheme for categorizing information published on the Internet will help software locate relevant information and limit it to that specified by the searcher. To make the Internet more like a worldwide library, a standardized electronic 'card catalog' is required.
One effort to develop a standard catalog for publishing on the Internet is commonly referred to as the Dublin Core Workshop. "The Dublin Core Workshop Series is an ongoing effort to form an international consensus on . . . a simple description record for networked resources. It is expected that a simple and widely-understood set of elements will promote . . . and improve resource discovery on the Internet." This quote is from "The 4th Dublin Core Metadata Workshop Report, DC-4, March 3 - 5, 1997, National Library of Australia, Canberra". The full report can be found on the Internet at: http://www.oclc.org:5046/~weibel/dc4.html.
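To give a rough idea of what a 'simple description record' might look like and how search software could use it, here is a minimal sketch. The field names (title, creator, subject, date, identifier) are among the Dublin Core elements; everything else, including the three sample records, the example.org addresses, and the `find_by_subject` function, is invented for illustration and is not taken from the workshop report.

```python
# Hypothetical records; only the field names come from the Dublin Core element set.
RECORDS = [
    {
        "title": "Job Openings in Seguin",
        "creator": "Example Employment Office",
        "subject": ["employment", "Texas"],
        "date": "1997-03-01",
        "identifier": "http://example.org/jobs.html",
    },
    {
        "title": "A History of Labor Law",
        "creator": "Example University Press",
        "subject": ["law", "history"],
        "date": "1996-11-15",
        "identifier": "http://example.org/labor-law.html",
    },
    {
        "title": "Gardening Tips",
        "creator": "Example Garden Club",
        "subject": ["gardening"],
        "date": "1997-01-20",
        "identifier": "http://example.org/garden.html",
    },
]

def find_by_subject(subject, records=RECORDS):
    """Return titles of records whose subject list includes the given subject."""
    key = subject.lower()
    return [
        record["title"]
        for record in records
        if key in (s.lower() for s in record["subject"])
    ]

print(find_by_subject("employment"))
```

Because the search looks only at the subject field assigned when the page was published, it returns just the pages catalogued under 'employment' rather than every page that happens to mention the word. This is the kind of 'card catalog' lookup that a standard description record is intended to make possible.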
This effort, or a similar one, will eventually result in a standard by which all information on the Internet will be published, making research over the Internet more effective and less time-consuming. In the meantime, for effective Internet research, you can use on-line library catalogs in addition to available search engines.
Q. What on-line library catalogs would you recommend?
R. One place to start is the Southern Methodist University library: http://www.smu.edu/~cul/resources.html. This site contains links to many libraries with on-line catalogs, including the US Library of Congress: http://marvel.loc.gov/homepage/lchp.html.
A site that uses a library approach to cataloging information on the Internet is: http://vlib.stanford.edu/Overview.html.
Another site that contains links to many libraries is: http://sunsite.Berkeley.EDU/Libweb/.
Eventually, when information on the Internet is published using a standard catalog scheme, searching for information on the Internet will be as easy as searching for relevant articles in your library or in your newspaper.
Gil Merkle is the Chief Information Systems Officer at Texas Lutheran University (TLU), where he is responsible for acquiring and operating all computer, network, and telephone systems for the University. He recently came to TLU from USAA in San Antonio, where he had spent over 15 years in information systems. A short biography is at http://lonestar.texas.net/~merkleg and he can be reached at merkle_g@txlutheran.edu.
To contribute to this column, contact column coordinator
Gloria R. Rivera
riverag@connecti.com
http://www.seguin.net
http://www.seguin.net/corp/swd
303-4764