MIT to create digital library
Aims to save scholars' output

By Peter J. Howe, Globe Staff, 11/4/2002

CAMBRIDGE - Never shy about pursuing epic concepts, the Massachusetts
Institute of Technology today is formally launching one of its boldest
projects in years: an effort to create a long-term ''digital library''
encompassing virtually the entire intellectual output of MIT scholars and
researchers.

Called DSpace, the joint venture between MIT and technology giant
Hewlett-Packard Co. is aiming to create a ''superarchive'' to save trillions
of bytes' worth of digital information. It will cover everything from
recordings of classroom lectures and experiments to brain scans, ocean-floor
surveys, and monitoring of interstellar space.

DSpace aims to solve the digital era's version of a problem that plagues
conventional libraries: troves of intellectual content stored in formats,
such as Dictabelt recordings, 5-inch floppy disks, and rotting newsprint,
that risk becoming unusable by future generations.

MIT hopes to lead a ''federation'' of universities around the world that
would use the DSpace technology to build systems making scholarly
information available to any Internet-connected computer.

At least eight other schools are expected to join DSpace by next September,
including Cambridge University in England, Columbia, and the universities of
Rochester, Toronto, and Washington. MIT is also working to make the system
interconnect with a similar effort at California's university system and
Ohio State.

MIT president Charles M. Vest said DSpace aims to ''set the new standard for
the stewardship of knowledge in the research environment.''

Ann Wolpert, director of MIT's libraries, which already have more than 5
million conventional volumes including books and papers, said MIT officials
have been working on the system since 1998.

More and more of professors' and researchers' intellectual work is now
''born digitally'' rather than as easily cataloged or scanned papers,
Wolpert said.

As the library began to get an increasing number of requests to archive
digital files like videos and huge research ''data sets,'' Wolpert said,
''We realized this was probably the tip of the iceberg. We thought if we
want to be a library of the 21st century, we'd better start cracking.''

Hewlett-Packard provided a $1.8 million grant to launch the project, which
could create millions of dollars in new business for the company if other
universities follow suit.

MIT expects to spend about $250,000 annually on maintaining and operating
DSpace, which would include a Google-like search engine enabling visitors to
search for information using content ''tags'' identifying files. The project
is using freely available ''open source'' software to make it possible for
other universities and organizations to join.

MacKenzie Smith, the DSpace project director, said about 1,000 items
totaling more than 2 terabytes of data have been archived already, comparable to
the hard-disk memory of 200 high-end personal computers. In time, MIT
expects to be saving petabytes of data, or thousands of terabytes.

All of the words in every book in the Library of Congress, excluding
pictures, are often described as being equivalent to 20 terabytes, an
indication of the enormous scale of the MIT project. In conducting a survey
of potential demand for digital storage, Smith said, she found some MIT
researchers own ''data sets'' totaling 30 terabytes.

Hal Abelson, an MIT computer science and electrical engineering professor
who is helping lead the effort, said: ''I think the problem that libraries
are going to have is what they don't put into it'' given the potentially
limitless demand for storage capacity.

As MIT moves to put more and more course material online and available
across the Net, Abelson said DSpace will prove to be a crucial way to
provide an archive of each year's course material and course Web sites.
About 50 MIT classes now make their course materials available online, with
another 150 expected to come online next year and all 1,500 to 2,000 yearly
courses by 2008 or 2009.

A public symposium to launch DSpace is being held this morning from 8:30
a.m. to 12:30 p.m. at MIT's Bartos Theatre, room E15-070.

Peter J. Howe can be reached at [EMAIL PROTECTED]
