I've looked at this question of the potential scholarly uses of Napster-like systems. Please find below, for your interest, an article I published on the subject in the 13 April issue of Nature.

Regards
Declan Butler

13 April 2000
Nature 404, 694 (2000) © Macmillan Publishers Ltd.

Music software to come to genome aid?

DECLAN BUTLER

[PARIS] Student use of a new 'killer' Internet application that allows anyone connected to the web to share music files stored on the hard disk of their own computer has become so heavy that US campuses want to ban the software for fear that it will saturate their academic Internet connections.

But some scientists are thinking of adopting the principles behind the so-called 'Napster' technology themselves. They believe these could herald a new era in distributed computing, and in particular solve the thorny problem of how the vast community of biologists can collaborate on assigning functions to the genes in the human genome.

Anyone connected to the Napster software, which can be downloaded from the Internet (http://www.napster.com/), can do a single search for a song across all the hard disks of other Napster users and download it directly from the user's computer.

When Lincoln Stein, a bioinformaticist at the Cold Spring Harbor Laboratory in New York, heard about Napster, he was struck by the parallels with his own work on writing software for a distributed sequence annotation system for the human genome (see http://stein.cshl.org/das/). Napster, he realized, can be used to find and distribute information located anywhere on the Internet.

Stein believes that annotation, which involves predicting which sequence stretches are genes and what their function might be, calls for a radically different global database structure from the centralized system used for gene- and protein-sequence data. At present, users submit and retrieve records to a few central databases, such as GenBank.

But many scientists believe that annotation is too large a task for a few large genome centres. It is more of an art than a science, and up to half the predictions are wrong. No single group is likely to be able to produce a definitive version.

Annotation centres could in principle contribute data to GenBank-like centres. But GenBank entries, which can be modified only by those who submitted them, are sometimes erroneous. The problem is likely to be worse with the more subjective annotation data; the creation of a single authoritative annotated sequence seems unlikely.

A better solution, Stein argues, might be to allow biologists worldwide to annotate the human genome sequence interactively using diverse computational and experimental methods, much as developers worldwide debug open-source software. But until now this sort of decentralized solution raised the spectre of duplication of effort, and the risk that scientists, instead of being able to consult a single central database, would have to search a series of separate versions of human genome databases. Data integration would become a real problem.

Stein believes Napster-like technology could be the answer. A centralized reference server holding a detailed genome map would act as an anchor for data produced locally by third-party annotation servers. Researchers could publish their data electronically without having to maintain their own websites.

Such a system could avoid the need to label entries with subjective identifiers, such as gene names or accession numbers, as occurs now. Instead, maps, gene predictions and functional activities could all be superimposed on the reference map using a system of coordinates, much as astronomers combine multiple-wavelength data with positional or coordinate information. A user could zoom in on any part of the genome and view the region in many ways by calling up related data from the hard disks of participating laboratories.

Ewan Birney, joint head of Ensembl, a joint venture between the European Bioinformatics Institute and the Sanger Centre in Cambridge to develop an automatic annotation on eukaryotic genomes, is backing Stein's idea, and has proposed that Ensembl be used as the reference map.

The US National Center for Biotechnology Information, however, apparently opposes the idea on the grounds that it would lead to a proliferation of junk. That risk is real. Although few genome scientists admit it publicly, the quality control of many smaller laboratories does not match that of specialized centres. "It's a Catch 22," says Birney. "We want to democratize people's ability to present their work, but quality is a problem." He believes one answer would be to offer users a full interactive choice between an approved annotation, composed of data from recognized gold-standard laboratories, and the full data.

Few researchers are willing to publicly endorse Napster-like technology. The idea of leaving desktop hard disks open to the Internet is a network manager's nightmare, and is not helped by Napster's ability to scan firewalls and breach weaknesses. But Birney insists that developing the system should be taken seriously. "If we don't get it right it will be the difference between a biological web and people just continuing to use the Internet to send e-mails and read web pages."

----------------------------------------------------------------------------
Nature © Macmillan Publishers Ltd 2000 Registered No. 785998 England.

----- Original Message -----
From: Stevan Harnad <har...@coglit.ecs.soton.ac.uk>
To: <american-scientist-open-access-fo...@listserver.sigmaxi.org>
Sent: Sunday, May 21, 2000 12:01 PM
Subject: Re: Napster: stealing another's vs. giving away one's own

On Sat, 20 May 2000, Joseph Ransdell wrote:

jr> Though what has been said about Napster is certainly relevant, I don't
jr> think the import of it for self-archiving of one's professional work,
jr> published or pre-print, has quite come into focus for us here. Let us
jr> leave aside the use of it to pirate music, which is a red herring
jr> relative to the concerns of this forum.

It is not a red herring in one essential respect: There are many people who currently oppose open archiving of refereed research because they think it is a form of theft: There are university administrators who think this (feeling the pressure of the serials crisis, but understandably not wishing to relieve it illegally); there are librarians who think this; there are publishers who think this; and there are authors who think this. The primary motivation and use of Napster for consumer-end piracy simply reinforces this false impression, which is still holding us all back from the optimal and the inevitable for research.

It is for this reason that it is so important to make it clear that author self-archiving is NOT a form of consumer-end piracy at all; it is a producer-end give-away, and as such, it does not need Napster-like tricks for distribution. It can and should be done perfectly up-front and legally by authors on the Web itself. No need for "second economy" bootleg links between users' PCs: Just proudly self-archive your own refereed work on your own institutional Open Archive or a central one. Interoperability will take care of the rest; and consumers will be able to get your give-away product perfectly legally, and without the need of any "second network."

http://www.openarchives.org/
http://www.eprints.org/
http://www.dlib.org/dlib/december99/12harnad.html

Professor Ransdell goes on to make further suggestions for Napster-style distribution, again failing to take the difference between consumer-end rip-off and producer-end give-away into account. For when it is producer-end give-away, there is no need for a "second network" or directly connected computers (with all the attendant needless risks and vulnerabilities). The good old WWW will do fine.

jr> What makes [Napster] relevant here is its potentialities as a
jr> communications technology that can be used to defeat reactionary
jr> intellectual property practices.

Via consumer rip-off or producer give-away? Is there any reason whatsoever that the latter should make common cause with the former?

The Net was built in the spirit of shareware, but now that the entire economy is moving onto the Net, it is just as absurd that the Net should (quixotically, and chaotically) try to impose the give-away model on all of Trade, as that the Trade model should now be imposed on all of the Net. Let 1000 flowers bloom.

The teenage and post-teenage hackers who craft the likes of "Gnutella" in the hopes of freeing the Golden Goose-Eggs (about whose exact provenance they are blissfully murky) for one and all, Napster-style, would simply kill the Golden Goose if they prevailed unchecked. This is a classical "evolutionarily unstable strategy," in which cheaters eventually deplete the resource they exploit. There is no reason whatsoever to link the rational, right, and reachable goal of freeing the refereed research literature to this sort of murky myopia in any way.

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Cognition.Sociobiology.98/0002.html

jr> one advantage it offers that is not accommodated by the public archives
jr> in process of construction at present is that one can make publicly
jr> available many different kinds of resource material in addition to
jr> scholarly or scientific research reports proper...
jr> it could make easily available scholarly and investigative tools of
jr> the sort which heretofore have always perished with those individuals
jr> who devised them.

Are we talking about consumer rip-off or producer give-away? If the latter, what's wrong with doing it publicly on the Web? (And the Open Archive interoperability can and will easily be extended to other forms of give-away too, not just research reports.)

jr> Would people actually be willing to share their research instruments
jr> and materials in that way[?]

Reasonable question.
Where the answer is "Yes," the course is clear (and does not require a second, Napster-style Net: the first one will do). Where the answer is "No," we are talking about theft rather than give-away (and most of us will want to just walk away from that).

jr> this Napster-like technology could yield a distributed archival
jr> database which could easily grow... [but would] have to remain distinct
jr> from the database of e-prints currently envisaged because of its highly
jr> fluid character, owing to its dependence on the willingness of
jr> individuals not only to keep on making the materials available but also
jr> to follow routine practices in revision of their work and in the
jr> development of their personal instruments of research.

It is not at all clear why all of this (if it's legal give-aways) cannot be done within the Open Archives framework.

jr> the value of it relative to the aims of the present forum could only
jr> lie in its side-effect of tending to encourage self-archiving of the
jr> stable sort wanted here.

On the contrary, any association with Napster-style consumer fraud can only have the side-effect of retarding open archiving's entirely ethical mandate.

jr> To use one of Stevan's favorite metaphors, if the horses, being shown
jr> the water, continue to be reluctant to drink, it could be because of
jr> inhibitions that can only be addressed in other ways than those that
jr> suggest themselves when one thinks of the problem of open publication
jr> only in the simplistic and highly abstract way it is usually described
jr> here.

On the contrary. Inhibitions about self-archiving are based on the unfounded fear that it may be wrong or illegal; gratuitously linking it to something that may indeed be wrong and illegal hardly helps. The algorithm is indeed simple, but hardly abstract: Researchers' refereed research reports are give-aways; researchers should accordingly self-archive them online, free for all.
No need for a Napster-style "second economy" to do this: Open archiving will do it for you (and for any other research-related things you may wish to give away too).

http://www.arl.org/sc/subversive/

--------------------------------------------------------------------
Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
Computer Science                  fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of this ongoing discussion of providing free access to the refereed journal literature is available at the American Scientist September Forum (98 & 99 & 00):

http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to: american-scientist-open-access-fo...@amsci.org