Re: Central vs. Distributed Archives

2003-11-15 Thread Stevan Harnad
>   PubMed Central will host individual OA articles
>
>   PubMed Central http://www.pubmedcentral.gov/index.html
>   has launched an About Open Access page
>   http://www.pubmedcentral.gov/about/openaccess.html drawing attention
>   to the journals that provide open access to their contents through
>   PMC. The page also announces an important new policy: "[I]n October
>   2003, PMC began accepting individual open access articles from
>   journals that do not participate in PMC on a routine basis. For
>   the specific conditions under which PMC accepts these articles,
>   see the relevant PMC agreement (in Microsoft Word format)
>   http://www.pubmedcentral.gov/pmcdoc/pmc-openaccs-agree.doc
>   ." The offer is open to all authors in the life sciences
>   willing to release their work to "open access" as
>   defined by the Bethesda Statement on Open Access Publishing
>   http://www.earlham.edu/~peters/fos/bethesda.htm. (Thanks to George
>   Porter.) Posted to Open Access News 12 November 2003 by Peter Suber
>http://www.earlham.edu/~peters/fos/2003_11_09_fosblogarchive.html#a106866889488739033

Relevant Prior Subject Threads:

"E-Biomed: Very important NIH Proposal"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0240.html
http://www.nih.gov/about/director/ebiomed/com0509.htm

"NIH's Public Archive for the Refereed Literature: PUBMED CENTRAL"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0372.html

Just two comments:

(1) More central open-access archives in which authors can self-archive
their articles are always welcome and helpful (especially if they are
OAI-interoperable) and it is gratifying to see what was originally the
E-Biomed proposal -- which at first unfortunately backed away from
individual author self-archiving of toll-access journal articles --
now ready to accept author self-archiving at last!

It has to be added, though, that since 1999, with the advent
of distributed eprint archiving, integrated by the glue of
OAI-interoperability http://www.openarchives.org/ , it has become
apparent that institutional self-archiving is a more promising route
than central self-archiving, because researchers and their institutions
share the benefits of maximizing the impact of their own research output,
and share the costs of impact-loss because of toll-based access-denial
to would-be users everywhere. Institutions also wield the carrot/stick
of "publish or perish" over their own researchers and are hence
in the position to mandate and monitor compliance with their own
self-archiving policy. Central archives share no such common costs/benefits
with researchers, and are not in a position to mandate self-archiving
or to monitor compliance.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0043.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0023.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0044.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0005.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0006.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0013.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0015.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0016.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0018.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0022.gif

(2) The Bethesda statement on open access publishing
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2878.html
is indeed a statement on open-access *publishing* and not on *open access,*
i.e., only on the golden and not the green (self-archiving) road to open access.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/3147.html

It is a potentially useful document, but only if this one-sidedness
is conscientiously and decisively remedied, for as it stands, the
Bethesda Statement is simply missing out on 95% of the immediate
potential for open access. (In addition, the Bethesda definition of
"open" is over-determined, again because of its one-sided focus on
open-access journal publishing alone. All that research
and researchers need is free online full-text access to
all research; the rest comes automatically with the online
territory: See the subject-thread: "Free Access vs. Open Access"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2956.html )

http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0021.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0024.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0026.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0027.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0028.gif
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving_files/Slide0029.gif

Stevan Harnad

NOTE: Complete archive of the ongoing discussi

Re: Central vs. Distributed Archives

2003-11-06 Thread Eberhard R. Hilf
I agree with Stevan: ArXiv just needs a note clarifying that it is only
a time stamp and archiving machine, and takes no legal responsibility
for its content because it does not 'read the content' (as referees
do). It acts as a gateway provider. So the risk stays with the author.

Within-arxiv plagiarism can easily be checked within the
arxiv. Plagiarized papers will have a later time stamp, and thus the
original author can be spotted and the later one(s) blamed.
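
The time-stamp argument above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not a description
of how ArXiv actually works: the record layout (id, stamp, text), the
similarity threshold of 0.8, and the three sample records are all made up
for the example, and difflib is a crude stand-in for a real text-matching
tool.

```python
from datetime import date
from difflib import SequenceMatcher


def flag_later_duplicates(papers, threshold=0.8):
    """Return (earlier_id, later_id, similarity) for near-identical texts.

    Papers are sorted by time stamp, so in every flagged pair the first
    id is the deposit with the earlier stamp (the presumed original).
    """
    ordered = sorted(papers, key=lambda p: p["stamp"])
    flagged = []
    for i, earlier in enumerate(ordered):
        for later in ordered[i + 1:]:
            ratio = SequenceMatcher(None, earlier["text"], later["text"]).ratio()
            if ratio >= threshold:
                flagged.append((earlier["id"], later["id"], round(ratio, 2)))
    return flagged


# Hypothetical deposits, invented purely for illustration.
papers = [
    {"id": "A", "stamp": date(2001, 3, 1),
     "text": "We measure the conductance of thin organic films at low temperature."},
    {"id": "B", "stamp": date(2002, 7, 9),
     "text": "We measure the conductance of thin organic films at low temperatures."},
    {"id": "C", "stamp": date(2002, 8, 2),
     "text": "A survey of self-archiving practices among physicists."},
]

flagged = flag_later_duplicates(papers)
for earlier_id, later_id, ratio in flagged:
    print(f"{later_id} resembles earlier deposit {earlier_id} (similarity {ratio})")
```

The pairwise comparison is quadratic in the number of papers, so a
collection of ArXiv's size would need indexing rather than brute force;
the point here is only that the time stamp decides which of two matching
deposits came first.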

In contrast, scientific journals, which serve to 'read and referee and
check the content of the paper' and which gain ownership of it, are
responsible in case the paper turns out to be plagiarized.

So journal publishers run a real legal risk, in that they do not check
for plagiarism, and they would have to check this across all journals of
all publishers, since they claim that the work is new.

The Schoen case and many others confirm: plagiarism in the e-age is a
real and formidable problem because it is so easy to do. Plagiarism only
seemed to be rare, because it was not checked by the journals.

A still more widespread abuse is self-plagiarism, copy-and-pasting from
one's own older papers. Easy, 'legal', but a piece of misconduct by the
author from the standpoint of the reader.

http://www.iupap.org lists the recent London conference on plagiarism,
misconduct of authors, referees, journal editors.

Ebs

.
Eberhard R. Hilf, Dr. Prof.;
CEO (Geschaeftsfuehrer)
Institute for Science Networking Oldenburg GmbH
an der Carl von Ossietzky Universitaet
Ammerlaender Heerstr.121; D-26129 Oldenburg
ISN-home: http://www.isn-oldenburg.de/
homepage: http://isn-oldenburg.de/~hilf
email   : h...@isn-oldenburg.de
tel : +49-441-798-2884
fax : +49-441-798-5851

On Thu, 6 Nov 2003, Stevan Harnad wrote:

> Yet another piece of evidence has appeared that seems to confirm that
> whereas central archiving was historically the way in which self-archiving
> began, it is not the fastest or best form for it to grow and spread today:
>
> The Nature headline is (as usual for the press) an exaggeration:
>
> "Critical comments threaten to open libel floodgate for physics archive"
>
> http://www.nature.com/cgi-taf/Dynapage.taf?file=/nature/journal/v426/n6962/full/426007b_fs.html
>
> "Legal concerns plague open access physics archive"
> http://www.scidev.net/news/index.cfm?fuseaction=readnews&itemid=1087&language=1
>
> but the facts seem to be that, across the years, some papers that
> contained plagiarism or libel might have found their way into ArXiv's vast
> (250,000 papers) and unvetted collection.  http://www.arxiv.org
>
> I said "unvetted," but of course almost all those papers are
> also submitted to peer-reviewed journals, which *do* vet them,
> and when there have been any corrections to the unrefereed
> preprint, the authors self-archive the refereed postprint too:
> http://opcit.eprints.org/tdb198/opcit/
>
> So the (tiny) problem of plagiarism and libel is with papers that have
> *not* been peer-reviewed.
>
> ArXiv can make an effort to vet its daily submissions for plagiarism or
> libel, but at nearly 4000 per month, this would be quite a task:
> http://arxiv.org/show_monthly_submissions
>
> So the natural conclusions to draw from this seem to be the following:
>
> (1) OAI-interoperability has now made all OAI-compliant archives
> equivalent: They can all be harvested and jointly searched. It no
> longer makes any difference which archive a paper is actually deposited
> in: http://oaister.umdl.umich.edu/o/oaister/
>
> (2) Not only are institutions in the best position to vet their own
> research output before approving deposits in their own institutional
> archives (probably on a departmental basis, optimally)
> http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
> but this vetting load is much better shouldered in a distributed way,
> rather than having one centralized vettor for all of the planet's research
> output (in physics, mathematics, or other disciplines).
>
> (3) Having institutional self-archived research output housed in the
> institution's own archives also immunizes the archive from external
> liabilities (such as plagiarizers from other institutions) but it also
> makes it even more clear that -- contrary to what the Nature article
> says it is, and perhaps contrary even to what the Physics ArXiv *thinks*
> it is -- open-access archives are not *publishers*! They are merely a
> means of providing open access to (refereed) publications (as well as
> to their precursor unrefereed preprints).
>
> "Garfield: 'Acknowledged Self-Archiving is Not Prior Publication'"
> http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2239.html
>
> For those who needed a reminder of it, research's "publish or perish"
> mandate is *not* "self-archive or perish"! "Publication" refers to
> certification as having met the known peer-review quality standards of
> a journal, not to having pressed the click button to self-archive an
> unrefe

Re: Central vs. Distributed Archives

2003-11-06 Thread Stevan Harnad
Yet another piece of evidence has appeared that seems to confirm that
whereas central archiving was historically the way in which self-archiving
began, it is not the fastest or best form for it to grow and spread today:

The Nature headline is (as usual for the press) an exaggeration:

"Critical comments threaten to open libel floodgate for physics
archive"

http://www.nature.com/cgi-taf/Dynapage.taf?file=/nature/journal/v426/n6962/full/426007b_fs.html

and so is SciDevNet's:

"Legal concerns plague open access physics archive"
http://www.scidev.net/news/index.cfm?fuseaction=readnews&itemid=1087&language=1

but the facts seem to be that, across the years, some papers that
contained plagiarism or libel might have found their way into ArXiv's vast
(250,000 papers) and unvetted collection.  http://www.arxiv.org

I said "unvetted," but of course almost all those papers are
also submitted to peer-reviewed journals, which *do* vet them,
and when there have been any corrections to the unrefereed
preprint, the authors self-archive the refereed postprint too:
http://opcit.eprints.org/tdb198/opcit/

So the (tiny) problem of plagiarism and libel is with papers that have
*not* been peer-reviewed.

ArXiv can make an effort to vet its daily submissions for plagiarism or
libel, but at nearly 4000 per month, this would be quite a task:
http://arxiv.org/show_monthly_submissions

So the natural conclusions to draw from this seem to be the following:

(1) OAI-interoperability has now made all OAI-compliant archives
equivalent: They can all be harvested and jointly searched. It no
longer makes any difference which archive a paper is actually deposited
in: http://oaister.umdl.umich.edu/o/oaister/

(2) Not only are institutions in the best position to vet their own
research output before approving deposits in their own institutional
archives (probably on a departmental basis, optimally)
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
but this vetting load is much better shouldered in a distributed way,
rather than having one centralized vettor for all of the planet's research
output (in physics, mathematics, or other disciplines).

(3) Having institutional self-archived research output housed in the
institution's own archives also immunizes the archive from external
liabilities (such as plagiarizers from other institutions) but it also
makes it even more clear that -- contrary to what the Nature article
says it is, and perhaps contrary even to what the Physics ArXiv *thinks*
it is -- open-access archives are not *publishers*! They are merely a
means of providing open access to (refereed) publications (as well as
to their precursor unrefereed preprints).

"Garfield: 'Acknowledged Self-Archiving is Not Prior Publication'"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2239.html

For those who needed a reminder of it, research's "publish or perish"
mandate is *not* "self-archive or perish"! "Publication" refers to
certification as having met the known peer-review quality standards of
a journal, not to having pressed the click button to self-archive an
unrefereed draft in an open-access archive! That meets the (trivial)
legal definition of "publishing," to be sure -- even hand-writing it
on paper once and showing it to someone does! But it certainly doesn't
meet the definition of what the research community (and promotion/salary
committees, and research-funding councils) means by "publication,"
which is to be certified by a qualified, neutral third-party as having
met its known standards of peer review. At best, the self-archiving
of an unrefereed draft qualifies as vanity-press *self-publication* --
but that is precisely what researchers' institutions and their "publish
or perish" mandates are there in order to *protect* their researchers
from doing! (Or rather, to ensure that they go on to get their papers
properly peer-reviewed and certified as having met the peer-review
standards of the particular journal that accepted the paper.)

By the same token, it is each researcher's own institution -- not a
centralized entity like ArXiv -- that is in the best position to prevent
its own researchers (and themselves) from self-archiving plagiarized or
libellous papers -- and to take action if they do.

Having said that, the Physics ArXiv's "legal concerns" are all a tempest
in a teapot anyway. A central archive is a service provider. The service
it provides is to operate an archive for authors to self-archive in. If
an author self-archives a piece of plagiarism or libel therein, the only
legal responsibility of the archive is to *remove* that item as soon as
it is drawn to its attention. This is exactly the same rule as the one
applied to other Internet service providers: If someone posts or emails
pornography in an AOL discussion list or bulletin board, AOL does not
become liable as a pornographer if it immediately removes the item
as soon as it is drawn to its attention and blocks further postings
from th

Re: Central vs. Distributed Archives

2003-10-31 Thread Dr.Vinod Scaria
Stevan Harnad wrote:

> Just as it was counterproductive to vilify toll-access publishers
> (instead of either founding open-access journals or self-archiving),
> so it is counterproductive to vilify open-access publishers (instead
> of either founding competing open-access journals or self-archiving).

It is also counterproductive to ignore the authors from the developing
world who have been always kept away from the mainstream.
I am not against the "author pays" model, just against the lack of
flexibility in its operation. The majority of researchers in developing
countries have never had the luxury of being funded. [Our own study
(unpublished) on authors publishing in top Indian journals indexed in
MEDLINE shows that more than 90% have had no funding for their research,
and those who had it had only a minuscule fraction of what is considered
*funding* in the developed countries.] This would simply mean they would
never be able to pay from their funds!

There could be other viable models, like paying a fixed percentage of
funds for publishing. This would sound more appealing to researchers too.
It would also mean publishers could easily subsidize research from
developing countries as well as researchers from developed countries who
are not funded.

> So is the "monopolistic" objection that BMC and PLoS have more start-up
> support, giving them an advantage over journals without that support,
> or is the objection that they have an "author pays" model, unaffordable
> for some authors?

The heavy start-up support gives them a clear edge over new
and existing publishers. PLoS Biology would not otherwise have received
the popularity and access [the traffic nearly brought down
their elegant homepage to just a couple of links on the day of
inauguration]. And the PLoS fund would have been better used to support
lobbying -- http://bmj.bmjjournals.com/cgi/content/full/326/7392/766#art --
rather than entering into a neck-and-neck fight with existing publishers.
If it was really interested in supporting Open Access, it should have
supported Journal of Biology, an Open Access Journal from BMC.

> And the same can be said about volunteer-service-based journals:
> It is too early to say whether they can last on volunteerism alone,
> let alone whether volunteerism can scale up to all 24,000 refereed
> journals!

Just imagine the scalability if the Internet were monopolised by
some company! The whole spectrum of resources we access with a
click was created by volunteerism, donations and public money.
Does PubMed/PubMedCentral make any profit?

> Perhaps a far better choice would have been to require all your authors
> to (1) try to self-archive their articles at their own institutions, and
> only in those cases where that failed, (2) to self-archive them in
> CogPrints or another suitable OAI-compliant archive. Offloading the
> self-archiving task onto the distributed authorship instead of the
> journal staff would take some of the load off the volunteer efforts
> (hence costs) involved!
>
> That policy would also have the benefit of spreading the practise of
> self-archiving by authors, as well as archive-provision by their
> institutions.

And yes! We actually plan to provide the authors with PDF reprints which
they could archive on their own. We did it ourselves just because we
needed to see that the whole thing got started. We are also encouraging
authors to republish them on their institutional websites/repositories or
their own websites, in addition to our existing archive at Cogprints.

> These are the vulnerabilities of new journals; they have nothing to do
> with open-access.

The sudden disappearance of a journal website would not have been so
desperate a matter if the journal were open access and someone had
copied it somewhere. [Some of the JMIR articles became available at
http://www.cybermedicine.netfirms.com (I own and maintain this site)
after it became open. I have also seen a number of similar websites
offering JMIR content.] This would mean one could access it just by
searching for the keywords on Google or any major search engine for that
matter. That would not be the situation for a journal which is toll-access.

Dr. Vinod Scaria
http://www.drvinod.netfirms.com
MAIL:vinodsca...@yahoo.co.in
Tel: +91 98474 65452


Re: Central vs. Distributed Archives

2003-10-31 Thread Stevan Harnad
  "Trends in Self-Posting of Research Material Online by Academic Staff"
   Theo Andrew supplies a case study from the University of Edinburgh.
http://www.ariadne.ac.uk/issue37/andrew/

This is a survey preceding a series of SHERPA eprint self-archiving
projects http://www.sherpa.ac.uk/ to be implemented at Edinburgh.

"Prior to the implementation of these projects at the University of
Edinburgh, it was decided that a baseline survey of research material
already held on departmental and personal Web pages in the ed.ac.uk
domain"

The main conclusion of this advance survey was that:

(1) "an unexpectedly high volume of research material (over 1000
peer-reviewed journal articles) exists online in the ed.ac.uk domain"

and

(2) "there is a direct correlation between willingness to self-archive
and the [prior] existence of subject-based [non-Edinburgh]
repositories"

It is perhaps unsurprising that the Edinburgh disciplines that are the
most advanced in self-archiving are the ones that are also most advanced
globally, having their own central, discipline-based archives (elsewhere).
That said, 1000 is still a small number (relative to Edinburgh's annual
output), and now going on to establish departmental eprint archives at
Edinburgh will further promote self-archiving at Edinburgh, especially
if Edinburgh and the UK Research Funding Councils adopt a systematic
open-access policy along the lines of the Berlin Declaration:

http://www.ecs.soton.ac.uk/~harnad/Temp/berlin.htm
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.ariadne.ac.uk/issue35/harnad/

The article goes on to note:

"The big problem is that this material is widely dispersed and
therefore not easily found. This is not very useful for the wider
dissemination of scholarly work. Also, personal Web sites tend to
be ephemeral..."

This refers to the 1000 articles self-archived at Edinburgh *before* the
forthcoming Edinburgh eprint archives are implemented. The upcoming
archives will presumably be OAI-compliant -- http://www.openarchives.org
-- thereby solving the problem of dispersal and interoperability that
besets arbitrary websites.
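
To make the interoperability point concrete, here is a minimal sketch of
what an OAI service provider does with records harvested from several
archives: it reads the shared Dublin Core fields and merges them into one
searchable list, regardless of where each paper was deposited. Only the
XML namespaces are taken from the OAI-PMH and Dublin Core specifications;
the helper names, the example.org identifiers, the sample titles and the
abbreviated response layout (no responseDate, request or datestamp
elements) are assumptions made for illustration, and no network harvesting
is attempted.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def fake_list_records(identifier, title, creator):
    """Build an abbreviated, made-up ListRecords response for the demo."""
    return f"""
<OAI-PMH xmlns="{NS['oai']}">
  <ListRecords>
    <record>
      <header><identifier>{identifier}</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="{NS['oai_dc']}" xmlns:dc="{NS['dc']}">
          <dc:title>{title}</dc:title>
          <dc:creator>{creator}</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_list_records(xml_text, archive_name):
    """Extract identifier, title and creators from each harvested record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "archive": archive_name,
            "identifier": rec.findtext("oai:header/oai:identifier",
                                       default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text for c in rec.iterfind(".//dc:creator", NS)],
        })
    return records


def search(records, term):
    """Case-insensitive title search across all merged records."""
    return [r for r in records if term.lower() in r["title"].lower()]


# Two hypothetical archives, one departmental and one central.
dept_xml = fake_list_records("oai:dept.example.org:1",
                             "Trends in self-archiving", "A. Author")
central_xml = fake_list_records("oai:central.example.org:42",
                                "Conductance of thin films", "B. Author")

merged = (parse_list_records(dept_xml, "departmental archive")
          + parse_list_records(central_xml, "central archive"))
hits = search(merged, "self-archiving")
for hit in hits:
    print(hit["archive"], hit["identifier"], hit["title"])
```

Because both archives expose the same metadata format, the search code
never needs to know which archive a record came from; that is the sense
in which dispersal stops being a problem once the archives are
OAI-compliant.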

As these self-archived articles will be duplicates of the published
version, self-archived in order to provide immediate open access, the
primary preservation problem will not be theirs; it will be the problem of
the producers and purchasers of the publishers' proprietary version. The
self-archived versions in the Physics ArXiv, for example, have
lasted twelve years now, and been successfully retrofitted for
OAI-compliance. There is every reason to believe that the growth of
self-archived content itself will be the best guarantor that we will
see for its perennity.

Oddly, there is no reference in this article to Edinburgh's own
most important existing eprint archive, already OAI-compliant,
and containing 10% of Edinburgh's current self-archived articles:
http://archive.ling.ed.ac.uk/ (There seems to be some confusion
of its contents with those of a non-Edinburgh archive --
http://cogprints.ecs.soton.ac.uk/ -- which overlaps with it in subject
matter).

There is also no reference to any prior usage surveys, such as:
http://www.eprints.org/results/
http://opcit.eprints.org/opcitevaluation.shtml

It is unfortunate that the title refers to "self-posting" whereas the
more widely used term "self-archiving" is used throughout the text
itself: Why proliferate needless and confusing synonyms? [The title may
have been an unwise editorial suggestion that the author should have
declined!]

Stevan Harnad

NOTE: Complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):

http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html
Posted discussion to: american-scientist-open-access-fo...@amsci.org

Dual Open-Access Strategy:
BOAI-2: Publish your article in a suitable open-access journal
whenever one exists.
BOAI-1: Otherwise, publish your article in a suitable toll-access
journal and also self-archive it.
http://www.soros.org/openaccess/read.shtml
http://www.eprints.org/signup/sign.php


Re: Central vs. Distributed Archives

2003-10-30 Thread Michael Eisen
I would like you to defend your claim that PLoS is "crunching" small
publishers. Can you provide an example?

- Original Message -
From: "Dr. Vinod Scaria" 
To: 
Sent: Thursday, October 30, 2003 9:07 AM
Subject: Re: Central vs. Distributed Archives


> CALICUT MEDICAL JOURNAL
>  http://www.calicutmedicaljournal.org
> ARCHIVES AT COGPRINTS
> ***
>
> As we all know, Open Access Publishing is not gaining the momentum as
> far as Journals published from Developing Countries are concerned [with
> reference to western Journals]. Many reasons can be attributed like:
>
> 1. Monopolistic nature of Open Access Publishers like BioMedCentral
> http://www.biomedcentral.com which pursues the "author pays" model
> and would drive away any author from Developing countries. Thus
> obviously publishers from Developing countries would have second
> thoughts before starting one at BMC.
>
> By monopolistic, I mean the almost complete control over open
> access publishing, say about >75% of Open Access Journals in Medicine.
> And mega organisations like PLoS are crunching the small publishers, as
> they can easily override the smaller ones with the mega funding they have.
> see: http://bmj.bmjjournals.com/cgi/content/full/326/7392/766#art
>
> 2. As I previously stated in my Editorial in Internet Health-
> www.virtualmed.netfirms.com/internethealth/articleapril03.html ,
> the fear of losing revenues, which are the sole source of sustenance
> of many Journals [though some make a meagre profit].
>
> 3. Lack of sufficient expertise and
> exposure to Open Access Publishing. >>
> www.virtualmed.netfirms.com/internethealth/opinion0303.html
> http://bmj.com/cgi/eletters/326/7382/182/b <<
>
> But recent developments are worth mentioning - at least from India. Online
> Journal of Health and Allied Sciences www.ojhas.org , India's first
> Online BioMedical journal declared a couple of months back that they
> would go Open.
>
> [I am in the Editorial board of OJHAS from Sept 2003]. OJHAS is
> edited and published by a small group of scholars with no external
> support. Everything from Web Design to Editing and Review is done
> voluntarily by the Editorial team. It also stands as a fine example of
> the fact that Open Access Journals can indeed be successfully organised
> and can indeed survive without an "author pays" model.
>
> Now coming to the Archival, Cogprints was our first choice for many reasons
>
> 1] It offers interoperability [as mentioned by Harnad]
> 2] It offers unmatched popularity
> 3] It has been there for years and we can be sure of the permanence
> 4] It is of course FREE.
>
> And as Harnad suggested, there is no reason why Journals should not
> be archived at Open Archives, be it self maintained repositories or
> Centralised ones. In fact Open Archiving of electronic journals is
> the need of the hour because our own studies [unpublished] show that
> Electronic journals are just as ephemeral as websites. Scholarly
> communication should never be lost at the cost of copyright
> restrictions. Many of these journals have perhaps done more harm than
> good by locking the access by copyright restrictions.
>
> Moreover, electronic journals are equally vulnerable to the vagaries
> of the Internet. For example, JMIR www.jmir.org went suddenly offline
> some time back [I think it was a year or so] making the whole content
> inaccessible. [But it reappeared later and now is an Open Access Journal].
>
> Thus in short, Open Archiving of Journals as a whole is perhaps to be
> discussed in a wider perspective than just making it OPEN. The major
> emphasis should be the PERMANENCE of Open Archiving. I hope this post will
> surely trigger a debate on the topic.
>
> Kind regards
>
> Dr. Vinod Scaria
> Executive Editor: Calicut Medical Journal
> Assoc Editor: Online Journal of Health and Allied Sciences
> Editor in Chief: Internet He@lth
>
> WEB: www.drvinod.netfirms.com
> MAIL: vinodscaria@yahoo. co. in
> Mobile: +91 98474 65452
>
> - Original Message -
> From: Stevan Harnad
> To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM@LISTSERVER.SIGMAXI.ORG
> Sent: Wednesday, October 29, 2003 3:38 AM
> Subject: Re: Central vs. Distributed Archives
>
> The two items that follow below are by Vinod Scaria from Peter Suber's
> Open Access News http://www.earlham.edu/~peters/fos/fosblog.html
>
> It provides an interesting and inspiring example of the power
> and value of OAI-interoperability http://www.openarchives.org/
> and the interdependence of the two op

Re: Central vs. Distributed Archives

2003-10-30 Thread Stevan Harnad
> But recent developments are worth mentioning - at least from India. Online
> Journal of Health and Allied Sciences www.ojhas.org , India's first
> Online BioMedical journal declared a couple of months back that they
> would go Open.
>
> [I am in the Editorial board of OJHAS from Sept 2003]. OJHAS is
> edited and published by a small group of scholars with no external
> support. Everything from Web Design to Editing and Review is done
> voluntarily by the Editorial team. It also stands as a fine example of
> the fact that Open Access Journals can indeed be successfully organised
> and can indeed survive without an "author pays" model.

There exists no firm evidence at all at the moment as to whether or
not Open Access journals can survive, with or without an "author pays"
model. Subsidized journals are subsidized journals, and depend on the
survival of the subsidy, not the journal. "Author pays" journals have
been around for far too short a time for us to know whether they can
survive. And the same can be said about volunteer-service-based journals:
It is too early to say whether they can last on volunteerism alone,
let alone whether volunteerism can scale up to all 24,000 refereed
journals!

> Now coming to the Archival, Cogprints was our first choice for many reasons
>
> 1] It offers interoperability [as mentioned by Harnad]
> 2] It offers unmatched popularity
> 3] It has been there for years and we can be sure of the permanence
> 4] It is of course FREE.

Perhaps a far better choice would have been to require all your authors
to (1) try to self-archive their articles at their own institutions, and
only in those cases where that failed, (2) to self-archive them in
CogPrints or another suitable OAI-compliant archive. Offloading the
self-archiving task onto the distributed authorship instead of the
journal staff would take some of the load off the volunteer efforts
(hence costs) involved!

That policy would also have the benefit of spreading the practise of
self-archiving by authors, as well as archive-provision by their
institutions.

> And as Harnad suggested, there is no reason why Journals should not
> be archived at Open Archives, be it self maintained repositories or
> Centralised ones. In fact Open Archiving of electronic journals is
> the need of the hour because our own studies [unpublished] show that
> Electronic journals are just as ephemeral as websites. Scholarly
> communication should never be lost at the cost of copyright
> restrictions. Many of these journals have perhaps done more harm than
> good by locking the access by copyright restrictions.

This is too vague: For toll-access journals, the preservation burden for
their contents (both the paper version and the online version) is
squarely on the shoulders of the journals that sell them and the
libraries that buy them. The self-archived versions of toll-access
journal articles are merely *duplicates,* provided for access, and it is
a strategic mistake to make an issue of concerns about their long-term
preservation. Those duplicates have lasted over 12 years already and
they will continue to last long enough to be retrofitted with whatever
solution the open-access era may eventually generate, if/when it prevails.

But the fact that new journals (whether paper or online) come and go is
a different problem. Journals should be archival in the sense that they
continue to exist. If they just make an appearance for a few months or
years and then vanish, then they are merely scattered collections of
items, and the preservation of such orphan items is a problem independent
of the problem of open access.

> Moreover, electronic journals are equally vulnerable to the vagaries
> of the Internet. For example, JMIR www. jmir. org suddenly went offline
> some time back [I think it was a year or so ago], making the whole content
> inaccessible. [But it reappeared later and is now an Open Access Journal].

These are the vulnerabilities of new journals; they have nothing to do
with open-access.

> Thus in short, Open Archiving of Journals as a whole should perhaps be
> discussed in a wider perspective than just making it OPEN. The major
> emphasis should be the PERMANENCE of Open Archiving. I hope this post will
> trigger a debate on the topic.

Preservation and access are -- for the time being -- very different
matters. The pressing problem for authors of the toll-access literature
today is access-denial and impact-loss, not preservation. It is a
mistake to conflate the open access problem with the digital preservation
problem, and it helps neither open access nor digital preservation.

Stevan Harnad

> Kind regards
>
> Dr. Vinod Scaria
> Executive Editor: Calicut Medical Journal
> Assoc Editor: Online Journal of Health and Allied Sciences
> Editor in Chief: Internet He@ lth
>
> WEB: www. drvinod. n

Re: Central vs. Distributed Archives

2003-10-30 Thread Dr. Vinod Scaria
CALICUT MEDICAL JOURNAL
 http://www.calicutmedicaljournal.org
ARCHIVES AT COGPRINTS
***

As we all know, Open Access Publishing is not gaining as much momentum
among Journals published from Developing Countries as it is among
western Journals. Many reasons can be suggested, such as:

1. The monopolistic nature of Open Access Publishers like BioMedCentral
http://www. biomedcentral.com which pursue the "author pays" model
and would drive away any author from Developing countries. Thus
publishers from Developing countries would obviously have second
thoughts before starting a journal at BMC.

By monopolistic, I mean the almost complete control over open
access publishing, say more than 75% of Open Access Journals in Medicine. And
mega organisations like PLOS are crushing the small publishers, as they
can easily override the smaller ones with the mega funding they have.
see: http://bmj.bmjjournals.com/cgi/content/full/326/7392/766#art

2. As I previously stated in my Editorial in Internet Health-
www. virtualmed. netfirms. com/internethealth/articleapril03. html ,
the fear of losing revenue, which is the sole source of sustenance
of many Journals [though some make a meagre profit].

3. Lack of sufficient expertise in and
exposure to Open Access Publishing. See:
www. virtualmed. netfirms. com/internethealth/opinion0303. html
http://bmj. com/cgi/eletters/326/7382/182/b

But recent developments are worth mentioning - at least from India. Online
Journal of Health and Allied Sciences www. ojhas. org , India's first
Online BioMedical journal declared a couple of months back that they
would go Open.

[I have been on the Editorial board of OJHAS since Sept 2003]. OJHAS is
edited and published by a small group of scholars with no external
support. Everything from Web Design to Editing and Review is done
voluntarily by the Editorial team. It also stands as a fine example of
the fact that Open Access Journals can indeed be successfully organised
and can indeed survive without an "author pays" model.

Now coming to the Archival, Cogprints was our first choice for many reasons

1] It offers interoperability [as mentioned by Harnad]
2] It offers unmatched popularity
3] It has been there for years and we can be sure of the permanence
4] It is of course FREE.

And as Harnad suggested, there is no reason why Journals should not
be archived at Open Archives, be it self maintained repositories or
Centralised ones. In fact Open Archiving of electronic journals is
the need of the hour because our own studies [unpublished] show that
Electronic journals are just as ephemeral as websites. Scholarly
communication should never be lost at the cost of copyright
restrictions. Many of these journals have perhaps done more harm than
good by locking the access by copyright restrictions.

Moreover, electronic journals are equally vulnerable to the vagaries
of the Internet. For example, JMIR www. jmir. org suddenly went offline
some time back [I think it was a year or so ago], making the whole content
inaccessible. [But it reappeared later and is now an Open Access Journal].

Thus in short, Open Archiving of Journals as a whole should perhaps be
discussed in a wider perspective than just making it OPEN. The major
emphasis should be the PERMANENCE of Open Archiving. I hope this post will
trigger a debate on the topic.

Kind regards

Dr. Vinod Scaria
Executive Editor: Calicut Medical Journal
Assoc Editor: Online Journal of Health and Allied Sciences
Editor in Chief: Internet He@ lth

WEB: www. drvinod. netfirms. com
MAIL: vinodscaria@yahoo. co. in
Mobile: +91 98474 65452

- Original Message -
From: Stevan Harnad
To: AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM@LISTSERVER. SIGMAXI. ORG
Sent: Wednesday, October 29, 2003 3:38 AM
Subject: Re: Central vs. Distributed Archives

The two items that follow below are by Vinod Scaria from Peter Suber's
Open Access News http://www. earlham. edu/~peters/fos/fosblog. html

It provides an interesting and inspiring example of the power
and value of OAI-interoperability http://www. openarchives. org/
and the interdependence of the two open-access strategies (open-access
self-archiving and open-access journal publishing) that this new online
open-access journal, produced in India, is being made accessible
by archiving it http://calicutmedicaljournal. org/archives. html
in a specially created sector of CogPrints in the UK,
http://cogprints. ecs. soton. ac. uk/view/subjects/JOURNALS. html
a multidisciplinary central archive created in 1997 for author
self-archiving (which is now being done more via distributed institutional
eprint archives -- to which the CogPrints software was adapted by Rob
Tansley, creator of eprints http://software. eprints. org/#ep2 and then
of dspace http://www. dspace. org/ -- rather than via central ones like
CogPrints). 

Re: Central vs. Distributed Archives

2003-10-28 Thread Stevan Harnad
The two items that follow below are by Vinod Scaria from Peter Suber's
Open Access News http://www.earlham.edu/~peters/fos/fosblog.html

It provides an interesting and inspiring example of the power
and value of OAI-interoperability http://www.openarchives.org/
and the interdependence of the two open-access strategies (open-access
self-archiving and open-access journal publishing) that this new online
open-access journal, produced in India, is being made accessible
by archiving it http://calicutmedicaljournal.org/archives.html
in a specially created sector of CogPrints in the UK,
http://cogprints.ecs.soton.ac.uk/view/subjects/JOURNALS.html
a multidisciplinary central archive created in 1997 for author
self-archiving (which is now being done more via distributed institutional
eprint archives -- to which the CogPrints software was adapted by Rob
Tansley, creator of eprints http://software.eprints.org/#ep2 and then
of dspace http://www.dspace.org/ -- rather than via central ones like
CogPrints). Yet there is no reason a central archive like CogPrints (or,
for that matter, any of the distributed institutional archives) cannot
provide a locus for open-access journals too! OAI-interoperability
means that they will all be picked up and integrated by cross-archive
harvesters like OAIster! http://oaister.umdl.umich.edu/o/oaister/
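
As a rough illustration of the first step such a cross-archive harvester takes, the sketch below builds OAI-PMH ListRecords requests for several archives. The `verb`, `metadataPrefix` and `set` parameters belong to the OAI-PMH protocol; the base URLs and the function name are hypothetical placeholders, and nothing is fetched over the network.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for one archive."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional: restrict the harvest to one set defined by the archive.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical base URLs: one central archive, one institutional archive.
ARCHIVES = [
    "http://central-archive.example.org/oai",
    "http://university.example.edu/oai",
]

# The same request shape works for every compliant archive,
# whether it is central or institutional.
urls = [list_records_url(base) for base in ARCHIVES]
for url in urls:
    print(url)
```

Because every compliant archive answers the same request, the harvester does not need to know whether a given archive holds self-archived articles or a whole journal.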

-

1. The Editorial of the Inaugural issue of Calicut Medical
Journal- Online, open access journals: the only hope for the future
http://calicutmedicaljournal.org/2003;1(1)e1.htm discusses in detail how
and why Calicut Medical Journal supports the Open Access initiatives. In
his editorial, Dr Ramachandran stresses the need to disseminate knowledge
in the widest possible sphere, and especially between scholars of other
developing countries, and asserts that Open Access is the best possible
solution to achieve this goal. The Editorial also criticises the widely
publicised "author pays" model as "discouraging" for scholars from the
developing world and states it would badly affect the already low level
of publications from these countries. It also discusses the various
advantages of being Online and Open. He also asserts the need for more
regional Open Access Journals to meet the specific demands of scholars
and clinicians and for the maintenance and enhancement of the quality of
health services. The editorial concludes with the statement that Calicut
Medical Journal would play a dual role - being International by being
online, Open and upholding the highest standards of publication, and at
the same time catering to the needs of Indian Scholars and Clinicians.
Posted by Vinod Scaria at 12:27 PM.


2. The Calicut Medical Journal is Online http://calicutmedicaljournal.org/
The much awaited Calicut Medical Journal is Online. The new Open Access
BioMedical Journal published by the Calicut Medical College Alumni
Association, is the second Indian Open Access BioMedical Journal. With
new Open Access medical Journals coming up in India, existing publishers
are already feeling the heat of competition. While these two Open
Access Journals offer online acceptance of manuscripts, speedy peer
review and almost instant publication, with a host of utilities, and
of course without a price tag, other publishers are still in the dark with
their outdated modes of peer review and publication. The web statistics of
these Journals are telltale signs of the fact that Open Access Publications
are widely embraced. Being Open Access, these Journals also aim to have
an International impact, which was hitherto virtually impossible in the
conventional publishing model.
Posted by Vinod Scaria at 12:22 PM.


Re: Central vs. Distributed Archives

2003-09-10 Thread Stevan Harnad
Ebs Hilf -- who will host a meeting on the subject next week:
http://physnet.physik.uni-oldenburg.de/projects/SINN/sinn03/programme.html
-- confirms that the rate of growth of the biggest and oldest open-access
archive -- the Physics Arxiv -- is still far, far too slow. I entirely
agree.

This does not detract from the credit due to Arxiv for having been the
first; but now, 12 years down the road, this unchangingly slow rate
suggests that something more may be needed than what has been feeding
Arxiv across the years, and my own guess (and Ebs's) is that that
something more may well be distributed institution-based self-archiving,
instead of Arxiv's central discipline-based self-archiving.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html

The reason institutional self-archiving is more likely to speed up
self-archiving and to generalize it across disciplines is that
researchers and their institutions both share the benefits of the impact
of their research output, whereas researchers and their disciplines do
not. It is not the discipline that exercises the incentive of
the "publish-or-perish" carrot-and-stick on researchers, it is their
research institutions. As the co-investor in and co-beneficiary of the
rewards of research impact (research funding, overheads, reputation,
prizes) the researcher's institution is in a position to mandate not only
"publish or perish" but "publish with maximal impact" -- which means
maximal access, which means open access, which means self-archiving.
http://www.ariadne.ac.uk/issue35/harnad/

I think on all this we agree with Ebs Hilf. Ebs too notes the likely
remedy for the sluggish growth rate of self-archiving in physics:
institutional (indeed, departmental) self-archiving. What is needed to
accelerate that is compelling empirical demonstrations of the correlation
between access and impact, to make researchers and their institutions
realize that self-archiving is in their own interest (and how much so)
-- in all disciplines.

There is, however, in Ebs's summary below, a rather important and
potentially misleading ambiguity: He conflates self-archiving with
publishing -- referring to depositing papers in Arxiv as "publishing"
them, in contrast to "self-archiving" them in institutional eprint
archives. But surely *both* of these are self-archiving and not
publishing! The publishing is done in the journals (in both cases). The
self-archiving is merely the provision of a supplementary version of
the paper, its full-text accessible online toll-free for all would-be
users webwide (in either a central discipline-based eprint archive or in
distributed institution-based eprint archives).

Both central disciplinary archives like Arxiv and distributed
institutional archives include, in addition to the all-important
peer-reviewed, published version of each article (the
"postprint") also the pre-peer-review preprint version(s) and
sometimes also postpublication updated and enhanced versions
("post-postprints"). But the critical version, and the one that
counts as the publication, is of course the published postprint:
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2239.html .
That (and not unpublished preprints or revisions) is what
"publish-or-perish" is all about!
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.4

But apart from these minor points, I don't think Ebs and I disagree. Here
is the quote/commentary:

On Wed, 10 Sep 2003, Eberhard R. Hilf wrote:

> Dear Stevan and the list members,
> here are some arguments for
> 1. All physicists will publish in the ArXiv not before the year 2050,
> although the arxiv size is growing quadratically, not linearly with time.
> Earlier estimates [St. Harnad,
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
> slide 25 are to be revised].
> [see http://isn-oldenburg.de/~hilf/ ]

If readers look at slide 25 above, they will find that according to
Ebs's estimate (which I accept!), it would have to be revised to extend
the linear growth to 2050 instead of 2020. According to Ebs, at the
present growth rate, 2050 would be the first year in which *all* physics
articles published in that year are self-archived in Arxiv.

But note that that's *self-archived* in Arxiv, not *published* in Arxiv:
There is absolutely no reason to believe that all those articles will
not continue (*exactly* as they all do now) being published in the
appropriate peer-reviewed journal for their area and their quality-level.
("Publication" will continue to mean, as it does now, peer-review and
certification of having met that journal-name's quality standards.)

And the portion of total annual published journal
article output in physics that is self-archived will grow (linearly!) from
now till it reaches 100% in 2050, at exactly the same unchanging rate
at which it has been growing for 12 years now.

> 2. Usage of repositories seems to be proportional to their size,
> but independent of absolute size.
> The full text you find at
>

Re: Central vs. Distributed Archives

2003-09-10 Thread Eberhard R. Hilf
Dear Stevan and the list members,
here are some arguments for
1. All physicists will publish in the ArXiv not before the year 2050,
although the arxiv size is growing quadratically, not linearly with time.
Earlier estimates [St. Harnad,
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm
slide 25 are to be revised].

2. Usage of repositories seems to be proportional to their size,
but independent of absolute size.
The full text you find at
http://www.isn-oldenburg.de/~hilf/publications/arxiv-analyis.ps

   physicists will publish in the ArXiv not before the year 2050
Here are some more elaborate but rather audacious risky estimates
(P.Ginsparg would know better).

The ArXiv is unique in that it serves its own usage and submission logs.

At present (after 146 months of service) there are 246,555 documents
stored.
The monthly rate of incoming new documents is at present 3,500. It rises
linearly with time, see
http://arxiv.org/show_monthly_submissions
Each month, about 24 more papers are handed in than in the month
before.

This allows us to integrate the rate to estimate the future time at which
virtually all physicists would send their prime papers to the ArXiv.

Let us estimate the number of physicists worldwide to be 1,000,000,
of which 10% might be active as researchers, producing, say, 2 papers
per year. Then we have 200,000 prime physics papers per year.
Integrating this, we would see them all in the ArXiv 44 years and
six months from now, that is, in the year 2050.
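
This extrapolation can be roughly checked with a short sketch. It assumes, as a simplification that may differ from the integration actually used, that the monthly submission rate keeps rising by a constant 24 papers per month and that "all papers" means a rate of 200,000 per year; the function and constant names are illustrative only. Under these assumptions the target rate is reached after roughly 46 years, in the same range as the 44 years and six months quoted, and a rate rising linearly from zero over 146 months gives a total close to the stored count.

```python
# Figures taken from the message; everything else is an assumption.
MONTHS_ELAPSED = 146            # months of service so far
MONTHLY_RATE_NOW = 3500         # papers submitted per month at present
MONTHLY_RATE_INCREASE = 24      # extra papers per month, each month
PAPERS_PER_YEAR_TARGET = 200_000


def months_until_rate(target_per_month, rate_now, increase_per_month):
    """Months until a linearly rising monthly rate reaches the target."""
    return (target_per_month - rate_now) / increase_per_month


def cumulative_total(months, increase_per_month):
    """Total papers if the rate rose linearly from zero: a*t^2/2 (quadratic)."""
    return 0.5 * increase_per_month * months ** 2


target_per_month = PAPERS_PER_YEAR_TARGET / 12
months = months_until_rate(target_per_month, MONTHLY_RATE_NOW,
                           MONTHLY_RATE_INCREASE)
years = months / 12

print(f"target rate reached in about {years:.1f} years")
print(f"implied total so far: {cumulative_total(MONTHS_ELAPSED, MONTHLY_RATE_INCREASE):,.0f}")
```

The second figure also shows why a linearly rising submission rate means quadratic growth of the total holdings.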


Clearly, by then we will have passed through more technical revolutions,
so that this steady-state extrapolation is not likely to hold.

Other new developments may spread much more steeply,
notably self-archiving by the authors, their institutes or universities,
and their libraries, forming a distributed net of repositories.

The advantages are scalability, flexibility, the business model
(distributed funding by the institutions of the creators of the
documents), the retention of the authors' rights, the possibility of
updates, and the spread of acceptance: to convince a large body such as a
learned community to set up a central service such as the ArXiv for
physics is much harder than to convince a percentage of local distributed
institutions and institutes (the chance of clearing multiple small
barriers versus one large one).

The challenges are to set up the needed international standards,
to allow intelligent search engines to serve the retrieval,
and to stimulate discussion and communication among the authors,
who have been known in the past to be very conservative and unreflective
about their working habits, not very talkative about them, and used to
being taken care of while someone else pays.

At present, the ArXiv is still unique in providing an unconditional time
stamp and long-term readability.

Is the usage proportional to the size of a repository?
Reach-out to and satisfaction of users of a repository may be estimated by
the ratio of pageviews per month divided by the number of documents.

This ratio is astonishingly similar for different repositories even
of widely different size, whether they contain documents or links.

For Marenet with its   1,595 links it is  1.9
for MPIV    with its   3,027 links it is  3.6
for Physnet with its   5,759 links it is  4.2
for VAB     with its   2,655 links it is 10.4
for ArXiv   with its 245,056 docs  it is 16.3
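
One way to put numbers on "astonishingly similar" is to compare how widely the repositories differ in size with how widely they differ in pageviews per item. The sketch below uses only the figures in the table; the `spread` helper is an illustrative name, and max/min is just one simple measure, chosen here as an assumption.

```python
# (name, number of links or documents, pageviews per month per item)
REPOSITORIES = [
    ("Marenet", 1595, 1.9),
    ("MPIV", 3027, 3.6),
    ("Physnet", 5759, 4.2),
    ("VAB", 2655, 10.4),
    ("ArXiv", 245056, 16.3),
]


def spread(values):
    """Ratio of the largest to the smallest value."""
    return max(values) / min(values)


size_spread = spread([size for _, size, _ in REPOSITORIES])
ratio_spread = spread([ratio for _, _, ratio in REPOSITORIES])

print(f"sizes differ by a factor of about {size_spread:.0f}")
print(f"pageviews per item differ by a factor of about {ratio_spread:.1f}")
```

On this measure the sizes span a factor of about 154 while the per-item usage spans a factor of under 9.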

All numbers are astonishingly low, as we know from libraries' usage of
journals and books.

Eberhard Hilf, h...@isn-oldenburg.de
Institute for Science Networking Oldenburg GmbH
at the Carl von Ossietzky University
http://www.isn-oldenburg.de

On Tue, 9 Sep 2003, Stevan Harnad wrote:

> On Mon, 8 Sep 2003, Eberhard R. Hilf wrote:
>
> > the physics ArXiv has a linear increase of the number of papers put in per
> > month, this gives a quadratic acceleration of the total content (growth
> > rate of Data base), not linear.
>
> Maybe so. But slide 25 of
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm (slide 25)
> still looks pretty linear to me. And it looks as if 100% was not only
> *not* reached at this rate 10 years after self-archiving started in
> physics in 1991, but it won't be reached for another 10 years or so...
>
> > Total amount by now may be at 10-15 % of all papers in physics.
>
> I count that as appallingly low, considering what is so easily
> feasible (though stunningly higher than any other field!)...
 >
> > Linear growth of input rate means the number of physicists and fields
> > using it rises, while in each field (and physicist) a saturation is
> > reached after a first exponential individual rise.
>
> Interesting, but the relevant target is 100% of physics (and all other
> disciplines) -- yesterday!
>
> > Never there will be a saturation such that all papers will go this way,
> > since in different fields culture and habits and requirements are
> > different. --
>
> I couldn't follow that: Never 

Re: Central vs. Distributed Archives

2003-09-09 Thread Thomas Krichel
  Hugo Fjelsted Alrøe writes

> By "community-building", I mean that such archives can contribute to the
> creation or development of the identity of a scholarly community in
> research areas that go across the established disciplinary matrix of the
> university world.

  This is crucial if self-archiving is to take off.


> I know the same thing can in principle be done with OAI-compliant
> university archives and a "disciplinary hub" or "research area hub", and
> in ten years time, we may not be able to tell the difference. But today,
> it is still not quite the same thing.

  Correct. This is a point that is too often overlooked.

  RePEc (see http://repec.org) provides an example of this in
  the area of economics. RePEc archives are not OAI-compliant
  but an OAI gateway exports all the RePEc data. Many RePEc
  services are in the business of community building. The
  crucial part, though, is RePEc's author registration service.



  Cheers,

  Thomas Krichel  mailto:kric...@openlib.org
  from Espoo, Finland      http://openlib.org/home/krichel
 RePEc:per:1965-06-05:thomas_krichel


Re: Central vs. Distributed Archives

2003-09-09 Thread Stevan Harnad
On Mon, 8 Sep 2003, Eberhard R. Hilf wrote:

> the physics ArXiv has a linear increase of the number of papers put in per
> month, this gives a quadratic acceleration of the total content (growth
> rate of Data base), not linear.

Maybe so. But slide 25 of
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.htm (slide 25)
still looks pretty linear to me. And it looks as if 100% was not only
*not* reached at this rate 10 years after self-archiving started in
physics in 1991, but it won't be reached for another 10 years or so...

> Total amount by now may be at 10-15 % of all papers in physics.

(10-15% of the annual output, I assume.)
I count that as appallingly low, considering what is so easily
feasible (though stunningly higher than any other field!)...

> Linear growth of input rate means the number of physicists and fields
> using it rises, while in each field (and physicist) a saturation is
> reached after a first exponential individual rise.

Interesting, but the relevant target is 100% of the annual output
of physics (and all other disciplines) -- yesterday!

> Never there will be a saturation such that all papers will go this way,
> since in different fields culture and habits and requirements are
> different. --

I couldn't follow that: Never 100%? Even at this rate? I can't imagine
why not. 

Cultural differences? Do any of the cultural differences between fields
correspond to indifference or antipathy toward research impact -- toward
having their research output read, used, cited? Unless the cultural
differences are specifically with respect to that, then they are
irrelevant.

Requirement differences? Are any universities or research funders
indifferent or averse to their researchers' impact? Unless they are,
any remaining requirement-differences are irrelevant. 

Habit differences? Well, yes, there are certainly those. But that is
just what this is all about *changing*! Are any field's current
access/impact practices optimal? or unalterable for some reason? If
not, then habit-change is (and always has been) the target!

And the point is that the rate of habit-change is still far too slow --
relative to what is not only possible, but easily done, and immensely
beneficial to research, researchers, etc. -- in all disciplines.

> [That is why it is e.g. best, to keep letter distribution by
> horses at a remote island (Juist) alive since the medieval times].

That I really couldn't follow! If you mean paper is still a useful back-up,
sure. But we're not talking about back-up. We are talking about open
online access, which has been reachable for at least a decade and a half
now, and OAI-interoperably since 1999. What more is the research cavalry
waiting for, before it will stoop to drink?

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
or
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2003-09-09 Thread Stevan Harnad
On Mon, 8 Sep 2003, Hugo Fjelsted Alrøe wrote:

> I think it is still too early to write off any of the possible paths to
> open access within the field of self-archiving (not that you do that). I
> see a potentially very fruitful role for community-building archives
> that focus on certain research areas. These could be facilitated or
> mandated by some of the specialized public research institutions that,
> together with universities and private companies, inhabit the research
> landscape. I think of research institutions oriented towards applied
> research within for instance environmental research, agriculture, public
> health, education, community development, etc. Here, there is a clear
> two-sided research communication: towards the public and towards other
> researchers in the field. Open access thus serves two communicative
> purposes, improving scholarly communication and improving public access
> to research results, besides the complementary purpose of institutional
> self-promotion.

All true. And certainly a national research centre like
France's CNRS or INSERM or INRA (where Helene Bosc is so active
http://phy043.tours.inra.fr:8080/ ) or Germany's Max-Planck Institutes
or Italy's CNR or NIH intramural research groups or even CERN's
distributed research community could each create (a kind of) central
archive consisting of its own research output. It is clear how an
institutional policy could mandate this, and how this would be in the
joint interests of the researchers and their institution -- whether a
university or a distributed national research centre. These national
research centres, after all, are the hosts of the research and the
sponsors of the research, sharing its costs and the credits.

But it is not clear to me how any other kind of central entity (apart
from a research funding agency) could mandate self-archiving: What would
be the shared carrots? And what would be the pertinent sticks? I
certainly can't imagine a Learned Society (other than a research funder
or a research publisher) being able to induce its members or
co-disciplinarians to self-archive in the way a university or national
research centre could induce its researchers to do so. (But maybe others
with better imaginations than mine can think of a credible causal scenario?)

> By "community-building", I mean that such archives can contribute to the
> creation or development of the identity of a scholarly community in
> research areas that go across the established disciplinary matrix of the
> university world.

It would be nice to see a new subdisciplinary or multidisciplinary field
consolidate its existence by self-archiving collectively. But wouldn't
founding their own journal or journals be the more likely way they would
go about it? Each researcher in the new sub- or multidisciplinary field
presumably has his own institution, hence potentially his own
institutional open-access archives, all linked by the glue of
OAI-interoperability. The new sub- or multidisciplinary name that unites
them simply amounts to another metadata tag in OAI subject-space. There
is no need for the papers to sit physically in the same place.
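
The idea that a field name is just one more metadata tag can be sketched as follows. The records, archive names and the `select_by_subject` helper are invented for illustration; real harvested metadata would come from the archives' OAI interfaces and might use different subject vocabularies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    archive: str      # which institutional archive holds the paper
    title: str
    subjects: tuple   # subject tags carried in the record's metadata


def select_by_subject(records, tag):
    """Return the records carrying the given subject tag (case-insensitive)."""
    wanted = tag.lower()
    return [r for r in records if wanted in (s.lower() for s in r.subjects)]


# Hypothetical records harvested from three different institutional archives.
harvested = [
    Record("university-a", "Paper one", ("organic agriculture", "soil science")),
    Record("university-b", "Paper two", ("physics",)),
    Record("institute-c", "Paper three", ("Organic Agriculture",)),
]

# A "virtual" field collection: the papers stay where they are deposited.
virtual_collection = select_by_subject(harvested, "organic agriculture")
for record in virtual_collection:
    print(record.archive, "-", record.title)
```

The selection is made over the harvested metadata alone, so the papers themselves never have to be moved into a common archive.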

But if it is more likely that these researchers will self-archive if they
have the new tag as the banner, and a dedicated archive as the locus,
more power to them!

> I have myself initiated an archive in research in
> organic agriculture (http://orgprints.org), which we hope will become a
> centre for international communication and cooperation in this area.
> Scientific papers from research in organic agriculture are published in
> many different specialized disciplinary journals as well as in general
> scientific journals and journals focused at organic agriculture, and it
> is not easy for researchers to keep track of all that is being
> published.

As noted, a unique field-descriptor tag would unify all this distributed
work as surely as a dedicated archive would, but if there really is a
greater incentive to self-archive for the sake of the new subfield than
for the sake of the impact of the research of each researcher and his
institution, then this will prove to be an interesting historical fact for
those who write the history of the slow and belated rise of open-access,
as optimal and inevitable as it might have been!

> I know the same thing can in principle be done with OAI-compliant
> university archives and a "disciplinary hub" or "research area hub", and
> in ten years time, we may not be able to tell the difference. But today,
> it is still not quite the same thing.

I note that Organic Eprints http://orgprints.org/ with 581 records has
over twice as many records as the average eprints.org archive (25,151
known records to date divided by 106 known archives = 237 records on
average) most of them institutional (though there are some much bigger
university archives, such as Lund's http://eprints.lub.lu.se/ with
2143 records!). But alas both that number and its competitors a

Re: Central vs. Distributed Archives

2003-09-08 Thread Eberhard R. Hilf
dear Colleagues,
the physics ArXiv has a linear increase of the number of papers put in per
month, this gives a quadratic acceleration of the total content (growth
rate of Data base), not linear.
Total amount by now may be at 10-15 % of all papers in physics.
Linear growth of input rate means the number of physicists and fields
using it rises, while in each field (and physicist) a saturation is
reached after a first exponential individual rise.

Never there will be a saturation such that all papers will go this way,
since in different fields culture and habits and requirements are
different. --
[That is why it is e.g. best, to keep letter distribution by
horses at a remote island (Juist) alive since the medieval times].
Ebs


.
Eberhard R. Hilf, Dr. Prof.;
CEO (Geschaeftsfuehrer)
Institute for Science Networking Oldenburg GmbH
an der Carl von Ossietzky Universitaet
Ammerlaender Heerstr.121; D-26129 Oldenburg
ISN-home: http://www.isn-oldenburg.de/
homepage: http://isn-oldenburg.de/~hilf
email   : h...@isn-oldenburg.de
tel : +49-441-798-2884
fax : +49-441-798-5851

On Mon, 8 Sep 2003, Hugo Fjelsted Alrøe wrote:

> Stevan Harnad wrote:
> > Those are all OAI-compliant archives, and they include both central,
> > discipline-based archives and distributed institutional archives. With
> > OAI-interoperability, it doesn't matter which kind of OAI archive a
> > paper is in, but I am promoting university archives
> > http://www.eprints.org/self-faq/#institution-facilitate-filling
> > http://www.eprints.org/
> > rather than central ones (even though I founded a central one myself
> > http://cogprints.ecs.soton.ac.uk/ ) because researchers'
> > institutions (and
> > their research funders) all share in the joint
> > publish-or-perish interests
> > (and rewards) of maximizing the impact of their research
> > output. Central
> > repositories and disciplines do not. (They are the common locus for
> > research that is competing for impact.) Hence research institutions
> > (and their funders) are in a position to encourage,
> > facilitate, and even
> > mandate (through an extension of the publish-or-perish
> > carrot-and-stick)
> > open-access self-archiving of their own research output in
> > their own OAI
> > archive by their researchers, whereas disciplines and central
> > organizations (e.g., WTO, WHO, UNESCO) are not:
> > http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
> > http://www.ariadne.ac.uk/issue35/harnad/
>
> I think it is still too early to write off any of the possible paths to
> open access within the field of self-archiving (not that you do that). I
> see a potentially very fruitful role for community-building archives
> that focus on certain research areas. These could be facilitated or
> mandated by some of the specialized public research institutions that,
> together with universities and private companies, inhabit the research
> landscape. I think of research institutions oriented towards applied
> research within for instance environmental research, agriculture, public
> health, education, community development, etc. Here, there is a clear
> two-sided research communication: towards the public and towards other
> researchers in the field. Open access thus serves two communicative
> purposes, improving scholarly communication and improving public access
> to research results, besides the complementary purpose of institutional
> self-promotion.
>
> By "community-building", I mean that such archives can contribute to the
> creation or development of the identity of a scholarly community in
> research areas that go across the established disciplinary matrix of the
> university world. I have myself initiated an archive for research in
> organic agriculture (http://orgprints.org), which we hope will become a
> center for international communication and cooperation in this area.
> Scientific papers from research in organic agriculture are published in
> many different specialized disciplinary journals as well as in general
> scientific journals and journals focused on organic agriculture, and it
> is not easy for researchers to keep track of all that is being
> published.
>
> I know the same thing can in principle be done with OAI-compliant
> university archives and a "disciplinary hub" or "research area hub", and
> in ten years' time, we may not be able to tell the difference. But today,
> it is still not quite the same thing. Contributing to the community
> would be detached from the usage of what is there, since the depositing
> of papers would take place somewhere outside the hub. This makes it
> dependent on the widespread existence of university archives. So if one
> wants to establish such an open-archive-based scholarly community hub,
> the way to do it is to make an eprint archive with the scope that one
> wants.
>
> > Having said that, it is still a historical fact that the first and
> > still-biggest o

Re: Central vs. Distributed Archives

2003-09-08 Thread Hugo Fjelsted Alrøe
Stevan Harnad wrote:
> Those are all OAI-compliant archives, and they include both central,
> discipline-based archives and distributed institutional archives. With
> OAI-interoperability, it doesn't matter which kind of OAI archive a
> paper is in, but I am promoting university archives
> http://www.eprints.org/self-faq/#institution-facilitate-filling
> http://www.eprints.org/
> rather than central ones (even though I founded a central one myself
> http://cogprints.ecs.soton.ac.uk/ ) because researchers'
> institutions (and
> their research funders) all share in the joint
> publish-or-perish interests
> (and rewards) of maximizing the impact of their research
> output. Central
> repositories and disciplines do not. (They are the common locus for
> research that is competing for impact.) Hence research institutions
> (and their funders) are in a position to encourage,
> facilitate, and even
> mandate (through an extension of the publish-or-perish
> carrot-and-stick)
> open-access self-archiving of their own research output in
> their own OAI
> archive by their researchers, whereas disciplines and central
> organizations (e.g., WTO, WHO, UNESCO) are not:
> http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
> http://www.ariadne.ac.uk/issue35/harnad/

I think it is still too early to write off any of the possible paths to
open access within the field of self-archiving (not that you do that). I
see a potentially very fruitful role for community-building archives
that focus on certain research areas. These could be facilitated or
mandated by some of the specialized public research institutions that,
together with universities and private companies, inhabit the research
landscape. I think of research institutions oriented towards applied
research within for instance environmental research, agriculture, public
health, education, community development, etc. Here, there is a clear
two-sided research communication: towards the public and towards other
researchers in the field. Open access thus serves two communicative
purposes, improving scholarly communication and improving public access
to research results, besides the complementary purpose of institutional
self-promotion.

By "community-building", I mean that such archives can contribute to the
creation or development of the identity of a scholarly community in
research areas that go across the established disciplinary matrix of the
university world. I have myself initiated an archive for research in
organic agriculture (http://orgprints.org), which we hope will become a
center for international communication and cooperation in this area.
Scientific papers from research in organic agriculture are published in
many different specialized disciplinary journals as well as in general
scientific journals and journals focused on organic agriculture, and it
is not easy for researchers to keep track of all that is being
published.

I know the same thing can in principle be done with OAI-compliant
university archives and a "disciplinary hub" or "research area hub", and
in ten years' time, we may not be able to tell the difference. But today,
it is still not quite the same thing. Contributing to the community
would be detached from the usage of what is there, since the depositing
of papers would take place somewhere outside the hub. This makes it
dependent on the widespread existence of university archives. So if one
wants to establish such an open-archive-based scholarly community hub,
the way to do it is to make an eprint archive with the scope that one
wants.

> Having said that, it is still a historical fact that the first and
> still-biggest open-access OAI archive is a central,
> discipline-based one,
> the Physics Archive founded in 1991 http://arxiv.org/. But
> Arxiv's growth
> rate has been steadily linear since 1991, and shows no sign of either
> accelerating or generalizing to all the other disciplines. So clearly
> something else was needed to hasten the open-access era, and my own
> hunch is that a concerted policy of university-based archiving was what
> was needed.
> http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt

What's wrong with linear growth? It must be the SIZE of the growth rate
that is important. And how long it will take to realize some satisfying
level of open access with this growth rate. When you are looking for
exponential growth, I take it that you are looking for something that
MIGHT turn out to have a higher maximum growth rate than, for instance,
arXiv. And that is all well, but it might be exponential and still have
a slower maximum growth than the linear growth we see in arXiv.
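
To make this point concrete, here is a minimal sketch in Python. All
numbers are hypothetical and chosen only for illustration; they are not
arXiv's figures. It assumes the "exponential" archive follows a logistic
curve (near-exponential at first, then saturating), whose steepest growth
is ceiling * r / 4, and compares that with a constant linear rate.

```python
import math


def linear_size(rate, t):
    """Archive size after t years when `rate` papers are added each year."""
    return rate * t


def logistic_size(ceiling, r, midpoint, t):
    """Archive size under logistic growth: near-exponential early, then saturating."""
    return ceiling / (1.0 + math.exp(-r * (t - midpoint)))


def logistic_max_rate(ceiling, r):
    """Steepest growth of the logistic curve (papers per year), reached at its midpoint."""
    return ceiling * r / 4.0


# Hypothetical parameters, for illustration only.
LINEAR_RATE = 3000   # papers per year, constant
CEILING = 20000      # saturation level of the logistic archive
R = 0.5              # early exponential growth constant (per year)
MIDPOINT = 10        # year of steepest logistic growth

for year in (2, 6, 10, 14):
    print(year,
          linear_size(LINEAR_RATE, year),
          round(logistic_size(CEILING, R, MIDPOINT, year)))

print("maximum logistic rate:", logistic_max_rate(CEILING, R))
print("constant linear rate: ", LINEAR_RATE)
```

With these made-up parameters the curve that starts out exponentially
never grows as fast as the linear one, which is the possibility raised
above. Other parameter choices would of course reverse the comparison.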

In the presentation that you refer to above, you write:
"At that rate, it would still take a decade before we reach the first
year that all physics papers for that year are openly accessible."

I think that this is an impressive and very satisfying growth. And I
don't think that a decade is too long - the great news is that physics
is ge

Re: Central vs. Distributed Archives

2003-09-03 Thread Stevan Harnad
On Wed, 3 Sep 2003, [identity deleted] wrote:
>
> Dear Mr. Harnad,
>
> I am also one of these stressed diploma-writers -- but very curious and
> enthusiastic. My subject is "the future of institutional
> archives".  I would be very pleased, if you could answer my questions:
>
> 1) Do you know anything about "non university archives", such as
> NonGovernmentOrganisations (i.e., WTO, WHO, UNESCO). Do these kinds of
> repositories already exist?

There are countless digital archives. You have to specify what *content*
you have in mind. This Forum (soon to be re-named the American Scientist
Open-Access Forum) is concerned *only* with scientific and scholarly
*research*, before and after peer-review (preprints and postprints).

Assuming that that is the content you are inquiring about, I suggest
that you have a look at the archives listed by the Open Archives
Initiative:
http://oaisrv.nsdl.cornell.edu/Register/BrowseSites.pl
as well as those indexed by
http://oaister.umdl.umich.edu/o/oaister/viewcolls.html

Those are all OAI-compliant archives, and they include both central,
discipline-based archives and distributed institutional archives. With
OAI-interoperability, it doesn't matter which kind of OAI archive a
paper is in, but I am promoting university archives
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.eprints.org/
rather than central ones (even though I founded a central one myself
http://cogprints.ecs.soton.ac.uk/ ) because researchers' institutions (and
their research funders) all share in the joint publish-or-perish interests
(and rewards) of maximizing the impact of their research output. Central
repositories and disciplines do not. (They are the common locus for
research that is competing for impact.) Hence research institutions
(and their funders) are in a position to encourage, facilitate, and even
mandate (through an extension of the publish-or-perish carrot-and-stick)
open-access self-archiving of their own research output in their own OAI
archive by their researchers, whereas disciplines and central
organizations (e.g., WTO, WHO, UNESCO) are not:
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ariadne.ac.uk/issue35/harnad/

Having said that, it is still a historical fact that the first and
still-biggest open-access OAI archive is a central, discipline-based one,
the Physics Archive founded in 1991 http://arxiv.org/. But Arxiv's growth
rate has been steadily linear since 1991, and shows no sign of either
accelerating or generalizing to all the other disciplines. So clearly
something else was needed to hasten the open-access era, and my own
hunch is that a concerted policy of university-based archiving was what
was needed.
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt

> 2) I read about the Ingenta-Southampton cooperation concerning
> eprints-software in 2002. What has happened so far? Is there a result yet?

It's still there on paper, but Ingenta has not yet made any move to
implement or promote it. The idea had been that the Ingenta option
would be for those universities that did not want to be bothered with
maintaining their own OAI archives, and preferred to outsource it to
Ingenta. This is still a good idea, but the ball is in Ingenta's court;
Southampton has plenty to do already, with optimizing and maintaining
the GNU eprints.org archive-creating software it provides free to
universities, with creating tools for measuring and demonstrating the
impact of open-access research (to help induce researchers and their
institutions to self-archive) http://citebase.eprints.org/cgi-bin/search
and with trying to shape national and international self-archiving policy.

Other archive-creating software packages have since appeared too,
but what is needed now is not more software, but more self-archiving,
and a clear, focused rationale, agenda and policy for it.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2670.html

> 3) Is there any other "serious" method of preservation except OAIS?

Serious method of preservation for *what*? As noted, the Physics Arxiv,
which is OAI-compliant but not OAIS
http://www.rlg.ac.uk/longterm/oais.html is alive and well, and has been
since 1991. But the first, second and third objective of open-access
self-archiving is *access*, right now. The main preservation burden
for all the physics journal articles that are self-archived in Arxiv as
preprints and postprints is not on Arxiv but on each physics journal
publisher's primary corpus.

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2676.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2678.html

Please do not conflate the problem of open-access -- which is a
*supplement* to publishing in journals, not a *substitute* for it --
with the problem of digital preservation of journal content -- which
is a problem for journals, not for authors' institutional OAI archives.
And, in the same breath, don't conflate institutional OAI archives whose
pur

Re: Central vs. Distributed Archives

2003-08-05 Thread Stevan Harnad
[This is the reply to a query about the founding of a new disciplinary
open-access archive]

Congratulations on [archive name deleted]. Such central archives for
self-archiving are very useful and welcome. What will accelerate
self-archiving and open access still further, though, is institutions
self-archiving their own refereed research publications, in all
disciplines. (There don't exist central archives in all disciplines,
and central archives do not share joint interests [publish-or-perish,
citation-impact] with institutional researchers in the way their own
institutional archives do.)

What your archive will need, if its content is to grow, is some sort of
disciplinary or national policy of self-archiving in [country deleted].
It will not fill of its own accord. (CogPrints does not either.)
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ariadne.ac.uk/issue35/harnad/

> We would like to archive documents of [other-country] authors too, but we
> think copyright might be a problem. All authors [from our country] archiving
> their documents in [archive-name] sign an agreement. This agreement
> basically guarantees that no publisher's copyright is harmed by
> publishing this document in [archive-name]. In return we give the authors
> the right to delete the document from [archive-name]
> whenever they want to. We think this might be an obstacle for archiving
> documents of authors [from other countries]: Our legal advisor told us that
> it is not simply possible to translate our agreement because American
> Copyright differs in very many ways from [our country's] copyright.
> Do you know about any agreements used by American Open Access Archives?
> Perhaps we might use the agreements in order to open [archive-name]
> to American Authors too.

Although you may wish to look at Project Romeo
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/
and http://www.eprints.org/self-faq/#10.Copyright
my (layman's) advice would be that what you describe above is
fine. Authors indicate that the article is theirs and they are entitled to
self-archive it; they may remove it if they wish (but are not encouraged to);
and if the copyright agreement turns out not to allow self-archiving,
and the copyright-holder notifies you, then the archive itself will
remove it.

I might add that one of the problems (though it is a minor one) of
central archives, instead of the author's own institutional archives,
is that some publishers might (just might) be more inclined to request
removal from 3rd-party archives (i.e., neither the publisher's own nor
the author's own institutional archive), construing them (dubiously)
as 3rd-party publishers: This does not apply to the author's own
institutional archive, of course:
http://www.ecs.soton.ac.uk/~harnad/Temp/rcoptable.gif

I also suggest you always refer to self-archiving as self-archiving,
not (as you do in your statement above) as "publishing": The article is
published in the journal, and then that publication is self-archived
by its author in order to maximize its research impact by making
it open-access. It is clear that when the author self-archives his
own publication in his own institutional research archive he is not
*publishing* it: It is *already* published.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2259.html
Because of the nature of the web (and of open access), exactly the
same is true if he self-archives it in a central archive (as long as it
is not a publisher, re-publishing or re-selling the article), but the
author's own institutional archive is on more obviously firm ground there
(although in reality the difference is trivial, and will assuredly come
to be recognized as such once the air clears).

Stevan Harnad

NOTE: A complete archive of the ongoing discussion of providing open
access to the peer-reviewed research literature online is available at
the American Scientist September Forum (98 & 99 & 00 & 01 & 02 & 03):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
or
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html

Discussion can be posted to: american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2003-04-16 Thread Stevan Harnad
Subject Threads:
 http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1583.html
 http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html

> From: [identity removed]
>
> What I wish to emphasize... is the big difference between posting
> one's production on line in one's personal site, and sending it to an
> international server such as ArXiv...

Yes, you are quite right that there is this difference. See:

"Open Letter to Philip Campbell, Editor, Nature"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2601.html

in which this point is explicitly discussed. Let me point out that
this point (about central-disciplinary versus distributed-institutional
self-archiving) is one of the three reasons I switched my own support
several years ago from central, discipline-based archiving (back) to local,
institution-based archiving (where I had started:
http://www.arl.org/sc/subversive/ ).

My three reasons for switching back were:

(1) OAI-interoperability has made central and distributed self-archiving
interoperable, hence jointly harvestable, searchable and navigable,
hence equivalent.

(2) Researchers and their institutions share a common interest in
maximizing their (shared) research impact (and its rewards), whereas
researchers and their disciplines do not. Institutions are hence in a
position to use "publish or perish" carrots and sticks to encourage
institutional self-archiving. Disciplines cannot (although of course any
disciplinary "culture" of self-archiving can be equally directed toward
central or institutional self-archiving). Hence institutional
self-archiving, once it catches on, can grow far faster than
disciplinary self-archiving.
http://www.ecs.soton.ac.uk/~harnad/Temp/archpolnew.html
http://www.ecs.soton.ac.uk/~harnad/Temp/Ariadne-RAE.htm

(3) Institutional self-archiving is truly *self*-archiving -- by the
author, of his own institutional research output, in his own
institution's research archive. And it is restricted *only* to the
output from researchers of that institution, made openly accessible
purely to maximize its impact. It is hence in a position to benefit from
the growing number of progressive self-archiving policies on the part of
publishers:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm

In contrast, a central, 3rd-party archive runs the risk of falling under
the (understandable) efforts of the publisher not to let *other*
publishers re-publish the work to which the original publisher has added
the value. (Of course, in the online and interoperable age this is moot
for give-away open-access research, because if something is openly
accessible to one and all on the web, it makes no difference whatsoever
whether it is openly accessible from this website or from that
website! But central, 3rd-party archives are a psychological deterrent
because being 3rd-party rather than "self," as the author's institution
is, makes them -- in principle, but so far of course never in practise
-- open to publishers' claims of 3rd-party copyright-infringement by
a rival publisher. The author himself (and hence his own institution)
is immune to this, and hence can be the beneficiary of the retention of
the *self-archiving* right where a 3rd-party, central archive is not.)

Anyway, since all OAI archives are interoperable and equivalent, I see
no reason at a time when self-archiving is still growing much too
slowly (compared to what would so easily be possible) to retard its
growth in any unnecessary way: Focussing on central discipline-based
archives and self-archiving is no longer necessary. Distributed
institution-based archives and self-archiving achieve the exact same end,
with at least one fewer obstacle (and at least one more incentive).

> Yes, as you say, most publishers allow authors to do the first thing
> [institution-based but not central self-archiving]: the APS, for instance,
> changed its copyright transfer form a few years ago to make this perfectly
> legal. I think that EPS did the same. But sending a document to a more
> general server such as ArXiv is another matter, and this is not permitted
> - at least for the moment (APS does not allow it for instance).

APS does not (yet) allow their *PDF files* to be
self-archived in ArXiv, but it does allow the final, revised
text to be self-archived. So this problem is trivial.
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0749.html
http://forms.aps.org/author/copytrnsfr.pdf

What is less trivial (because it is *perceived* by authors as a
deterrent) is publishers' expressed opposition to 3rd-party (i.e.,
central) "self"-archiving. The simple and obvious solution is distributed
institutional self-archiving, linked by the glue of OAI.
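
As a rough illustration of what "linked by the glue of OAI" means in
practice, the following Python sketch builds an OAI-PMH ListRecords
request and merges records from one central and one institutional archive
into a single list. The base URL, archive names, identifiers and titles
are made-up placeholders (example.org); only the request parameters
(verb=ListRecords, metadataPrefix=oai_dc) and the XML namespaces come from
the OAI-PMH protocol. The responses are canned strings, so nothing is
fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request URL a harvester sends to any OAI-compliant archive."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def sample_response(identifier, title):
    """Return a tiny canned ListRecords response (placeholder data, not real)."""
    return f"""
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>{identifier}</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>{title}</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_records(xml_text, archive_name):
    """Extract identifier and title from each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        records.append({
            "archive": archive_name,
            "identifier": record.findtext(f"{OAI_NS}header/{OAI_NS}identifier"),
            "title": record.findtext(f".//{DC_NS}title"),
        })
    return records


# The harvester treats both kinds of archive identically.
harvest = (
    parse_records(sample_response("oai:central.example.org:1",
                                  "Paper in a central archive"), "central")
    + parse_records(sample_response("oai:univ.example.org:7",
                                    "Paper in an institutional archive"), "institutional")
)

print(list_records_url("http://archive.example.org/oai"))
for rec in harvest:
    print(rec["archive"], rec["identifier"], rec["title"])
```

The point of the sketch is only that the same request and the same parsing
code serve both archives, so a search service built on the merged list
need not care where a paper was deposited.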

> Most private websites are not permanent; experience shows that they are
> often not updated, not stable, and that their URLs sometimes disappear
> after a few years. This is, by the way, why we need centralized structure
> to ensure long 

Re: Central vs. Distributed Archives

2003-02-24 Thread Stevan Harnad
On Mon, 24 Feb 2003, Hugo Fjelsted Alrøe wrote:

> [Thread: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html]
> 
> I have noticed that you lately recommend exclusively institutional eprint
> archives and not (inter)disciplinary archives. 
>
> Why is that? What are the reasons for not recommending disciplinary
> archives? As you well know, the most successful archive we have seen
> (arxiv.org) is disciplinary, and there are a few others on the way. 

Both institutional self-archiving and central self-archiving are
welcome and valuable contributions to open-access. Moreover, because of
OAI-compliance, they are all interoperable. So the short answer is that
it makes no difference. But there is a bit more:

Strategically, several years ago, I could see no reason why large
central archives like the Physics ArXiv should not subsume all of the
literature, in all disciplines. But gradually two problems became
apparent, along with their solutions:

Problem 1: ArXiv itself, though the biggest, is still growing too
slowly, even in Physics: It is growing linearly, which means it will
still be another decade before we arrive at a year when *all* of that
year's physics publications are self-archived.
http://arxiv.org/show_monthly_submissions

Problem 2: The central-archiving of ArXiv was generalizing even more
slowly to other disciplines: CogPrints (at 5+ years), another central
archive,  still only has about 1500 papers, compared to ArXiv's (at 11+
years) 200,000.
http://cogprints.ecs.soton.ac.uk/
http://www.earlham.edu/~peters/fos/timeline.htm

Solution 1: The Open Archives Initiative in 1999 provided an
interoperability protocol that effectively made all compliant archives
equivalent, whether they were central or institutional.
http://www.openarchive.org

Solution 2: What is needed to accelerate self-archiving is an *incentive*,
and it is clear that that incentive is something that is shared by a
researcher and his own institution, not a researcher and his discipline
or a central archive.
http://software.eprints.org/#ep2

The purpose of self-archiving is to maximize the visibility,
accessibility, usage and impact of one's research. In a word, to
maximize research impact. The benefits of research impact are shared by
researchers and their institutions. It is one of the main factors in
determining salaries, promotion, tenure, research-funding, prizes and
prestige. These are all shared interests for researchers and their
institutions. They are behind the "publish or perish" injunction. This
means that the institution is not only a natural ally in self-archiving,
but it can even be the provider of the carrot and the stick, as an
extension of exactly the same considerations as those underlying
publish-or-perish: Maximize research impact.
http://www.ecs.soton.ac.uk/~harnad/Temp/self-archiving.ppt
http://www.ecs.soton.ac.uk/~harnad/Temp/unto-others.doc

It is for this reason that I think institutional self-archiving holds
greater promise for propelling open-access to critical mass than central
archiving -- or, as the effect is additive, I should really say: than
central archiving alone.

> If I am to guess, you might be thinking that authors can be pressured to
> place their papers in institutional archives by making it a condition in
> their employment contracts, or something similar. This pressure can also be
> applied in at least some kinds of disciplinary archives (such as
> http://orgprints.org), by way of making the condition in the research grant.
> And the motivation is straight forward: what the public pays for should be
> made publicly available.

I agree. And both of these pressures are welcome. But the institutional
self-archiving solution is more general, and pan-disciplinary. It is
easier to create and fill institutional archives (using local carrots and
sticks) than to create a central archive for each discipline and get all
researchers to fill it. Institutional self-archiving also benefits from
a wider institutional interest in making institutional digital output
and holdings (not just refereed research) openly accessible (though I
confess that this double mandate has been a 2-edged sword, also causing
confusion about what the target contents of institutional archives
should be, and thereby slowing rather than hastening the self-archiving
of refereed research output).

I would say that when an institution has adopted a policy of mandatory
self-archiving for all its researchers, it is easier and more general
to also provide the local archives to do it in, rather than to rely
on their being spawned and sustained by some external central entity
for each discipline. The policy is then also a uniform, self-contained and
self-sufficient one, whereas "self-archive somewhere" would have
been too vague and would not fit most disciplines yet (rather the way
"publish in an open-access journal" would be a premature injunction in
most disciplines and specialties today).

Last, there is a link between self-archiv

Re: Central vs. Distributed Archives

2003-02-24 Thread Hugo Fjelsted Alrøe
  [Thread: http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0293.html]

Dear Stevan

Just a question of clarification.

I have noticed that you lately recommend exclusively institutional eprint
archives and not (inter)disciplinary archives.

Why is that? What are the reasons for not recommending disciplinary
archives? As you well know, the most successful archive we have seen
(arxiv.org) is disciplinary, and there are a few others on the way.

If I am to guess, you might be thinking that authors can be pressured to
place their papers in institutional archives by making it a condition in
their employment contracts, or something similar. This pressure can also be
applied in at least some kinds of disciplinary archives (such as
http://orgprints.org), by way of making the condition in the research grant.
And the motivation is straight forward: what the public pays for should be
made publicly available.

One possible benefit of (inter)disciplinary archives is that they can better
support a kind of 'community feeling' (which a journal can also sometimes
offer), and that this community feeling can help improve research
communication.


kind regards
Hugo Alroe


> -Original message-
> From: Stevan Harnad [mailto:har...@ecs.soton.ac.uk]
> Sent: 19 February 2003 16:32
> To: american-scientist-open-access-fo...@listserver.sigmaxi.org
> Subject: Re: STM Talk: Open Access by Peaceful Evolution
>
>
> What researchers can and should do right now for OA is to self-archive
> their own refereed research output ("Self-Archive Unto Others As Ye
> Would Have Them Self-Archive Unto You") in their own institutional
> Eprint Archives, rather than to keep scolding publishers for not doing
> it for them -- *especially* as publishers (e.g., Elsevier) are
> now coming round to recognizing their own responsible role in all
> this, by formally supporting author/institution self-archiving:
>

>
> Stevan Harnad


Re: Central vs. Distributed Archives

2001-11-19 Thread Eberhard R. Hilf
Dear Stevan,
thanks a lot for your summary of the topic up to now.
I agree with what you say. All paths lead to the same destination.
Indeed, we work along all three lines: encouraging authors, encouraging
institutions to set up self-archiving (with or without our help or
gateway), and promoting central archives.
I have now seen your img files.
Ebs


Re: Central vs. Distributed Archives

2001-11-18 Thread Stevan Harnad
The current topic thread begins with:

Central vs. Distributed Archives
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0950.html

See also the earlier thread:

Central vs. Distributed Archives
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0294.html

On Sat, 17 Nov 2001, Eberhard R. Hilf wrote:

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1655.html
> eh> Steve said the only way is using OAi-compliance by the author to
> eh> self-archive his documents before and through refereeing.
> eh> 
> eh> The word "only" is too much of a load.
> eh> 
> eh> In Physics (and Mathematics) since a long time authors can self-archive
> eh> their documents, without having to install any software or learn about
> eh> OAi. They are automatically included into the OAi scheme by the
> eh> OAi compliant service providers by using PhysDoc (or Math-Net) as gateways
> eh> who take care of their document being included.

My comrade-at-arms Ebs Hilf has misinterpreted the sense of my "only."

He is of course quite right that central, discipline-based
self-archiving (in OAI-compliant Eprints Archives) is likewise an
effective and welcome form of self-archiving. However, as I wrote in
the very next posting:

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1654.html
> sh> The Physics Archive [http://arxiv.org], for example, has over 150,000
> sh> articles, but cumulated across 10 years! At that rate, even for this
> sh> most advanced of all the self-archiving disciplines, the year 2011 will
> sh> be the first in which ALL the articles published in physics that
> sh> year will be accessible for free for all:
> sh> 
> sh> http://www.ecs.soton.ac.uk/~harnad/Tp/Digitometrics/img001.htm
> sh> 
> sh> http://www.ecs.soton.ac.uk/~harnad/Tp/Digitometrics/img002.htm
> sh> 
> sh> This is why institution-based self-archiving now needs to be vigorously
> sh> supported and promoted to fast-forward us all to the optimal and
> sh> inevitable for research and researchers.

It was with this fact in mind that I had written the earlier "only"
passage:

http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1653.html
> sh> The only sure way to free access to the entire refereed research
> sh> literature online, right now, is for researchers themselves to take the
> sh> initiative and self-archive it (in their own institutions' OAI-compliant
> sh> Eprint Archives: http://www.arl.org/sparc/pubs/enews/aug01.html#6 )

The force of the "only" was coupled with the sense of the "right now"!

A researcher in any particular discipline today (other than Physics,
Mathematics, or Cognitive Sciences) cannot take the initiative and
self-archive his refereed research in a central archive for his discipline,
because such central archives do not yet exist for most disciplines! Nor,
were they to exist, are they filling anywhere near fast enough (see the
2 Digitometrics links above).
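
A rough sketch of the arithmetic behind the 2011 projection quoted above. It assumes, purely for illustration, that the yearly deposit rate has grown linearly from zero, so that 150,000 cumulative articles over 10 years imply a current rate of about 30,000 per year. The annual output figure of 60,000 articles is a hypothetical placeholder, not a number given in the posting; it is chosen only to show how a "ten more years" estimate can fall out of linear growth.

```python
def years_until_full_coverage(total_deposits, years_elapsed, annual_output):
    """Years from now until the yearly deposit rate reaches annual_output.

    Assumption (not from the posting): the deposit rate is rate(t) = slope * t,
    so cumulative deposits after t years are slope * t**2 / 2.
    """
    slope = 2 * total_deposits / years_elapsed ** 2   # articles per year, per year
    current_rate = slope * years_elapsed              # articles per year today
    if current_rate >= annual_output:
        return 0.0
    return (annual_output - current_rate) / slope


# 150,000 articles over 10 years comes from the posting;
# 60,000 articles per year is a hypothetical annual output.
remaining = years_until_full_coverage(150_000, 10, 60_000)
print(f"Years until a full year's output is deposited: {remaining:.1f}")
```

Under a different growth assumption (a constant deposit rate, say) the same totals would never reach full coverage, which is why the shape of the growth curve matters more than the cumulative count.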

Researchers' individual (and thereby collective) leverage (and rewards
for publication and impact) operates largely at the level of their own
institutions. Researchers need not install any software themselves, nor
learn anything about OAi. They need only encourage their own
universities to do so, out of shared self-interest in research
visibility, uptake and impact:

7. What you can do now to free the refereed literature online
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#7

"Online or Invisible?" (Steve Lawrence)
http://www.neci.nec.com/~lawrence/papers/online-nature01/

By way of OAI-interoperable central Eprint Archives, physicists and
mathematicians today have http://arxiv.org and Ebs's PhysDoc (or Math-Net)
http://physnet.uni-oldenburg.de/PhysNet/physdoc.html and Cognitive
Scientists have http://cogprints.soton.ac.uk/

But for all the other disciplines, the fastest and surest path today is
to have their own institutions install their own OAI-compliant
Institutional Eprint Archives (using the free http://www.eprints.org
software) as a growing number of universities and research institutions
are now doing:

Institute of Education, University of London, London, England
University Library System, University of Pittsburgh http://philsci-archive.pitt.edu
Centre pour la Communication Scientifique Directe http://eprinttheses.in2p3.fr
Media Studies, University of Ulster, Coleraine, Northern Ireland
Formations Media Studies Archive http://formations2.ulst.ac.uk/
California Institute of Technology http://caltechcstr.library.caltech.edu/
Instituto Brasileiro de Informacao em Ciencia e Tecnologia http://www.sbg.ibict.br
Institut Jacques Monod, Paris
Department of Philosophy, University of Vienna http://eprints.philo.at
University of Southampton, Southampton, UK http://demoprints.eprints.org/
RIACS, NASA Ames, Moffett Field CA http://horus.riacs.edu
University of Nottingham, Nottingham http://www-db.library.nottingham.ac.uk/ep1
University of Rochester Libraries http://128.151.45.180/
Sissa Multimedia Database http://mmdb.sissa.it/
University of Californi

Re: Central vs. Distributed Archives

2001-02-04 Thread David Goodman
As a relative outsider, I do not consider myself
qualified  to decide definitively which approach is best.

But even were I actively working on this, I would not consider it
appropriate to decide on the basis of argumentation. We on this list are
all either scientists, or information workers serving scientists.
We should know that the way to determine this is to try the different
approaches, and determine by observation and experiment. The problem is
large enough to accommodate several different large scale trials.

I suggest that for anyone working actively in this field, the best answer
if anyone criticizes part of one's concept, or proposes another, is to say
"go right ahead and try, but please keep it interoperable, and let us all
know your results."

And I would say the same to alternate schemes of reviewing, or any of the
issues that confront us. If anyone really has a need to argue, there are
enough people who do not yet see the necessity of changing the status quo
to serve for opponents.

Dr. David Goodman, Princeton University Biology
Library  dgood...@princeton.edu  609-258-3235


Re: Central vs. Distributed Archives

2001-02-03 Thread Stevan Harnad
Greg, I honestly don't know what the substantive issue is that you are
disagreeing with me about. We are both for freeing the research
literature. We are both for self-archiving. We are both for
interoperability. We both agree that the Physics arXiv was the first to
show the way. We both agree that it would be good if the pace of
self-archiving were accelerated. We both agree that it would be good if
self-archiving spread to all disciplines.

So what is at issue here? That I have suggested that distributed
OAI-compliant self-archiving may help accelerate and spread
self-archiving whereas you think it won't? Well let's just wait and
see. You seem to have some reason for wanting to nip distributed
self-archiving in the bud, a reason that I can't fathom. Could it be
because it is "competing" with arXiv in mathematics? Who cares?
Self-archiving is self-archiving, and free is free.

As for interoperability, the reason I stress it is that that is what
will make the locus-differences between the individual archives
irrelevant. It will all be harvested into global virtual archives, and
those, not the individual archives, will be the locus classicus for the
research literature.
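
A minimal sketch of why the archive of deposit stops mattering once everything is harvested into one place. It is not any real service's implementation: the `Record` type, the identifiers, and the archive names are all hypothetical, and the harvesting step itself is assumed to have already produced plain lists of records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    identifier: str  # hypothetical globally unique identifier
    title: str
    archive: str     # where the paper was deposited


def build_virtual_archive(harvests):
    """Merge per-archive record lists into one index keyed by identifier."""
    index = {}
    for records in harvests:
        for rec in records:
            # The first copy seen wins; later copies of the same identifier are skipped.
            index.setdefault(rec.identifier, rec)
    return index


def search(index, term):
    """Return matching titles, regardless of which archive holds the paper."""
    term = term.lower()
    return sorted(r.title for r in index.values() if term in r.title.lower())


central = [Record("oai:central:1", "Quantum gravity notes", "central")]
institutional = [
    Record("oai:inst:7", "Gravity and cognition", "institutional"),
    Record("oai:central:1", "Quantum gravity notes", "institutional"),
]
index = build_virtual_archive([central, institutional])
print(search(index, "gravity"))
```

A reader querying the merged index sees one list of papers; the `archive` field is carried along but plays no part in the search.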

On Sat, 3 Feb 2001, Greg Kuperberg wrote:

> You don't just recommend institution-based archives, you hype them as
> superior to discipline-based archives.  You describe them as a "powerful
> and natural complement" that you hope will "broaden and accelerate the
> self-archiving literature".  I think you should add, more clearly than
> you have, that that part is only your opinion, and not that of the
> physicists and others who have "shown the way".

Greg, it seems to me "hope" is already at least as subjective and
hypothetical a descriptor as "opinion." Nor does "hope" equal "hype."
Nor do I say anything about "superior." I simply state the facts (and
hopes). The facts are that it started in Physics, in the form of
centralized self-archiving; but this is only growing linearly and not
generalizing across disciplines. Enter OAI-interoperability and the
possibility of complementing central self-archiving with distributed
self-archiving.

Why, one wonders, would any disinterested party (or rather, one with
an interest solely in freeing the literature, not in characterizing one
form of self-archiving as "superior") fail to welcome a complementary
form of archiving, rather than trying to dismiss it as hype and
opinion, or as contrary to the opinion of physicists?

"The freeing of their present and future refereed research from all
access- and impact-barriers forever is now entirely in the hands of
researchers. Posterity is looking over our shoulders, and will not
judge us flatteringly if we continue to delay the optimal and
inevitable needlessly, now that it is clearly within our reach.
Physicists have already shown the way, but at their current
self-archiving rate, even they will take another decade to free the
entire Physics literature
(http://www.ecs.soton.ac.uk/~harnad/Tp/Tim/sld002.htm) -- with
the Cognitive Sciences (http://cogprints.soton.ac.uk) 39 times
slower still, and most of the remaining disciplines not even
started: http://www.ecs.soton.ac.uk/~harnad/Tp/Tim/sld004.htm

"This is why it is hoped that (with the help of the eprints.org
institutional archive-creating software) distributed,
institution-based self-archiving, as a powerful and natural
complement to central, discipline-based self-archiving, will now
broaden and accelerate the self-archiving initiative, putting us
all over the top at last, with the entire distributed corpus
integrated by the glue of interoperability
(http://www.openarchives.org)."

> sh> Perhaps I should have said interoperable OAI-compliant archives.
> sh> And if they exist, that's splendid. I hope there will be many more.
>
> This sounds like the Western leftists who insisted that China and the
> Soviet Union didn't practice true Communism.  If it is utterly irrelevant
> that many of the mathematical archives are interoperable and DC-compliant,
> why will making them interoperable and OAI-compliant make all the
> difference?  Granted, the OAI group may have made a better standard
> than the Dublin Core.  It's still insane to dismiss one as paganism and
> embrace the other as gospel.

Greg, I don't care! One of the purposes of interoperability is to make
sure it can all be harvested into global virtual archives like ARC
http://arc.cs.odu.edu/ thereby making the individual archive locus
irrelevant (and "empowering" distributed archiving). If DC-compliance
is enough to vouchsafe that, that's fine with me! Let 1000 flowers
bloom! *You* (not the Western leftists) are the one who seems to have
some sort of animus against these other archives!

And I think we are beginning to repeat ourselves (again). We have bet
on our respective horses. Can we now wait and see how they do in the
self-archiving sweepstakes? (I have the advantage tha

Re: Central vs. Distributed Archives

2001-02-03 Thread Stevan Harnad
On Sat, 3 Feb 2001, Greg Kuperberg wrote:

> if I submit a paper to the arXiv...
> that is them archiving my papers, not me archiving my own.

Sorry, Greg, I don't find these details useful. This is terminological
niggling. (As long as we're at it, I prefer the word "depositing" to an
archive, because I "submit" to a journal.)

> The arXiv has a technical staff, admittedly small, and you could fairly
> call the staff members archivists.  The authors are not archivists.

And authors are not publishers either. Yet it is quite common to say
"I've published that paper."

What was needed was a term to describe the act of depositing a paper
into a free on-line archive for yourself, rather than relying on
someone else (e.g., a publisher) to do it for you. Self-archiving
describes that quite transparently.

(If I had to vote on it, I'd say most of the work of archiving itself
was being done by the software and the hardware, not the staff. But the
supporting staff are certainly essential, as they are even for personal
web-pages...)

> in your paper you do still imply that the arXiv is an example of
> "self-archiving".

And so it is. Authors can self-archive in centralized OAI-compliant
archives like arXiv or distributed institutional OAI-compliant archives
like the ones being set up using eprints.org software.

> Anyway, my *main* comment last time is that you don't even mention these
> points of disagreement in your article.  Your article has the bias that
> if people agree with you on the ends, it doesn't matter if they agree
> with you on the means.

Well it seems to me that in my article (1) I recommend self-archiving to
free the refereed research literature, and (2) I recommend self-archiving
in distributed institutional OAI-compliant Archives to complement
self-archiving in centralized OAI-compliant Archives.

Now in recommending this, what exactly do you think I should add? That
there are some people who think it's not worth complementing the former
with the latter? that they think we should just carry on with the
former as if there were no new possibilities for broadening and
accelerating the growth of self-archiving?

Why would I want to say that? Why would anyone want to say that?

> > On-Line archives (apart from the Physics arXiv) are all but non-existent.
>
> That's not true at all.  In mathematics alone the AMS has a list of 60+
> department-based and research-institute-based archives,

Perhaps I should have said interoperable OAI-compliant archives. And if
they exist, that's splendid. I hope there will be many more.

> Maybe a dozen of these independent archives are bigger, as measured by
> new submissions per month, than your CogPrints archive.  The biggest one,
> mp_arc, gets 30 new papers a month.  If you put them all together they
> are comparable in size to the math arXiv.

Good. Let them go OAI-compliant (perhaps by installing eprints.org
software!) and they will be making a valuable contribution to freeing
the refereed research literature (assuming they are not just for
unrefereed preprints!).

> But they're not growing as quickly as the math arXiv

So what?

> > I have no idea why you mention politics.
>
> Because deciding who gets to maintain the archives is political.
> People get service credit for it and they don't want to give that up.

Pity. Especially if it ever engenders a conflict of interest (as it has
done in journal publishing) between what's in the best interest of
research and researchers (maximizing free access) and what's in the
interests of "archivists."

> Some of the Europeans don't trust projects that they perceive as American.
> In mathematics, the numerous institution-based archives tend to satisfy
> administrators more and readers less.  They are useful, but they grow
> less quickly than the arXiv because they are less useful.  They aren't
> by any means the arXiv's savior.

Make 'em all OAI-compliant and it will no longer make a bit of
difference...


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
             Computer Science     fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00 & 01):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2001-02-03 Thread Greg Kuperberg
On Sat, Feb 03, 2001 at 07:10:10PM +, Stevan Harnad wrote:
> Well it seems to me that in my article (1) I recommend self-archiving to
> free the refereed research literature, and (2) I recommend self-archiving
> in distributed institutional OAI-compliant Archives to complement
> self-archiving in centralized OAI-compliant Archives.
>
> Now in recommending this, what exactly do you think I should add?

You don't just recommend institution-based archives, you hype them as
superior to discipline-based archives.  You describe them as a "powerful
and natural complement" that you hope will "broaden and accelerate the
self-archiving literature".  I think you should add, more clearly than
you have, that that part is only your opinion, and not that of the
physicists and others who have "shown the way".

> > > On-Line archives (apart from the Physics arXiv) are all but non-existent.
> >
> > That's not true at all.  In mathematics alone the AMS has a list of 60+
> > department-based and research-institute-based archives,
>
> Perhaps I should have said interoperable OAI-compliant archives. And if
> they exist, that's splendid. I hope there will be many more.

This sounds like the Western leftists who insisted that China and the
Soviet Union didn't practice true Communism.  If it is utterly irrelevant
that many of the mathematical archives are interoperable and DC-compliant,
why will making them interoperable and OAI-compliant make all the
difference?  Granted, the OAI group may have made a better standard
than the Dublin Core.  It's still insane to dismiss one as paganism and
embrace the other as gospel.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2001-02-03 Thread Stevan Harnad
On Fri, 2 Feb 2001, Greg Kuperberg wrote:

> On Sun, Dec 31, 2000 at 09:57:50PM +, Stevan Harnad wrote:
> >
> >   http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm
> >
> >   "Physicists have already shown the way, but at their current
> >   self-archiving rate, even they will take another decade to free the
> >   entire Physics literature"
>
> Of course you are entitled to your opinion that institution-based open
> archiving (sorry, I won't call it "self-archiving") is the bugle call
> of the revolution.

Terminology is terminology, but calling one's own archiving of one's own
papers "self-archiving" sure sounds like calling a spade a spade...

Besides, the Open Archives Initiative (OAI http://www.openarchives.org)
has informed me in no uncertain terms that I should
NOT characterize self-archiving as open-archiving or vice versa. The
OAI is a much broader initiative than the self-archiving initiative.

OAI is dedicated to providing shared interoperability standards for the
entire on-line digital literature, whether self-archived or not,
whether for-free or for-fee, whether journal, book or other, whether
full-text or not, whether centralized or distributed.

It is true that the OAI was originally proposed as the "UPS" (Universal
Preprint Service), which was indeed a form of self-archiving (though a
limited form, focussing on the unrefereed preprint rather than on both
the unrefereed preprint and the refereed postprint, as self-archiving
does). But "UPS" was quickly dropped and the OAI has since vastly
outgrown those limited original objectives.

> In my opinion, institution-based archives are,
>
> o in physics, all but superseded by the arXiv,

On-Line archives (apart from the Physics arXiv) are all but non-existent.

The hope is that institution-based, distributed self-archiving (perhaps
with the newfound help of the http://www.eprints.org archive-creating
software) will now remedy this.

And, as I said above, even in Physics, self-archiving is still growing
too slowly to free the Physics literature in less than a decade. It
seems to me that the central self-archiving model, admirable and
welcome though it is, can use all the help it can get.

> o in mathematics, a politically appealing distraction, and

I have no idea why you mention politics. The only "appeal" is to
researchers, that they should free their refereed research from their
obsolete access- and impact-barriers by self-archiving it, now. I have
no "political" preference for their doing it the central way or the
distributed way: We should all just go ahead and DO it!

I used to lean towards central self-archiving myself, seeing no reason
why it should not all be subsumed under arXiv; but that just isn't
happening, and the clock is ticking; so it's time to add more powerful
and general means of self-archiving.

Besides, the whole point of OAI-compliance and interoperability is that
it should no longer MATTER which way you self-archive: centrally or
institutionally. It's all harvestable into the same global virtual
archive anyway, thanks to the OAI protocol.
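
For readers who have not seen the protocol, a small sketch of what harvesting one archive looks like. It uses the request form and XML namespaces of the later OAI-PMH 2.0 specification, which may differ in detail from the protocol version current when this was posted. The base URL and the sample response are made up for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core elements carried in oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for an archive's OAI base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles_from_response(xml_text):
    """Extract every dc:title from a ListRecords response document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(DC_NS + "title")]


# Hand-written sample response, trimmed to the parts this sketch reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords><record><metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An example eprint</dc:title>
    </oai_dc:dc>
  </metadata></record></ListRecords>
</OAI-PMH>"""

print(list_records_url("http://archive.example.org/oai"))
print(titles_from_response(SAMPLE))
```

The same two steps apply unchanged to a central archive and to an institutional one, which is the sense in which the choice between them stops mattering to the harvester.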

Unless one's "political" objective becomes, publisher-like, to protect
one's own proprietary (centralized?) turf instead of to free the
research literature...

> o in computer science and economics, the inadequate status quo.

I have no idea what you mean by the above.

> As I said before, I know that NCSTRL and RePEc, which are the efforts
> in computer science and economics to make institutional archives
> interoperable, are important major projects.  I don't mean to slight
> them.  But they are not a panacea and they do not match the arXiv.

Nobody is trying to "match" anything. We are trying to free the research
literature, as quickly and as effectively as possible.

> Computer science has a second important project, ResearchIndex/CiteSeer,
> which has some good features that the arXiv does not.  But (a) it doesn't
> match the arXiv either, (b) it relies on search engine intelligence and
> not bureaucratic standards, and (c) an arXiv search facility could be
> made as intelligent as CiteSeer.

I really can't follow any of this, and I have no idea who you think is
competing with whom for what:

ResearchIndex/CiteSeer is a wonderful tool, harvesting and
citation-linking papers on the Web, whether in OAI-compliant archives
or not. As the OAI-compliant corpus grows (with the growth of central
and distributed self-archiving), ResearchIndex/CiteSeer's harvest will
grow, and surely we all welcome that!

I don't know what you have in mind with "bureaucratic standards," but you
need not sell me on search-engine intelligence: I love it already.

Moreover, as the OAI-compliant corpus grows, it will spawn still
further and more powerful Open Archive Service Providers (e.g., OpCit
http://opcit.eprints.org and ARC http://arc.cs.odu.edu/).

But the main goal now is to do whatever can be done to make that corpus
grow into the full refereed literature in all disciplines as soon as
possible. This 

Re: Central vs. Distributed Archives

2001-02-03 Thread Greg Kuperberg
On Sat, Feb 03, 2001 at 10:28:19AM +, Stevan Harnad wrote:
> Terminology is terminology, but calling one's own archiving of one's own
> papers "self-archiving" sure sounds like calling a spade a spade...

In my opinion, if I submit a paper to the arXiv or to a hypothetical UC
Davis archive, that is them archiving my papers, not me archiving my own.
The arXiv has a technical staff, admittedly small, and you could fairly
call the staff members archivists.  The authors are not archivists.

> Besides, the Open Archives Initiative (OAI http://www.openarchives.org)
> has informed me in no uncertain terms that I should
> NOT characterize self-archiving as open-archiving or vice versa.

I suspect that that's because you don't take into account considerations
that they consider important.  In any case in your paper you do
still imply that the arXiv is an example of "self-archiving".

Anyway, my *main* comment last time is that you don't even mention these
points of disagreement in your article.  Your article has the bias that
if people agree with you on the ends, it doesn't matter if they agree
with you on the means.

> On-Line archives (apart from the Physics arXiv) are all but non-existent.

That's not true at all.  In mathematics alone the AMS has a list of 60+
department-based and research-institute-based archives,

http://www.ams.org/global-preprints/dept-server.html

and 16 subdiscipline-based archives,

http://www.ams.org/global-preprints/special-server.html

Maybe a dozen of these independent archives are bigger, as measured by
new submissions per month, than your CogPrints archive.  The biggest one,
mp_arc, gets 30 new papers a month.  If you put them all together they
are comparable in size to the math arXiv.

But they're not growing as quickly as the math arXiv, not even those
in Germany that enjoy an interoperable metadata standard and a common
search engine called MPRESS, http://mathnet.preprints.org .  MPRESS even
includes everything in the math arXiv.  MPRESS can be useful, but it is
not the panacea that you seem to expect it to be.

> > o in mathematics, a politically appealing distraction, and
> I have no idea why you mention politics.

Because deciding who gets to maintain the archives is political.
People get service credit for it and they don't want to give that up.
Some of the Europeans don't trust projects that they perceive as American.
In mathematics, the numerous institution-based archives tend to satisfy
administrators more and readers less.  They are useful, but they grow
less quickly than the arXiv because they are less useful.  They aren't
by any means the arXiv's savior.

> Besides, the whole point of OAI-compliance and interoperability is that
> it should no longer MATTER which way you self-archive: centrally or
> institutionally. It's all harvestable into the same global virtual
> archive anyway, thanks to the OAI protocol.

There lies MPRESS, the global virtual archive in mathematics,
and it still does matter.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Thomas Krichel
  Greg Kuperberg writes

> But I disagree entirely with the claim that distributed
> interoperability has never been tried before.  It has been tried several
> times, whole-heartedly with these two projects:
>
> MPRESS - mathnet.preprints.org
> NCSTRL - ncstrl.org
>
> And it has been a factor in many other projects, including Hypatia
> and the AMS preprint server.  Some of these projects are more
> successful than others, but *all* of them suffer from inconstancy
> of the underlying archives.

  The largest project that has been done with a distributed
  interoperability is RePEc. RePEc catalogs 11 items now.
  While there is the occasional case that an archive may become
  obsolete, from about 140 archives, I think 5 have been made obsolete,
  i.e. have been moved to a place outside the original archive
  maintainer's control. Thus while it is a problem, it is not a major one.
  It is by far outweighed by other advantages, such as distributed costs,
  minimum quality control, and wide community participation.

  Cheers,

  Thomas Krichel http://openlib.org/home/krichel
 RePEc:per:1965-06-05:thomas_krichel

  2000-10-05 to 2001-01-06:
  Institute for Economic Research / Hitotsubashi University
  2-1 Naka / Kunitachi / Tokyo 186-8603 / Japan / +81(0)42 580 8349
  tho...@micro.ier.hit-u.ac.jp


Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Thu, 9 Nov 2000, Greg Kuperberg wrote:

> Entirely aside from whether your proposals are the best ones, you have
> previously described them as being nothing other than the "Ginsparg
> model".  Well I think of myself as devoted to the Ginsparg model,
> but my interpretation of it is significantly different from the one
> that you give here.  In 1997 my thinking was much more like yours,
> but three years of direct experience with the arXiv has changed it.

I used to think our models were the same (with one putting more
emphasis on central archiving and the other on distributed archiving).
I assumed that the ends were the same, and the difference between the
means relatively trivial: whatever gets the research literature up
there, online and free, and preferably yesterday, is welcome.

But there has always been a disagreement on the subject of peer review
(pre- vs. post-prints, if you like). My "invisible hand" hypothesis was
formulated in response to the notion that the meaning of preprint
self-archiving was that peer-review and journals would go, and
"preprint archives" would take their place. Not even an epsilon of a shift
in this direction has occurred, even in Physics, with the growth of
self-archiving. Nor, by my lights, would it be at all a good thing if
it did. See:

Harnad, S. (1998/2000) The invisible hand of peer review. Nature
[online] (5 Nov. 1998)
http://helix.nature.com/webmatters/invisible/invisible.html
Longer version in Exploit Interactive 5 (2000):
http://www.exploit-lib.org/issue5/peer-review/
http://www.ecs.soton.ac.uk/~harnad/nature2.html

But I was always convinced (and still am) that this difference of
opinion about the present and future causal role of peer review was
irrelevant, because authors were self-archiving both pre-peer-review
preprints and post-peer-review postprints all along anyway, and
virtually all the papers in arXiv eventually end up published in
peer-reviewed journals (or conference-proceedings).

So when I launched CogPrints, I explicitly announced that it was intended
for eventual subsumption under arXiv (then under its prior name).

But that was all predicated on continuing accelerated growth in
self-archiving. Now, there has been growth, but it has not been nearly
fast enough, either in arXiv (where it is still linear, and will not
capture the entire Physics literature for another decade at this rate)
or in CogPrints, where it is not even linear.

So, with the Open Archive Initiative, and the new prospect of
interoperability between distributed OAI-compliant Eprint Archives, I
returned to my 1994 "subversive proposal" as a way to help speed things
up, and commissioned the upgrading of the CogPrints software into
generic institutional, OAI-compliant Eprint archive-creating software:
http://www.eprints.org

Harnad, S. (1995) Universal FTP Archives for Esoteric Science and
Scholarship: A Subversive Proposal. In: Ann Okerson & James
O'Donnell (Eds.) Scholarly Journals at the Crossroads; A Subversive
Proposal for Electronic Publishing. Washington, DC., Association of
Research Libraries, June 1995.
http://cogsci.soton.ac.uk/~harnad/subvert.html
ftp://ftp.cogsci.soton.ac.uk/pub/psycoloquy/Subversive.Proposal/
http://www.arl.org/sc/subversive/

> My creed is, build a large, integrated, immortal archive now, and the
> e-prints will come tomorrow.

My creed is to get the eprints up there now (it's already late in the
day, relative to the time it has been within reach), and let the
immortality take care of itself (as it most certainly will do, and
easily).

> I won't insist that this approach is right for your discipline, because
> maybe you know your own community better than I do.  But I do feel
> strongly that it is right for my discipline.

It's not about disciplines (and I'm not just trying to liberate the
cognitive science literature but the refereed literature in all
disciplines). And I don't think sublinear or linear growth is right for
your discipline (maths) either...

> In general your liberation terminology doesn't sit so well with me.

It's a pity, because that is what it's about. Right now, sitting behind
toll-gates, is an author-give-away literature, one that has always been
author-give-away, but was prevented, by Gutenberg-era costs and
constraints, from being given away on the scale authors would have
liked all along, because the only mass-distribution means (on-paper)
had to have its (high) costs met, or else there could be no
distribution at all.

That literature can at last be freed from those unnecessary
access-barriers. If that is not what you are working to achieve,
what are you working to achieve? The journals are virtually all on-line
now, so the medium is not the problem.

Could it again be this sticking point about PREprints? But don't you
see that it amounts to the same thing? Preprints are and always have
been the earlier embryonic stages of postprints, and people who free
their preprin

Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Thu, 9 Nov 2000, David Goodman wrote:

> Steve, I think you misunderstand Greg's concern (and mine). We do not
> disagree with what you want to do; we want to add to it. We are
> assuming, I think, that something similar to the plan you advocate will
> be the basic process.
>
> I do not think it enough to say distributed = secure. It's only the first
> step to security. In addition to being distributed, there also needs to
> be a reliable caretaker--not just to do the housekeeping, but to ensure
> that the archive is kept compatible with changing technology.

I agree completely.

I didn't say distributed = secure (there's a lot more to security than
that). I said being freely accessible now, in distributed institutional
Eprint archives is a powerful new way to complement being freely
accessible in centralized Eprint archives, which are still growing much
too slowly. It should not be delayed for one moment by security
concerns, not one moment.

> I suggested that the archives be organized redundantly both by
> discipline and by university (and possibly by geographic/political
> entity, as well as what anyone wants to do).

Again, complete agreement.

> There are undoubtedly well-organized academic departments that can do
> this. There are also academic departments that cannot be relied on to
> do this right, because of size, interest, or finances. The same goes
> for professional societies. Certainly no individual can be relied on:
> all humans are mortal. All of this goes as well for refereed as for
> unrefereed, preprint as for reprint, officially published as for
> unpublished.

Agreed, and digital librarians are clearly the pertinent experts.

> As a librarian, I do not assume it is good enough that our refereed
> papers are already, as they are, safely in the hands of journals and
> libraries, ...

Yes, but let us not again mix up agendas. There could have been --
independent of any movement to free the refereed literature online -- a
movement to increase the security of the on-paper corpus (both papers
and books) on-line.

That's fine, desirable, but unrelated to this Forum's agenda, which is
to FREE the refereed corpus online. Concerns about strengthening the
paper literature's current security should not be wrapped into the
freeing (now!) initiative for the refereed literature; nor should
freeing (now!) be made in any way conditional on first meeting a priori
security concerns. Although it is an oversimplification, it is best to
treat the freeing initiative as a pure freebie, a windfall, over and
above what we have already. We are talking about archiving, not
publishing, an extra version of what is already published (on-paper).

This face-valid, immediate goal should be kept as distinct from
preservation concerns as it should be kept from peer-review-reform
concerns (likewise worthy, but orthogonal, and indeed even at
cross-purposes if yoked in any way to the freeing initiative).

> There are very few library copies of many journals, and though there is
> excellent backup from national libraries, even their collections are
> incomplete. The literature published up to now will be much more secure
> when it too has been digitized and placed on free publicly available
> mirrored servers, with all the additional precautions. Besides
> security, this will also make them generally available with all the
> additional advantages of plans such as yours.

David, the securing issue is a separate one from the freeing! The
material on the shelves now is not free; nor is it, let us agree, as
secure as it might be. Increasing its security by distributed digital
back-up is one thing (and need not be freely accessible either);
freeing it online is quite another.

Please, please keep these two separate or you will only encourage more
Zeno's Paralysis!


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
 Computer Science                 fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2000-11-09 Thread Tim Brody
> > Greg:
> > As a rule, it is better for web sites to share the same archive than
> > to each have fragments. It is better for Oxford and Cambridge to
> > each have all of Shakespeare's plays than for Oxford to have only the
> > comedies and Cambridge to have only the tragedies. That is why I favor
> > shared interoperability, which is in some ways centralized, to fragmented
> > interoperability, which is optimistically called decentralized. Massive
> > redundancy is one of the few strengths of the existing paper-based system;
>
> Stevan:
> I am not an expert on digital storage, coding or preservation, but I am
> not at all sure that Greg is technically right above (and I'm certain
> that the Oxford/Cambridge hard-copy analogy is fallacious). I would
> like to hear from specialists in localized vs. distributed digital
> coding, redundancy, etc. -- bearing in mind that in the case of the

If I may separate the political issues from the technical.

Political:

There is a fear that a decentralised system will result in no overall
"responsibility" for archive continuity. But, equally, a centralised
body can decide that a system is no longer useful or is too expensive
to be free - what happens if XXX goes pay-per-view? What rights do
mirrors have to store XXX if they are told to remove their archive?

Technical:

The fear is that there will be only one copy of a paper, stored in an
institution's department or library, and that if that archive is lost
the paper disappears into digital oblivion.

Data storage is very cheap - there is little difference between storing
1 or 100 copies. Oxford and Cambridge could farm all world physics
archives and store their contents. This is not currently done because
Open Archives include pay-per-view archives, where only the abstract
can be farmed - and hence there is no provision for farming of texts.
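
The redundancy argument can be put in rough numbers. The sketch below
assumes that each stored copy is lost independently with the same
probability; both that independence assumption and the figures are
illustrative only and do not come from any real archive.

```python
def loss_probability(p_single: float, copies: int) -> float:
    """Probability that every copy is lost, assuming independent failures.

    p_single is a hypothetical per-copy loss probability over some period;
    copies is the number of separately held copies of the same paper.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


if __name__ == "__main__":
    # Compare a single departmental copy with a few mirrored copies.
    for n in (1, 2, 5):
        print(n, loss_probability(0.1, n))
```

Under these assumptions every extra copy multiplies the risk of total
loss by the per-copy loss probability, which is why cheap storage makes
mirroring attractive. Correlated failures (shared software, shared
funding) would weaken the effect.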

I may also point out that there are already archives that perform
distributed mirroring - math arXiv is primarily made up of papers that
have been archived elsewhere (judging by the lack of associated meta
data and updates).

Tim Brody
Computer Science, University of Southampton
email: tdb...@soton.ac.uk
Web: http://www.ecs.soton.ac.uk/~tdb198/


Re: Central vs. Distributed Archives

2000-11-09 Thread David Goodman
Steve, I think you misunderstand Greg's concern (and mine).
We do not disagree with what you want to do; we want to add to it. We are
assuming, I think, that something similar to the plan you advocate will be
the basic process.

I do not think it enough to say distributed=secure. It's only the first step
to security. In addition to being distributed, there also needs to be a
reliable caretaker--not just to do the housekeeping, but to ensure that the
archive is kept compatible with changing technology.

I suggested that the archives be organized redundantly both by discipline and
by university (and possibly by geographic/political entity, as well as what
anyone wants to do).

There are undoubtedly well-organized academic departments that can do this.
There are also academic departments that cannot be relied on to do this right,
because of size, interest, or finances. The same goes for professional
societies. Certainly no individual can be relied on: all humans are mortal.
All of this goes as well for refereed as for unrefereed, preprint as for
reprint, officially published as for unpublished.

As a librarian, I do not assume it is good enough that
> our refereed papers are already, as they are,
> safely in the hands of journals and libraries, ...

There are very few library copies of many journals, and though there is
excellent backup from national libraries, even their collections are
incomplete. The literature published up to now will be much more secure when
it too has been digitized and placed on free publicly available mirrored
servers, with all the additional precautions. Besides security, this will also
make them generally available with all the additional advantages of plans such
as yours.


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 07:16:47PM +, Stevan Harnad wrote:
> I don't think sublinear or linear growth is right for
> your discipline (maths) either...

Of course more growth is better than less.  Several of us (both the arXiv
staff led by Paul Ginsparg and the math advisory committee chaired by Dave
Morrison, on which I serve) have worked hard to accelerate the growth of
the math arXiv.  I can report a partial victory.  The archives that we
glued together were at best growing linearly with a low slope and were
showing some signs of sublinearity.  After we put them together there was
a discontinuous increase in new submissions, and linear growth commenced
with a higher slope.  I don't have a chart but the numbers are there at

http://front.math.ucdavis.edu/math

After we had changed so much, I was surprised that growth was still
linear.  (Paul Ginsparg wasn't surprised.)  I now believe that linear
growth in e-prints is inherent.  But both the discontinuity and the
one-time change in slope were heartening.  That is a realistic goal when
you change the system.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Wed, 8 Nov 2000, Greg Kuperberg wrote:

> While libraries certainly should help preserve e-prints, I do not trust
> any one library, nor any other sole institution, to archive material
> single-handedly. Any caretaker can lose or destroy a unique copy of
> any document...  That is why it is important to redundantly and
> openly mirror an archive and not just allow third-party searches. The
> arXiv has 18 mirror sites on six continents

Who is disagreeing with this? All requisite redundancy is just as
desirable, and feasible, and inevitable, with institution-based
distributed archiving as with discipline-based archiving.

I think there is an incorrect analogy at the heart of Greg's frequent
use of the term "fragmented" in speaking about the institution-based
approach to self-archiving:

I think Greg continues to equate (1) archiving with publishing, and
(2) institutional digital "collections" with localized "books-on-shelves"
(ripe for a Library-of-Alexandria catastrophe; hence his example of the
lost/destroyed "unique document"). And (3) (unrefereed, unpublished)
PREprints continue to be treated as the "paradigm" for it all, whereas
it is much more informative and representative to see it in terms of
(refereed, published) POSTprints: We are, after all, aiming at freeing
the REFEREED literature -- with the prepublication embryological stages
merely an added bonus, rather than the focus of it all.

So, to summarize: Whilst our refereed papers are already, as they are,
safely in the hands of journals and libraries, blissfully mirrored
(though unblissfully unfree), we need not fret about Alexandria.
Freeing a postprint (sic) via self-archiving (whether central or
institutional, interoperable or not) is a bonus, a plus, a freebie, a way
to make it accessible to those multitudes worldwide who cannot access
it because of the S/L/P firewalls surrounding the safe, Alexandria
versions.

It is inviting Zeno's Paralysis (again) to say: "Keep waiting till you
have an Alexandria-proof centralized, mirrored, redundant arXiv-style
Archive to self-archive them in before you dare to self-archive your
(already safely mirrored) postprints."

Nay! Release them from their hostagehood behind obsolete,
impact-blocking, and completely surmountable access barriers online
today through self-archiving, addict fellow-researchers the world over
to that new, free form of access to it all, and the redundancies and
mirrors will come tomorrow, in plenty of time to keep the freed corpus
aloft in the skies. (And nothing is at risk: the firewalled version
remains as safe -- from catastrophic loss as well as illicit access --
as it ever was.)

If that is now transparent for postprints, it should be equally
transparent that the same applies to preprints: They are destined to
become postprints (hence secure, for the above reasons) anyway. Being
available online early is a bonus; a freebie. Moreover, it is a bonus
that has no prior history of enjoying the safe/secure status of
postprints anyway: access to preprints was always restricted and
evanescent, destined to be superseded by the secure postprint once it
was available.

Now the redundancy and mirroring that will be accorded the freed
postprint corpus, once it is freed, will also be inherited by the
preprint corpus.

So there is nothing to lose, and everything to be gained, by
self-archiving all preprints and postprints now, in either the
centralized OAI-compliant (http://www.openarchives.org) archives like
arXiv (http://arXiv.org), or in institutional OAI-compliant archives,
like Eprints (http://www.eprints.org).

Ignore Cassandras: Preservation problems are eminently soluble, once
the goods are up there: the real problem now is how to get researchers
to put them up there, at long last. Central archives have gone part of
the distance but are proving too slow. Institutional archives are natural
allies in hastening us on the road to the optimal and inevitable.

> As a rule, it is better for web sites to share the same archive than
> to each have fragments. It is better for Oxford and Cambridge to
> each have all of Shakespeare's plays than for Oxford to have only the
> comedies and Cambridge to have only the tragedies. That is why I favor
> shared interoperability, which is in some ways centralized, to fragmented
> interoperability, which is optimistically called decentralized. Massive
> redundancy is one of the few strengths of the existing paper-based system;

I am not an expert on digital storage, coding or preservation, but I am
not at all sure that Greg is technically right above (and I'm certain
that the Oxford/Cambridge hard-copy analogy is fallacious). I would
like to hear from specialists in localized vs. distributed digital
coding, redundancy, etc. -- bearing in mind that in the case of the
refereed literature, this is all moot anyway, because free access now
is infinitely preferable to no access, no matter how short-lived it
risks being. The "locus classicus" is still safely ensc

Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 05:58:14PM +, Tim Brody wrote:
> I may also point out that there are already archives that perform
> distributed mirroring - math arXiv is primarily made up of papers that
> have been archived elsewhere (judging by the lack of associated meta
> data and updates).

I don't understand this comment.  Most of the papers in the math arXiv
are eventually published, and many are in preprint series of one sort
or another.  However I conjecture that at least half of the submissions
in the most recent three months are not on any other web site, not
even on a home page.  And for those that are not published or not yet
published, the arXiv is the only project that explicitly promises to
keep them permanently.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 11:16:11AM +, Stevan Harnad wrote:
> Nay! Release them from their hostagehood behind obsolete,
> impact-blocking, and completely surmountable access barriers online
> today through self-archiving, addict fellow-researchers the world over
> to that new, free form of access to it all, and the redundancies and
> mirrors will come tomorrow, in plenty of time to keep the freed corpus
> aloft in the skies.

Entirely aside from whether your proposals are the best ones, you have
previously described them as being nothing other than the "Ginsparg
model".  Well I think of myself as devoted to the Ginsparg model,
but my interpretation of it is significantly different from the one
that you give here.  In 1997 my thinking was much more like yours,
but three years of direct experience with the arXiv has changed it.
My creed is, build a large, integrated, immortal archive now, and the
e-prints will come tomorrow.  I won't insist that this approach is right
for your discipline, because maybe you know your own community better
than I do.  But I do feel strongly that it is right for my discipline.
And I can't speak for Paul Ginsparg either, but I would be surprised
if he contradicted me outright, since he has influenced my thinking a
great deal through direct correspondence.

In general your liberation terminology doesn't sit so well with me.  I do
hint at liberation terminology from time to time; in fact the name of my
front end, "Front for the Mathematics arXiv", is a deliberate allusion.
If the math arXiv is revolutionary, I would liken it to the American
revolution.  We are building a new system on new territory and letting
immigrants come.  I see a lot of Alexander Hamilton in our approach, and
somewhat less of Thomas Jefferson.  Your comments have some character
of Jefferson, but very little of Hamilton, and often they sound almost
Marxist.  I might compare your overall vision to the Communards of Paris.
But hey, you could be right in your own society.

You have also correctly picked up that I don't accept the dichotomy
between preprints and postprints.  My view is that the preprint
and the postprint are Tweedledum and Tweedledee.  But that is a topic
for another posting.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-08 Thread Greg Kuperberg
On Wed, Nov 08, 2000 at 12:30:39PM -0400, David Goodman wrote:
> Departments are not the place, for exactly the reasons John explains. More
> than one of the academic depts. in more than one major university I have been
> affiliated with has managed to lose unique copies of Ph.D. theses, as well as
> every other possible type of item.

The fact is that most math papers on the web (excluding those in the
arXiv) are on department, and not campus-wide, web servers.  This is
even true of papers that are organized into preprint series.  One of the
dangers of an interoperability approach is to hoist the e-print vision
on such an accidental foundation.  I also agree with John MacColl's
position that libraries are more reliable archivists than departments
in principle.  But I disagree entirely with the claim that distributed
interoperability has never been tried before.  It has been tried several
times, whole-heartedly with these two projects:

MPRESS - mathnet.preprints.org
NCSTRL - ncstrl.org

And it has been a factor in many other projects, including Hypatia
and the AMS preprint server.  Some of these projects are more
successful than others, but *all* of them suffer from inconstancy
of the underlying archives.

While libraries certainly should help preserve e-prints, I do not trust
any one library, nor any other sole institution, to archive material
single-handedly.  Any caretaker can lose or destroy a unique copy of
any document.  (Just last year the Boston Public Library lost thousands
of books in a flood, for example.)  That is why it is important to
redundantly and openly mirror an archive and not just allow third-party
searches.  The arXiv has 18 mirror sites on six continents, listed at:

http://arxiv.org/servers.html

That is not as many copies of the arXiv as I would like to see, although
it is enough full-fledged active mirrors.  More significantly, anyone
who wants to can maintain yet another copy of the arXiv following the
instructions at:

http://front.math.ucdavis.edu/scripted

As a rule, it is better for web sites to share the same archive than
to each have fragments.  It is better for Oxford and Cambridge to
each have all of Shakespeare's plays than for Oxford to have only the
comedies and Cambridge to have only the tragedies.  That is why I favor
shared interoperability, which is in some ways centralized, to fragmented
interoperability, which is optimistically called decentralized.  Massive
redundancy is one of the few strengths of the existing paper-based system;
let's not tear up the road in addition to scrapping the horse carriage.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-08 Thread David Goodman
Departments are not the place, for exactly the reasons John explains. More
than one of the academic depts. in more than one major university I have been
affiliated with has managed to lose unique copies of Ph.D. theses, as well as
every other possible type of item.
I think this is an appropriate role for libraries in two dimensions: each
university library should take the responsibility for all publications by its
faculty and students, AND appropriate major libraries or groups of libraries
could also take the responsibility for specific research areas that are not
being otherwise covered.
If university presses wanted to participate I think most libraries would
welcome the partnership.

The systems are inexpensive enough for redundancy to be affordable, and this
might be one solution to the refereed/nonrefereed controversy. It only
requires adequate cross-archive indexing.

A part of the savings could be used to increase the number of librarians
helping the other members of the university navigate the new system. Most
users need help in navigating the present system (the higher the academic
level the more likely they are to request it, because they know enough to
realize they need it). They will need it all the more during the period of
transition. Nothing in the prior course of human-developed systems gives
reason to suppose they will need it less even after the transition is
complete. (If the AI people think they can compete in this, I encourage them
to keep trying.)

John MacColl wrote:
>
> Greg Kuperberg wrote:
>
> > > So should we mathematicians trust individual math departments to
> > > permanently preserve their e-prints? I don't think so. Our own math
> > > preprint series at UC Davis is an arXiv overlay - all articles are
> > > automatically contributed to the math arXiv. One of my arguments for
> > > this arrangement is that we can't promise to babysit these preprints
> > > forever. We could easily forget our obligation.
>
> Stevan Harnad replied:
> >
> > The Department could easily forget; the institutional library is unlikely
> > to do so. It has a lot of prior practise with stability/permanence! (And
> > it has a good deal to gain from maintaining robust institutional Eprint
> > Archives: The prospects of serials-crisis relief, as other
> > institutional libraries do the same thing, with their own Eprint
> > archives --
>
> I would concur with this response, and would wish to develop a couple of
> points about why libraries are important in the freed literature scenario.
> Interestingly, the notion of  'forgetting' gives a new dimension to the
> notion of libraries as 'memory organisations'. They are no longer simply
> memory organisations in the sense of storing knowledge, as in a memory, but
> particularly as that knowledge becomes networked they are becoming
> organisers of access, for which function their contribution to their parent
> institution is to understand information structures, sources and
> presentations. This requires that they are memory joggers as well as memory
> fillers. That has always been true, but internet publication has increased
> both the complexity of these structures, and the rate of publication. More
> and more the challenge for academic libraries is to preserve the roles of
> hunters and collectors of knowledge in the age of internet publishing: that
> requires that they take a much more active approach to identifying and
> maintaining knowledge than was required in the age of print, when libraries
> had adapted to the culture of publishers, and had settled into a role which
> was primarily passive.
>
> But as Stevan says, interoperability in the world of eprint archives has not
> been tried before (and therefore cannot be criticised as the wrong model).
> More than that, it is at present the only model really capable of surviving
> in the world of internet publishing, and it conforms to the way librarians
> see publishing culture moving, which is why the library profession is so
> concerned with metadata - the key to the knowledge structures which are in
> transition. In the passive model, academics and researchers ordered books
> and journals via the library, and the library sought to ensure that the
> material which arrived in the form of physical product was organised
> optimally. Now, we find academics and researchers creating web sites with
> links to internet sources, and themselves interacting with such sources (as
> they will with open archives) without needing to act via the library. Our
> role as librarians is to keep pace with these changes and evolve new methods
> for providing not only 'permanence and stability', but also description and
> classification to ensure that sources are findable by other researchers,
> students and teachers. So - to take Greg's point about centralisation -
> whether an institution wishes to create an open archive for itself as an
> institution, or whether a single department wishes to do it, is a matter for
> them to 

Re: Central vs. Distributed Archives

2000-11-08 Thread Thomas Bacher
This is what University Presses need to become -- the formatter, keeper, and
distributor (with the university library) of the intellectual goods. If that
were to happen, funded of course by the university, then the university
could avoid paying twice (once to the researcher and again to the publisher)
for intellectual property. The university would also save money in the long
term. I believe it will come to this model within the next five years.

Thomas Bacher, Director, Purdue Press
1207 SCC-E, W. Lafayette, IN 47907-1207
(765)494-2038   Fax: (765)496-2442
www.thepress.purdue.edu

Be at your life-long-learning best. Read from a University Press.


Re: Central vs. Distributed Archives

2000-11-08 Thread John MacColl
Greg Kuperberg wrote:

> > So should we mathematicians trust individual math departments to
> > permanently preserve their e-prints? I don't think so. Our own math
> > preprint series at UC Davis is an arXiv overlay - all articles are
> > automatically contributed to the math arXiv. One of my arguments for
> > this arrangement is that we can't promise to babysit these preprints
> > forever. We could easily forget our obligation.

Stevan Harnad replied:
>
> The Department could easily forget; the institutional library is unlikely
> to do so. It has a lot of prior practise with stability/permanence! (And
> it has a good deal to gain from maintaining robust institutional Eprint
> Archives: The prospects of serials-crisis relief, as other
> institutional libraries do the same thing, with their own Eprint
> archives --

I would concur with this response, and would wish to develop a couple of
points about why libraries are important in the freed literature scenario.
Interestingly, the notion of  'forgetting' gives a new dimension to the
notion of libraries as 'memory organisations'. They are no longer simply
memory organisations in the sense of storing knowledge, as in a memory, but
particularly as that knowledge becomes networked they are becoming
organisers of access, for which function their contribution to their parent
institution is to understand information structures, sources and
presentations. This requires that they are memory joggers as well as memory
fillers. That has always been true, but internet publication has increased
both the complexity of these structures, and the rate of publication. More
and more the challenge for academic libraries is to preserve the roles of
hunters and collectors of knowledge in the age of internet publishing: that
requires that they take a much more active approach to identifying and
maintaining knowledge than was required in the age of print, when libraries
had adapted to the culture of publishers, and had settled into a role which
was primarily passive.

But as Stevan says, interoperability in the world of eprint archives has not
been tried before (and therefore cannot be criticised as the wrong model).
More than that, it is at present the only model really capable of surviving
in the world of internet publishing, and it conforms to the way librarians
see publishing culture moving, which is why the library profession is so
concerned with metadata - the key to the knowledge structures which are in
transition. In the passive model, academics and researchers ordered books
and journals via the library, and the library sought to ensure that the
material which arrived in the form of physical product was organised
optimally. Now, we find academics and researchers creating web sites with
links to internet sources, and themselves interacting with such sources (as
they will with open archives) without needing to act via the library. Our
role as librarians is to keep pace with these changes and evolve new methods
for providing not only 'permanence and stability', but also description and
classification to ensure that sources are findable by other researchers,
students and teachers. So - to take Greg's point about centralisation -
whether an institution wishes to create an open archive for itself as an
institution, or whether a single department wishes to do it, is a matter for
them to decide, but either way it is in their interest to let the library
know that the archive exists, as a knowledge source to which access is
required.

And the reason libraries are so important to the argument for freeing the
research literature is because they spend large sums of their institutions'
cash. A freed research literature will reduce that outlay very considerably.
And quite apart from the benefits that will bring - to the library as well
as to other parts of the institution - it will result in a new 'value for
money' standard for the purchase of research literature, appropriate to what
Stevan calls the 'post-Gutenberg' age, replacing the economically absurd
current situation. What should research literature cost,  now that print has
become merely a (deluxe!) option? The library is by far the best-placed
department of the institution to oversee the transition to that new
standard.

John

-
John MacColl
Sub-Librarian, Online Services   http://www.lib.ed.ac.uk
SELLIC Director                  http://www.sellic.ed.ac.uk
Science & Engineering Library, Learning & Information Centre
University of Edinburgh          Tel: 0131 650 7275
Darwin Library                   Mobile: 07808 170075
The King's Buildings             Fax: 0131 650 6702
Edinburgh EH9 3JU                john.macc...@ed.ac.uk


Re: Central vs. Distributed Archives

2000-11-07 Thread Stevan Harnad
On Mon, 6 Nov 2000, Greg Kuperberg wrote:

> After all, Stevan, suppose that we told you that CogPrints would be better
> off as part of the arXiv and you should surrender your collection and
> your responsibilities.  Would you immediately agree, or would you want
> some time to think about it?

I've already thought about it: CogPrints was originally designed with
subsumption under arXiv (then XXX) in mind. The goal was not to win
fame and fortune as an archivist, but to free the refereed literature,
in all disciplines.

ArXiv had demonstrated the viability of centralized self-archiving in
Physics, and CogPrints was intended to generalize this viability to
other disciplines. Once generality was demonstrated, I could see no
reason why all the disciplinary archives should not just be subsumed by
arXiv: After all (to repeat), the goal was not to promote archives or
archivists, but to free the refereed literature through
self-archiving.

But I had been hedging my bets all along. Apart from advocating
arXiv-style centralized self-archiving, I had also been advocating
distributed self-archiving. In fact, that was the gist of my 1994
"subversive proposal."
http://www.arl.org/sc/subversive/

Now it seems to me that CogPrints, with under 1000 papers after three
years, is still lagging behind arXiv, with 130,000 after 11 years. And
even arXiv is still only growing linearly.

So perhaps the centralized approach could use some help, to get the
growth into the exponential range, across disciplines. Enter the
Eprints software, an OAI-compliant adaptation of the CogPrints
software, free for adoption by all universities, so they can
immediately establish interoperable Eprint Archives for all their
researchers, in all disciplines, to self-archive all their refereed
papers in, now.

With interoperability, it is no longer necessary to worry about which
archive the paper is in, or where; nor about whether the archive is a
centralized disciplinary one or a distributed institutional one. It is
no longer a matter of one archive subsuming another: They are all
seamlessly harvested into a global "virtual" archive, on every
researcher's desktop, and "containing" the entire refereed literature
-- just as, say, the ISI's searchable database contains all the titles
and abstracts across all disciplines, except that the full text will be
there too (and free).
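
The idea of a harvested "virtual" archive can be illustrated with a
minimal sketch. It is not the OAI protocol itself: the record fields, the
archive names and the sample data are all hypothetical, and a real
harvester would fetch metadata over the network rather than from an
in-memory dictionary. The point is only that, once every archive exposes
the same minimal metadata, a search no longer needs to know where a paper
is held.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Minimal shared metadata for one paper (field names are illustrative)."""
    archive: str      # where the full text actually lives
    identifier: str   # identifier local to that archive
    title: str
    author: str


def harvest(archives):
    """Merge the records exposed by every archive into one flat list."""
    merged = []
    for name, records in archives.items():
        for identifier, title, author in records:
            merged.append(Record(name, identifier, title, author))
    return merged


def search(virtual_archive, term):
    """Case-insensitive title search over the merged collection."""
    term = term.lower()
    return [r for r in virtual_archive if term in r.title.lower()]


# Hypothetical sample data: one central and one institutional archive.
ARCHIVES = {
    "central-disciplinary": [("c1", "Categorical perception", "A. Author")],
    "institutional": [("i7", "Perception and action", "B. Author"),
                      ("i8", "Knot invariants", "C. Author")],
}

if __name__ == "__main__":
    for hit in search(harvest(ARCHIVES), "perception"):
        print(hit.archive, hit.identifier, hit.title)
```

The search returns hits from both archives, and the reader never has to
ask which one a given paper came from.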

So the answer is: Sure I'd have been happy to have CogPrints subsumed
by arXiv if that had proved to be the way to get the entire refereed
corpus online and free. But now it looks as if OAI-compliant
distributed Eprint Archiving (including arXiv) will instead be
"subsumed" into the global virtual Eprint Archive.

For that: immediate agreement, with no need for afterthoughts!

> Some might ask, what is there to decide about how to run an archive?
> For example, the arXiv's policy is that DVI is unreliable as an input
> format, although it does offer it as output.  The arXiv requires TeX
> source for new submissions if they are written in TeX.  There are other
> subject-based archives out there that accept *only* DVI as a submission
> format.  The maintainers of these archives feel that TeX source is an
> unreliable input format, and moreover that TeX source is confidential
> for some authors.  It is very difficult to defuse this seemingly minor
> issue, and it is only one of several such issues.

This is a paradigmatic example of Zeno's Paralysis: We sit here fussing
over whether it should all be DVI or TeX source, and most of the
literature is still sitting, waiting, on-paper, and on-disk, unarchived.

The Eprints solution is to accept all formats, as long as at least one
of them is immediately screen-readable: http://www.eprints.org
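
The acceptance rule just stated is simple enough to sketch. This is an
illustration of the rule, not the Eprints code: the function name is made
up, and which formats count as "immediately screen-readable" is an
assumption that each archive would have to settle for itself.

```python
# Assumption: this set is illustrative only; it is not taken from the
# Eprints software, and an archive could choose a different list.
SCREEN_READABLE = {"html", "pdf", "txt"}


def accept_submission(formats):
    """Accept any mix of formats provided at least one is screen-readable."""
    normalized = {f.lower().lstrip(".") for f in formats}
    return bool(normalized & SCREEN_READABLE)


if __name__ == "__main__":
    print(accept_submission(["tex", "pdf"]))  # source plus a readable copy
    print(accept_submission(["tex"]))         # source only
```

Under this rule the format dispute never blocks a deposit: an author may
send any source format, so long as a readable copy comes with it.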

Get the stuff up there, demonstrate the power of self-archiving to free
the refereed literature today, irreversibly addict everyone to it, and
THEN worry about optimizing formats thereafter.

> For institutional preprint series the issues are a little different,
> but they are equally obstructive.  Usually an institutional maintainer
> is less interested in retaining credit, but more concerned, sometimes
> correctly, about following his mandate.  If we suggest to university
> U that they contribute their papers to the arXiv, the maintainer at U
> may say "our faculty gave permission for me to list their papers in our
> preprint series, but not to contribute them to your arXiv."  That can
> lead to yet another bureaucratic thicket.

Moot all of this by just having all universities self-archive their own
stuff in their own interoperable Eprint Archives. Interoperability and
harvesting will take care of the rest.

> Right behind these superficial issues are more significant ones like
> permanence.  The fact is that many institutional and subject-based
> archives do not want the responsibility of permanence.  Some of them
> explicitly repudiate it.  A standards-based virtual archive approach,
> such as OAI, aspires to please every side and sweep all such 

Re: Central vs. Distributed Archives

2000-11-07 Thread Greg Kuperberg
On Tue, Nov 07, 2000 at 03:15:36PM +, Stevan Harnad wrote:
> So the answer is: Sure I'd have been happy to have CogPrints subsumed
> by arXiv if that had proved to be the way to get the entire refereed
> corpus online and free. But now it looks as if OAI-compliant
> distributed Eprint Archiving (including arXiv) will instead be
> "subsumed" into the global virtual Eprint Archive.

I have learned not to claim that the arXiv is the Philosopher's Stone,
much as I would like it to be.  But if you're serious about merging
with the arXiv, let's see how well OAI is doing in a year, as measured
by the number of search queries at multiarchive OAS agents.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-06 Thread Stevan Harnad
On Fri, 3 Nov 2000, Greg Kuperberg wrote:

> It is not really a neutral statement to declare that it no longer
> matters whether a paper is in a central archive or a distributed one.
> Each archive is in a way an entrenched interest.  Each archive maintainer
> has put a lot of work into his or her project, and therefore wouldn't
> want it assimilated into a larger archive without a very good reason.

I am afraid I cannot follow this at all. Are you saying that the
"maintainer" of a free public archive of refereed research has an
interest in NOT having that research "assimilated" into still larger
public archives, if it increases their visibility, accessibility and
impact?

(If there really do exist such "entrenched" archive-maintainer
interests, they begin to resemble the conflict of interest that has
emerged between researchers and journal publishers, when it comes to
access-barriers to their work!)

The maintainers I have in mind are those whose interest is in freeing
this research from needless access/impact barriers, not in adding to
them!

In particular, neither universities who provide distributed
institutional Eprint Archives for self-archiving the refereed research
of their researchers, nor Learned Societies who do so for the sake of
their disciplines, in a centralized archive, have anything to gain from
preventing their respective archive contents from being harvested by
Open Archive Services into still larger "virtual" archives, all
seamlessly interoperable (e.g., http://arc.cs.odu.edu/).

As to justifying access-barriers on the grounds that the archive
maintainer "has put a lot of work into his or her project," the Eprints
software should now make that work so minimal that this dubious
rationale becomes moot anyway: http://www.eprints.org

> This is overconfidence.  The biggest reason that it is overconfidence
> is that it defers the permanence question.  But there are other reasons
> as well.  One is that one of the most useful features of the arXiv
> (and similar services such as CogPrints) is immediate notification of
> new results.

There is no (not-readily-solvable) "permanence question." At this
point, getting the literature on-line and free is the most important
thing to do, now. The collective interests that this will generate in
KEEPING it all on-line and free will ensure that all proper steps are
taken to ensure permanence.

The OAI-compliant archive-creating/maintaining Eprints software has the
same notification service as CogPrints -- indeed, it is a generic
adaptation of the CogPrints software!
http://cogprints.soton.ac.uk

> Another is non-redundancy: the arXiv almost completely
> eliminates the disarray of having many copies of a paper which may
> or may not be different versions.  The OAI standard does not address,
> and perhaps cannot address, either of these important advantages of a
> centralized system.

The OAI-standard has not yet addressed version control (it will) but
the OAI-compliant Eprints Software has. Moreover, version-sorting is
a natural function for an Open Archives Service that harvests all
versions of a paper, and sorts them the way you like (date, archive,
use, etc.). Such a service is a natural one to go hand in hand with
citation-linking (which likewise has to sort versions):
http://opcit.eprints.org
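
A version-sorting service of the kind described could look roughly like
the sketch below. It assumes the hard part is already done, namely that
harvested records carry a shared 'paper' key saying which records are
versions of the same paper; that key, the other field names and the
sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def group_versions(records, sort_key="date"):
    """Group harvested records by paper and order each group's versions.

    `records` is an iterable of dicts with the hypothetical keys
    'paper', 'archive' and 'date'; `sort_key` names the field to sort on.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["paper"]].append(rec)
    return {paper: sorted(versions, key=lambda r: r[sort_key])
            for paper, versions in groups.items()}


# Hypothetical harvest: paper P1 exists in two archives, P2 in one.
HARVESTED = [
    {"paper": "P1", "archive": "institutional", "date": date(2000, 9, 1)},
    {"paper": "P1", "archive": "central", "date": date(2000, 3, 15)},
    {"paper": "P2", "archive": "central", "date": date(2000, 6, 30)},
]

if __name__ == "__main__":
    for paper, versions in group_versions(HARVESTED).items():
        print(paper, [(v["archive"], v["date"].isoformat()) for v in versions])
```

Passing sort_key="archive" instead orders each group by archive name, which
is the "sorts them the way you like" part of the description.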

> interoperability keeps getting reinvented.

The OAI protocol is steadily being optimized (and the OAI-compliant
Archives with it): Is this a bad thing?

> Precedent suggests that if OAI succeeds, it will fade into a
> transparent layer, and that beyond it people will see incompatibility
> at a new level and invent another standard.

This sounds unduly pessimistic (and could be said against any attempt
to create interoperability standards).

> HTTP is already an interoperability standard, originally invented for
> the purpose of distributing research documents.
> And there are already HTTP-based search engines, including CiteSeer,
> which searches only for research papers.  So it's important to explain how
> OAI would go beyond HTTP+CiteSeer.

I suggest that this question be re-directed to the OAI discussion list,
which is concerned with the technical details: u...@vole.lanl.gov
http://vole.lanl.gov/pipermail/ups/


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
 Computer Science                 fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list 

Re: Central vs. Distributed Archives

2000-11-06 Thread Greg Kuperberg
On Mon, Nov 06, 2000 at 05:46:57PM +, Stevan Harnad wrote:
> I am afraid I cannot follow this at all. Are you saying that the
> "maintainer" of a free public archive of refereed research has an
> interest in NOT having that research "assimilated" into still larger
> public archives, if it increases their visibility, accessibility and
> impact?

My position is borne entirely out of practical experience and not
theory, and I am not saying exactly that.  For a subject-based archive
(as opposed to institutional), the maintainer has an interest in retaining
credit for his efforts. He may also have at least a perceived interest in
retaining control over the archival procedures.  If an outside archive is
assimilated into the huge arXiv, certainly it increases the visibility,
accessibility, you-name-it-ability, of the individual papers.  However the
former maintainer's name may well fade into the background.  At best
asking a maintainer to merge with the arXiv is asking him to change his
duties (if he stays on as an arXiv moderator or an overlay maintainer).
At worst it's asking him to retire.  The math advisory committee has had
dozens of negotiations to merge material into the arXiv.  We consider
all such negotiations to be delicate.

After all, Stevan, suppose that we told you that CogPrints would be better
off as part of the arXiv and you should surrender your collection and
your responsibilities.  Would you immediately agree, or would you want
some time to think about it?

Some might ask, what is there to decide about how to run an archive?
For example, the arXiv's policy is that DVI is unreliable as an input
format, although it does offer it as output.  The arXiv requires TeX
source for new submissions if they are written in TeX.  There are other
subject-based archives out there that accept *only* DVI as a submission
format.  The maintainers of these archives feel that TeX source is an
unreliable input format, and moreover that TeX source is confidential
for some authors.  It is very difficult to defuse this seemingly minor
issue, and it is only one of several such issues.

For institutional preprint series the issues are a little different,
but they are equally obstructive.  Usually an institutional maintainer
is less interested in retaining credit, but more concerned, sometimes
correctly, about following his mandate.  If we suggest to university
U that they contribute their papers to the arXiv, the maintainer at U
may say "our faculty gave permission for me to list their papers in our
preprint series, but not to contribute them to your arXiv."  That can
lead to yet another bureaucratic thicket.

Right behind these superficial issues are more significant ones like
permanence.  The fact is that many institutional and subject-based
archives do not want the responsibility of permanence.  Some of them
explicitly repudiate it.  A standards-based virtual archive approach,
such as OAI, aspires to please every side and sweep all such issues under
the rug.  I wonder if this is rushing in where angels fear to tread.

> There is no (not-readily-solvable) "permanence question." At this
> point, getting the literature on-line and free is the most important
> thing to do, now. The collective interests that this will generate in
> KEEPING it all on-line and free will ensure that all proper steps are
> taken to ensure permanence.

Again, experience tells me otherwise.  Thousands of math preprints have
come and gone on the web.  Let me also give you a quote from a help page
of a non-arXiv math archive:

    When your paper is ultimately published we would greatly appreciate
    being informed. At that time we will remove the preprint and leave
    a pointer to the journal in which it was published.

This flatly contradicts your vision of "freeing the literature".  But OAI
itself does not pass judgement on such policies.

> The OAI-compliant archive-creating/maintaining Eprints software has the
> same notification service as CogPrints -- indeed, it is a generic
> adaptation of the CogPrints software!

Yes, but it *only* notifies the subscribers of that one little archive.
The OAI standard leaves OAS agents with no clear notification mechanism,
because there is no guarantee that the agent will be notified in a
timely manner by the foundational archives.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-05 Thread David Goodman
Has anyone actually done a quantitative study, either theoretical or
experimental, on the optimum number and size of eprint archives?

David Goodman, Princeton University Biology Library
dgood...@princeton.edu    609-258-3235


Re: Central vs. Distributed Archives

2000-11-03 Thread Greg Kuperberg
On Fri, Nov 03, 2000 at 08:24:44AM +, Stevan Harnad wrote:
> But why restrict efforts to centralized ones only? The whole point of
> OAI interoperability is that it should no longer make any difference
> whether a refereed paper is archived in a central archive or a
> distributed archive or both! (The only alternative we want to avoid is
> "neither"!)

It is not really a neutral statement to declare that it no longer
matters whether a paper is in a central archive or a distributed one.
Each archive is in a way an entrenched interest.  Each archive maintainer
has put a lot of work into his or her project, and therefore wouldn't
want it assimilated into a larger archive without a very good reason.
So saying that it no longer matters whether it is centralized or
distributed is like saying that it no longer matters whether states
answer to Washington.

This is overconfidence.  The biggest reason that it is overconfidence
is that it defers the permanence question.  But there are other reasons
as well.  One is that one of the most useful features of the arXiv
(and similar services such as CogPrints) is immediate notification of
new results.  Another is non-redundancy: the arXiv almost completely
eliminates the disarray of having many copies of a paper which may
or may not be different versions.  The OAI standard does not address,
and perhaps cannot address, either of these important advantages of a
centralized system.

A more balanced point of view would be to recognize that while a
standards-based distributed system may be much better than anarchy,
it doesn't finish the job.

I also note that interoperability keeps getting reinvented.  Precedent
suggests that if OAI succeeds, it will fade into a transparent layer,
and that beyond it people will see incompatibility at a new level and
invent another standard.  HTTP is already an interoperability standard,
originally invented for the purpose of distributing research documents.
And there are already HTTP-based search engines, including CiteSeer,
which searches only for research papers.  So it's important to explain how
OAI would go beyond HTTP+CiteSeer.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-03 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

> We have had much more success by moving in the opposite direction,
> i.e., by strengthening distributed open archival with a centralized
> foundation.

And continued good success to the math arXiv project!

But why restrict efforts to centralized ones only? The whole point of
OAI interoperability is that it should no longer make any difference
whether a refereed paper is archived in a central archive or a
distributed archive or both! (The only alternative we want to avoid is
"neither"!)

By way of example of how it no longer makes any difference, CogPrints
is a centralized archive for cognitive science -- but it is using
EXACTLY the same OAI-compliant Eprints architecture as has been
developed for distributed, institution-based archiving by
http://www.eprints.org. In fact, the OAI-compliant Eprints software was
DERIVED from the prior centralized CogPrints software!

And institutions are institutions, whether they mount centralized
archives or institutional archives.

And mirroring and harvesting for reliability and permanence are
available to both.

So why keep repeating that centralized archiving helped accelerate math
archiving more quickly than the prior (pre-OAI) distributed archiving?
True, but things didn't stop there. And linear growth is still linear
growth, whereas what we need is exponential growth, across all
disciplines, if we are to reach the optimal and inevitable before we
expire!

So let 1000 flowers bloom, central and distributed. Interoperability
will harvest them all.

> The MPRESS project (http://mathnet.preprints.org/)
> has a lot in common with OAI, and it was started before the universal
> math arXiv.  It has its own metadata standard, "Dublin Core", and it
> has a number of institutional preprint series among its data feeds.
> But it hasn't yet caught on.

Maybe that was because it was going it alone, instead of distributing
its efforts across disciplines, as the Open Archives Initiative is
doing. It's one thing to adopt a standard, quite another to get others
to adopt it too.

(This is why your advocacy of centralized archiving and anti-advocacy
of distributed archiving is divisive and counterproductive: We should
be supporting every effort that gets all the refereed literature up
there, online, accessible, searchable, navigable, and free for all.
Centralized archiving has not managed this alone, so let it now benefit
from the help of Distributed Archiving!)

> It doesn't seem to make much difference to
> authors whether a preprint series is indexed by MPRESS or not.

I don't understand this point. It may be another symptom of the
conflation between publishing and archiving, and between preprints and
postprints: What authors are choosing when they PUBLISH a paper, is a
journal, i.e., a quality-certifier with a known level of quality, a
trusted "brand." What authors are choosing when they ARCHIVE their
eprint -- whether the journal-certified, refereed POSTprint or the
unrefereed PREprint -- is a means of making their paper maximally
visible and accessible online, for free for all. OAI-interoperability
provides that, provided the metadata-protocol is shared by all
archives, irrespective of whether they are centralized or
institutional.

MPRESS apparently did not become such a universal (we might even call it
"distributed") standard. Perhaps this was in part because it did not
initially adopt OAI's strategy of minimalism: Pick the minimal
functional metadata set, to maximize the ease of compliance, rather than
going all the way to Dublin Core from the outset. (OAI is inching
towards Dublin Core too, but thanks to minimalism and proselytising
across disciplines, it may manage to bring everyone else along with
it.)

> Part of
> the trouble with MPRESS is that not all of its sources are providing
> as good metadata as they promised.  Ironically the lion's share of good
> metadata in MPRESS comes from the math arXiv.
>
> I would like to know where OAI thinks that MPRESS went wrong.  In fact
> since I maintain a "service provider" for the math arXiv, I looked into
> using OA-compliant metadata instead of the ad hoc metadata that I get from
> the arXiv.  I discovered that the OA standard is an oversimplification
> of the full arXiv metadata record, to the point that I can't use the
> OA format.

I will have to leave this to OAI experts to reply to.

> But don't get me wrong.  I am in favor of fragmented interoperability if
> you really can't hope for something better.  And as I said, the overall
> STM literature might well have to be fragmented, for now, down to the
> level of individual disciplines (e.g. chemistry) or small groups of
> disciplines (physics+math+cs).

"Fragmented interoperability" is a tautology: The whole point of
interoperability is shared metadata standards unifying distributed
("fragmented") systems.

As to "hopes": The only pertinent hope is the freeing of the entire
refereed literature

Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 10:08:09PM +, Steve Hitchcock wrote:
> NCSTRL was effectively the model for OAi. Greg Kuperberg suggests that
> NCSTRL has not been successful.

I don't want to disparage a project as big and difficult as NCSTRL.
It has had some success.  It's important.  But I don't think that it's
nearly as successful as the arXiv.  I guess I said something stronger
before, that NCSTRL is not as heavily read as the math arXiv, which
is much smaller than the whole arXiv system.  Well possibly I'm wrong
on that.  But I note that the math arXiv is just as heavily read on a
per-paper basis as the larger parent arXiv system.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Steve Hitchcock

At 21:29 02/11/00 +, Stevan Harnad wrote:

> Obviously I'm not a conservative offering rationales for inaction.
> And my worry is not "a priori". NCSTRL and MPRESS are two long-standing
> attempts at standards-based fragmented interoperability. Neither one
> has as much readership as the younger, fully integrated math arXiv.

They pre-dated OAI and Eprints. Have just a bit more patience; but be
prepared to set aside prior prejudices or you will obstruct precisely
what we both want to facilitate!


NCSTRL was effectively the model for OAi. Greg Kuperberg suggests that
NCSTRL has not been successful. It would be useful to have some meaningful
measure of whether NCSTRL has been successful or not, and to hear the views
of the NCSTRL developers (who are also involved in OAi). Maybe real
evidence will yield clues to the ultimate destiny of OAi - central or
distributed.

The Harnad-Kuperberg dialogue has been fascinating but, to my mind, hasn't
resolved the issue conclusively. It will be critical to understand what the
user wants.

Steve


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Fri, 3 Nov 2000, Stuart A Yeates wrote:

> So if I hear you correctly OAI will have no traffic with technical reports or
> technical report servers? these _are_ vanity press.

Incorrect. Eprints Archives are for both unrefereed preprints and
refereed postprints, suitably tagged as such.

Stevan Harnad


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
I like Greg Kuperberg's postings, even though we disagree. Greg too is an
advocate of freeing the literature through author self-archiving, but he
prefers centralized archives, whereas I think both centralized and
distributed archiving are welcome and should be encouraged, as both can
hasten the freeing of the refereed literature.

Centralized archiving has been with us for over 10 years, and at its
current rates it will take 10 more years to free the Physics literature
alone, where it is most advanced. In Greg's own field of mathematics,
it might be going even more slowly. It looks to me as if centralized
self-archiving can now use the help of distributed institutional
self-archiving.

By way of counterevidence, Greg cites the fact that in mathematics
institutional self-archiving predated centralized self-archiving
and was unreliable. It was centralized self-archiving that accelerated
and stabilized the process.

What Greg seems to overlook is that the institutional self-archiving he
describes PRE-DATED the Open Archives Initiative (OAI), with its
interoperability. Hence the question of whether or not distributed
self-archiving in OAI-compliant Institutional Eprint Archives will
accelerate the freeing of the literature has not yet been tested.

Greg also seems to conflate, at some junctures, the self-archiving of
unrefereed preprints with the self-archiving of refereed postprints,
as if self-archiving were in some sense a rival to or substitute for
refereed publication (which I certainly do not think it is);
self-archiving is merely a way to free the refereed literature.

On Thu, 2 Nov 2000, Greg Kuperberg wrote:

> In 1997, the year before the universal math arXiv was started, there
> were already some 10 or 20 thousand research papers freely available on
> the web. Most of them were on personal home pages, but thousands were
> in institutional and subject-based preprint series.

This is irrelevant, as noted above. These archives were not
OAI-compliant and hence could not be integrated or navigated in a
useful way.

> Nonetheless the vast majority of these papers were still eventually
> sold as published papers.

This too is irrelevant. The initiative to free the refereed literature
is a PRO-RESEARCHER and PRO-RESEARCH initiative, not an anti-publisher
initiative (nor even particularly a "pro-library" initiative):

The goal is to free the refereed literature for one and all online.
That is what self-archiving does.

The goal is NOT to prevent other versions of the refereed literature
from being sold, on-paper or on-line, if there is a market for them.
(Why would we want to do that?)

> So what were the publishers selling? Not peer review, because you
> can learn from Math Reviews where a paper has been published without
> subscribing to the journal. To a large extent the journal system was
> selling, and is still selling, stability and permanence.

Fine. Let it continue to do so (whether the stability/permanence is real
or merely imagined). As long as another version is online and free, the
goal is met.

> So that has been the fundamental question of open archival in
> mathematics for years. That is why some of the recalcitrant math
> publishers say that the arXiv is "just a preprint server" and not a
> "permanent e-print archive". Of course I don't agree with them; I
> choose the arXiv over subscription journals as the future route to
> permanent archival.

I'm afraid that this is not making sense to me. What is the argument?
That the jeering of some publishers nullifies the fact that that portion
of the refereed literature that has been freed is indeed free?

The substantive question is: Are the refereed papers online and free? If
they are, who cares if some people keep calling them "preprints," when in
reality they include both pre-refereeing preprints + post-refereeing
postprints (= eprints)?

But I sense another point of disagreement with Greg: Earlier he said
it's not the peer-review that makes people keep paying for the for-fee
(refereed) version despite the availability of the for-free (refereed)
version, but the "stability and permanence". Perhaps. But if the
implementation of the peer-review were no longer paid for by the
continued support for the publishers' version, perhaps the true value
and causal role of peer-review in all of this would become clearer.

Moreover, for now, it is not true stability/permanence that
distinguishes the publishers' for-fee version and the archives'
for-free version, but mere PERCEIVED stability/permanence.

With time, that may change. But for now it certainly isn't any reason to
deter us from self-archiving, either centrally or institutionally. On
the contrary; as long as the publishers' for-fee version is seen as the
guarantor of the stability/permanence, there is no reason whatever NOT
to SUPPLEMENT that with the self-archived free version -- without giving
the stability/permanence issue another thought!

> As a practical matter most of the institutional prepri

Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Fri, 3 Nov 2000, Stuart A Yeates wrote:

> > (3) The goal is to free the refereed literature, across
> > disciplines, now. Once the literature is thus freed the
> > process will be irreversible.
>
> Do you mean free as in liberty or free as in free beer ?
>
> This particular bone of contention has effectively split what used to be
> known as a free software movement, but is now known as the free software/open
> source movement.

Free in the way advertisements are free (which I suppose is more like
free beer -- when you're giving away your own home-brew).

But this refereed brew is definitely not free in the sense of "liberty"
(that would be the vanity press). It is constrained by and answerable to
peer review. Hence it is not relevantly like software either.

But once it successfully passes that quality-control process, and is
certified as such, the author can and should maximize the access to,
and hence the impact of this give-away refereed research by
self-archiving it online, free for all.

http://www.arl.org/sc/subversive/


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
 Computer Science                 fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM


Re: Central vs. Distributed Archives

2000-11-02 Thread Stuart A Yeates
Stevan Harnad wrote:

> (3) The goal is to free the refereed literature, across
> disciplines, now. Once the literature is thus freed the
> process will be irreversible.

Do you mean free as in liberty or free as in free beer ?

This particular bone of contention has effectively split what used to be
known as a free software movement, but is now known as the free software/open
source movement.


--stuart yeates 


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

> > what gives you the impression that
> > this Forum is trying to prevent companies from doing whatever they
> > like?
>
> What you said originally was:
>
>sh> The Elsevier policy of publicly archiving pre-refereeing preprints
>sh> could be a good first step towards the optimal and inevitable, but it
>sh> is also possible that it is intended as a Trojan Horse,...
>
> I think it's divisive to speculate that someone else's e-print archive is
> a Trojan Horse.  It's true that I'm not sure that the CPS is compatible
> with Elsevier's mission of maximizing profit.  But let's give it the
> benefit of the doubt.

Good. Both sides of the question have been aired.

(Please distinguish my actions as moderator, when I invoke cloture,
from the expression of my own views on this topic -- which carry no
more weight than anyone else's ex officio.)

Stevan Harnad


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

> I certainly think that a standard for interoperability could be useful,
> but it is wishful thinking to suppose that it can tame an anarchy of many
> tiny little e-print archives. In my discipline, when the literature
> is excessively decentralized, as it was entirely before 1998 and still
> largely is, neither authors nor readers have any confidence that papers
> floating around on the Net are permanent. And they are right, because
> no one could promise to keep those papers forever with any credibility.
> ... The fact that the arXiv is so large and so widely used and
> mirrored is a necessary ingredient for assuring permanence.

(1) Archives meeting the conditions to be registered OAI-compliant
data-providers are not likely to be "tiny little" ones (though it is no
problem if some of them are).

(2) Most Eprints Archives are likely to be university-based archives, for
all the university's refereed research, in all its disciplines. That's
hardly tiny (or impermanent) either.

(3) The goal is to free the refereed literature, across disciplines,
now. Once the literature is thus freed the process will be irreversible.

(4) The mechanisms for preserving and navigating it will continue to
evolve and improve, with the whole world's refereed assets in this
distributed basket (suitably mirrored, harvested, cached, backed up,
etc.).

(5) The immediate issue is hence not the PERMANENCE of the
self-archived drafts but their EXISTENCE, free for all, now. The
permanence will take care of itself.

> The "self" in self-archiving could mean individuals acting for themselves,
> or it could mean the research community acting for itself by directly
> supporting one or a few archives. I have the feeling that you don't
> see this as an important distinction.

You are right; I think it is a red herring. Most of the individuals in
question (the authors of the refereed literature) are researchers at
universities and research institutions. In principle each of them could
set up his own Eprints Archive and register it with the OAI (and that
would be fine as a start, and would free the literature irreversibly).

But of course the likely, practical strategy is for the researchers'
universities and research institutions (or, more specifically, their
libraries) to create and administer their institutional Eprint Archives
for all their researchers' refereed output, in all disciplines. (We can
have at least as abiding a faith in the durability of the collections
on universities' airwaves, then, as we now have in the durability of
the collections on their shelves).

> I can't say that this ambitious goal is "within immediate reach" in
> mathematics, because many of us have worked hard to make it happen and
> we see a lot of work ahead. We can't expect all mathematicians to change
> their minds in one day.

You are now talking about something else: You are talking about what it
will take to induce the research cavalry to drink, once they have been led
to the waters of self-archiving.

There's no second-guessing human nature, but my own hunch is that the
motivational structure at the researchers' own institution -- the one
that benefits from (and rewards) the impact of its own researchers'
refereed output, and the one that is today weighed down by the serials
crisis and the limitations that that puts on its own researchers'
access to the refereed output of researchers at other institutions --
may provide just the kind of local incentive for self-archiving that a
centralized, discipline based entity so far seems unable to provide.

In any case, these two routes to the liberation of the refereed corpus
(centralized and distributed) are complementary (and interoperable!).

> If you think that encouraging many small archives to spring up is the
> magic step, then I simply disagree. Because when we glued together
> many small archives into the math arXiv, the whole was much more than
> the sum of the parts. Even though the math arXiv has only 5% of new
> math papers, and even though it will take years for it to get to even
> 50%, it is at least growing more quickly than all of the Lilliputian
> mathematical archives put together.

I am not a mathematician, but this "whole is greater than the sum of its
parts" argument does not add up for me!

Centralized archiving in maths is at 5% and will take years to get to
50%. What possible reason would there be not to encourage complementing it
by institutional Eprint Archives immediately -- given that they will all be
co-harvested (and mirrored, and cached, etc.) in global virtual archives
anyway, thanks to interoperability?

> other disciplines are sufficiently different that their open archives
> might need separate administration. And that would lead to fragmentation,
> which concerns me more than it does you.

My concern is freeing the refereed literature online, now. There is no
reason it should stay hos

Re: Central vs. Distributed Archives

2000-11-02 Thread Michael L. Nelson
(note: I'm not sure this will get through all the aliases -- I don't think
this email addr is registered with the UPS list, for example)

On Thu, 2 Nov 2000, Steve Hitchcock wrote:

> NCSTRL was effectively the model for OAi. Greg Kuperberg suggests that
> NCSTRL has not been successful. It would be useful to have some meaningful
> measure of whether NCSTRL has been successful or not, and to hear the views
> of the NCSTRL developers (who are also involved in OAi). Maybe real
> evidence will yield clues to the ultimate destiny of OAi - central or
> distributed.
>

just a point of clarification:  NCSTRL was not directly the model for OAI,
at least architecturally.

OAI has more in common with:

- RePEc (http://www.repec.org/)
- SODA (http://www.dlib.org/dlib/march99/maly/03maly.html)

and similar architectures.

A subset of the Dienst protocol gave us a starting ground for defining a
harvesting protocol, but even that has been relaxed to allow Dienst and
OAI to progress independently.

Most OAI service providers will probably assume a distributed storage
model, because it is certainly easier to build.  But technically OAI is
agnostic with respect to centralized vs. distributed storage of data.
OAI focuses only on metadata.
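
As a rough illustration of the metadata-only point, here is a minimal
Python sketch. It is not the OAI harvesting protocol itself: the names
Record, DataProvider, ServiceProvider and fulltext_url are hypothetical,
and the example.org URLs are placeholders. It only shows that a service
provider can index harvested metadata without caring where the full
text is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str
    title: str
    authors: tuple
    fulltext_url: str  # may point at a central archive or a departmental server


class DataProvider:
    """Stands in for any archive that exposes its metadata for harvesting."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)

    def list_records(self):
        # Only metadata is handed over; the full text stays where it is.
        return list(self._records)


class ServiceProvider:
    """Builds one searchable index from the metadata of many archives."""

    def __init__(self):
        self.index = {}

    def harvest(self, provider):
        for record in provider.list_records():
            self.index[record.identifier] = record

    def search(self, word):
        word = word.lower()
        return sorted(
            r.identifier for r in self.index.values() if word in r.title.lower()
        )


if __name__ == "__main__":
    central = DataProvider("central", [
        Record("c:1", "Knot invariants", ("A. Author",),
               "http://central.example.org/1"),
    ])
    departmental = DataProvider("departmental", [
        Record("d:1", "Braids and knot theory", ("B. Writer",),
               "http://dept.example.org/1"),
    ])
    service = ServiceProvider()
    service.harvest(central)
    service.harvest(departmental)
    print(service.search("knot"))
```

The same ServiceProvider works whether the records come from one large
archive or from many small ones, which is the sense in which the
storage question is left open.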

Regarding centralized vs. distributed, I would submit CiteSeer

http://citeseer.nj.nec.com/cs

as an exemplary DL that seems to have resolved the tension between the two
models - providing both links to distributed copies and cached centralized
copies.

regards,

Michael

> The Harnad-Kuperberg dialogue has been fascinating but, to my mind, hasn't
> resolved the issue conclusively. It will be critical to understand what the
> user wants.
>
> Steve
>

---
Michael L. Nelson
207 Manning Hall, School of Information and Library Science
University of North Carolina    m...@ils.unc.edu
Chapel Hill, NC 27599   http://ils.unc.edu/~mln/
+1 919 966 5042 +1 919 962 8071 (f)


Re: Central vs. Distributed Archives

2000-11-02 Thread Stevan Harnad
On Thu, 2 Nov 2000, Greg Kuperberg wrote:

> 1) I have mixed feelings about the grass-roots connotations of the
> "Open Archives Initiative" and even more in Harnad's phrase
> "self-archiving".

You have to distinguish between the Open Archives Initiative (OAI) and
the "(Author/Institution) Self-Archiving (Sub-)Initiative."

OAI has now evolved into an initiative for shared standards and
interoperability in the metadata tagging of the contents of online
archives -- WHETHER OR NOT the contents (i.e., apart from the metadata)
of the archives are full-text or free: http://www.openarchives.org

A commercial publisher, for example, can establish an OAI-compliant
Open Archive as readily as any other institution or individual, and
would benefit from the increased visibility provided by the
OAI-compliant interoperability for the contents of the Archive, even if
the full-texts were kept behind an S/L/P financial firewall.

A journal publisher can also establish an OAI-compliant FREE Open
Archive, if they do wish to give away their full-text contents at this
time (as around 400 biomedical publishers are currently willing to do,
as indicated in a very recent posting:
http://www.freemedicaljournals.com
-- although most of those archives are not yet OAI-compliant).

Nor is the OAI particularly committed to either centralized,
discipline-based Open Archiving (e.g. ArXiv, CogPrints) or distributed,
institution-based Open Archiving (Eprints): It is developing
interoperability standards that apply to both, with the objective of
making the difference between them less significant, eventually perhaps
even irrelevant.

The (Author/Institution) Self-Archiving (Sub-)Initiative, however, is
SPECIFICALLY concerned with freeing the refereed research literature
through author/institution self-archiving (in OAI-compliant Open
Archives): http://www.eprints.org

> I do believe that the research literature should be
> electronic and free, and it is possible that each discipline must pass
> through an anarchic, do-it-yourself phase of open archival before
> moving on to a more organized stage.

It is not at all clear why you describe open archiving as "anarchic"!
It was precisely in order to put order into distributed online digital
archiving resources through interoperability that the OAI was
initiated!

And the other aspect of the order is the order already provided by the
refereed journals, in the form of peer review and its certification.
That order is medium-independent, and will be preserved in a
well-tagged Open Archive: "Journal-Name" will be a field, etc.

The only "do-it-yourself" issue is self-archiving itself. And the issue
is very clear: If researchers want the refereed literature freed, now,
then they can do it themselves, by self-archiving, now. Otherwise, they
have to wait until someone else (the journal publishers?) decides to
free it for them -- and that could prove to be a very long wait
indeed.

Harnad, S. (1999) Free at Last: The Future of Peer-Reviewed
Journals.  D-Lib Magazine 5(12) December 1999
http://www.dlib.org/dlib/december99/12harnad.html

> However, when I started archive work in mathematics, we already had an
> array of separate preprint servers cum e-print archives. The effort
> since then has been to reorganize much of this jumble into the math
> arXiv. Having many copies of one huge archive is superior to having
> many little archives, no matter how interoperable. Serious permanence
> and stability requires closer cooperation than that.

Again, it is a question of how long the researcher community is willing
to wait for the optimal and inevitable: It is now within immediate
reach to eliminate all the research access/impact-barriers, now,
through self-archiving. Interoperability will integrate the results
into a "global" Archive of the entire refereed research literature, in
all disciplines, as searchable as the Institute for Scientific
Information's Current Contents Database -- but including the full-texts
themselves (and free). (See ARC as a prototype and fore-taste of this
capability:  http://arc.cs.odu.edu/)

But note that arXiv-style centralized, discipline-based self-archiving
in Physics, the most advanced self-archiving on the planet -- with
130,000 archived papers in 10 years -- has only freed 30-40% of the
Physics literature so far, and will take 10 more years to free it all
at the present steady linear growth rate:
http://arXiv.org/cgi-bin/show_monthly_submissions

Note that I used to cite the above graph repeatedly as evidence that
the self-archiving cup is half-full. But it is also evidence that it is
still half-empty -- and taking another 10 years to fill.

So the idea is that distributed, pan-disciplinary, institution-based
self-archiving (OAI-compliant, of course) may be what is needed to get
this growth rate into the exponential range for Physics, as well as to
carry it over into all the other disciplines.

Of course multiple copies and mirroring (and harvesting and caching)
w

Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 09:29:24PM +, Stevan Harnad wrote:
> Centralized archiving has been with us for over 10 years, and at its
> current rates it will take 10 more years to free the Physics literature
> alone, where it is most advanced. In Greg's own field of mathematics,
> it might be going even more slowly. It looks to me as if centralized
> self-archiving can now use the help of distributed institutional
> self-archiving.

Actually the main difference in math is that we in effect started
later than physics did.  Part of the reason for that is that some of
the mathematicians involved, including me but not mainly me by any
means, instead devoted effort to "umbrella archive" projects (i.e.,
"global virtual archives") that ultimately failed.  We have had much
more success by moving in the opposite direction, i.e., by strengthening
distributed open archival with a centralized foundation.

> What Greg seems to overlook is that the institutional self-archiving he
> describes PRE-DATED the Open Archives Initiative (OAI), with its
> interoperability.

This is partly untrue.  The MPRESS project (http://mathnet.preprints.org/)
has a lot in common with OAI, and it was started before the universal
math arXiv.  It has its own metadata standard, "Dublin Core", and it
has a number of institutional preprint series among its data feeds.
But it hasn't yet caught on.  It doesn't seem to make much difference to
authors whether a preprint series is indexed by MPRESS or not.  Part of
the trouble with MPRESS is that not all of its sources are providing
as good metadata as they promised.  Ironically the lion's share of good
metadata in MPRESS comes from the math arXiv.

I would like to know where OAI thinks that MPRESS went wrong.  In fact
since I maintain a "service provider" for the math arXiv, I looked into
using OA-compliant metadata instead of the ad hoc metadata that I get from
the arXiv.  I discovered that the OA standard is an oversimplification
of the full arXiv metadata record, to the point that I can't use the
OA format.

But don't get me wrong.  I am in favor of fragmented interoperability if
you really can't hope for something better.  And as I said, the overall
STM literature might well have to be fragmented, for now, down to the
level of individual disciplines (e.g. chemistry) or small groups of
disciplines (physics+math+cs).
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 06:28:35PM +, Stevan Harnad wrote:
> (5) The immediate issue is hence not the PERMANENCE of the
> self-archived drafts but their EXISTENCE, free for all, now. The
> permanence will take care of itself.

That may be so in other disciplines, but it has not been true in
mathematics for several years.  In 1997, the year before the universal
math arXiv was started, there were already some 10 or 20 thousand research
papers freely available on the web.  Most of them were on personal
home pages, but thousands were in institutional and subject-based
preprint series.  Nonetheless the vast majority of these papers were
still eventually sold as published papers.

So what were the publishers selling?  Not peer review, because you
can learn from Math Reviews where a paper has been published without
subscribing to the journal.  To a large extent the journal system was
selling, and is still selling, stability and permanence.  So that has
been the fundamental question of open archival in mathematics for years.
That is why some of the recalcitrant math publishers say that the arXiv
is "just a preprint server" and not a "permanent e-print archive".
Of course I don't agree with them; I choose the arXiv over subscription
journals as the future route to permanent archival.

> But of course the likely, practical strategy is for the researchers'
> universities and research institutions (or, more specifically, their
> libraries) to create and administer their institutional Eprint Archives
> for all their researchers' refereed output, in all disciplines. (We can
> have at least as abiding a faith in the durability of the collections
> on universities' airwaves, then, as we now have in the durability of
> the collections on their shelves).

As a practical matter most of the institutional preprint series in
mathematics are at the department level.  At every university at which
I have studied or held an appointment, interdepartmental computer
services (a) are often mediocre, and (b) are often a one-size-fits-all
straitjacket.  I don't even like central campus e-mail.  In my view the
strength of university research is rooted in departmental independence.

So should we mathematicians trust individual math departments to
permanently preserve their e-prints?  I don't think so.  Our own math
preprint series at UC Davis is an arXiv overlay - all articles are
automatically contributed to the math arXiv.  One of my arguments for this
arrangement is that we can't promise to babysit these preprints forever.
We could easily forget our obligation.

> I am not a mathematician, but this "whole is greater than the sum of its
> parts" argument does not add up for me!

When we put together the universal math arXiv from its disparate parts,
submissions immediately jumped by 40% (as of December 1997).
Since then the math arXiv has grown more quickly than the subject-based
archives that were not pulled into the fold.  Take a look at the
submission statistics at my front end for the math arXiv:

http://front.math.ucdavis.edu/math

> 50%. What possible reason would there be not to encourage complementing it
> by institutional Eprint Archives immediately -- given that they will all be
> co-harvested (and mirrored, and cached, etc.) in global virtual archives
> anyway, thanks to interoperability?

As I said above, in math the institutional archives are there already (and
there are still a few separate subject-based archives).  They distract
authors as much as they encourage them.  In fact one of the serious
problems with the fragmented interoperability system is multiple
submissions.  Many authors like to advertise themselves by putting
their papers in more than one archive.  Or if a paper has four authors,
it could go to four archives because each one has a different favorite.
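
The multiple-submission problem described above can be made concrete
with a small Python sketch. It assumes each harvested record is a dict
with hypothetical keys "archive", "title" and "authors", and it matches
copies by a normalized title plus the set of author surnames. That
matching rule is an illustrative assumption, not a method used by
MPRESS, the arXiv, or any other service mentioned here.

```python
import re
from collections import defaultdict


def normalize_title(title):
    """Lowercase the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def duplicate_key(record):
    """Key that ignores author order and trivial title differences."""
    surnames = sorted(name.split()[-1].lower() for name in record["authors"])
    return (normalize_title(record["title"]), tuple(surnames))


def group_duplicates(records):
    """Return {key: archives} for every paper found in more than one place."""
    groups = defaultdict(list)
    for record in records:
        groups[duplicate_key(record)].append(record["archive"])
    return {
        key: sorted(archives)
        for key, archives in groups.items()
        if len(archives) > 1
    }


if __name__ == "__main__":
    harvested = [
        {"archive": "arXiv", "title": "On Knots, and Links",
         "authors": ["A. Author", "B. Writer"]},
        {"archive": "dept-series", "title": "On knots and links.",
         "authors": ["B. Writer", "A. Author"]},
        {"archive": "arXiv", "title": "Another paper",
         "authors": ["C. Third"]},
    ]
    for key, archives in group_duplicates(harvested).items():
        print(key, archives)
```

A rule this crude will miss retitled revisions and can merge distinct
papers that share a title and authors, so a harvesting service would
have to do real work to hide multiple submissions from readers.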

As for your vision of global virtual archives, that hasn't happened yet.
If you wait for that then you can't also assure us that the revolution
can take place immediately.  If we do have something to wait for, why
wait for an integrated facade with a fragmented foundation instead of
the other way around?

> A priori worries about distributed archiving alas belong to that long
> litany of prima facie rationales for inaction...

Obviously I'm not a conservative offering rationales for inaction.
And my worry is not "a priori".  NCSTRL and MPRESS are two long-standing
attempts at standards-based fragmented interoperability.  Neither one
has as much readership as the younger, fully integrated math arXiv.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On my other points:

On Thu, Nov 02, 2000 at 03:07:58PM +, Stevan Harnad wrote:
> I have, as moderator, terminated discussion on a few irrelevant or
> saturated topics (is there a conspiracy of university administrators to
> control researchers' intellectual property? is the library serials
> crisis simply a consequence of under-funding the libraries? how can we
> reform or abandon peer review?), but comments, whether supportive or
> critical, on the Forum's central theme -- "How to free the refereed
> literature online, now?" -- have never been suppressed.

You may see it as closing discussion of all sides of a topic, but I see
some character of closing down just one side of a debate.  Obviously you
are referring to Al Henderson's argument that free scholarly communication
is a stress response to penny-pinching by university administrations.
I'll grant that he has said that many times, and I'll also grant that the
argument sounds absurd to me.  (I am one of the researchers supposedly
bullied by the administration, and if anything my complaint is that
the higher-ups are biased in favor of the historical subscription-based
system.)  But even though I don't agree with him at all, he is no more
repetitive than you are or I am.  Invoking cloture strikes me as an
overreaction.

> I couldn't agree with you more! But what gives you the impression that
> this Forum is trying to prevent companies from doing whatever they
> like?

What you said originally was:

   The Elsevier policy of publicly archiving pre-refereeing preprints
   could be a good first step towards the optimal and inevitable, but it
   is also possible that it is intended as a Trojan Horse,...

I think it's divisive to speculate that someone else's e-print archive is
a Trojan Horse.  It's true that I'm not sure that the CPS is compatible
with Elsevier's mission of maximizing profit.  But let's give it the
benefit of the doubt.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
I have been skimming the September98 forum on and off for a few months.
As a cursory Internet search will demonstrate, I strongly support
what I consider "the Ginsparg model", especially in my own discipline,
mathematics.  I would call it the arXiv model.  But while I agree in
outline with Stevan Harnad et al, I disagree in some of the details.
(And that's where the devil is.) Here is my take on three issues in
particular.

1) I have mixed feelings about the grass-roots connotations of the "Open
Archives Initiative" and even more in Harnad's phrase "self-archiving".
I do believe that the research literature should be electronic and free,
and it is possible that each discipline must pass through an anarchic,
do-it-yourself phase of open archival before moving on to a more
organized stage.  However, when I started archive work in mathematics,
we already had an array of separate preprint servers cum e-print archives.
The effort since then has been to reorganize much of this jumble into the
math arXiv.  Having many copies of one huge archive is superior to having
many little archives, no matter how interoperable.  Serious permanence
and stability requires closer cooperation than that.

At the overall STM level the literature may have to be divided
into single-discipline or few-discipline fragments for some time.
The Los-Alamos based arXiv works well for the TeX-based e-print culture
in mathematics, physics, and parts of computer science.  But it is
not clear how to extend that particular system to the rest of science.
If you have to have disjoint archives, fragmented interoperability is
then a good goal to work towards.  But you have to realize that it is
only a partial solution.  And I have reservations about encouraging every
tenth researcher to set up yet another archive, because that can lead to
entrenched Lilliputian fiefdoms of e-prints.  By my standards the physics
part of the arXiv, with 130,000 e-prints, is large; the math arXiv,
with 13,000, is medium-sized; and an archive with 1,300 or less is tiny.

2) I have been accused, sometimes correctly, of being overzealous in
my support of the arXiv.  I see that Stevan Harnad has about as much
enthusiasm as I do, and I can't criticize that.  But if the September98
forum has strong advocacy in favor of open archives, it doesn't make sense
to limit criticism.  Because then you're just preaching to the choir.
If you don't want to debate whether or not open archives are a good idea,
maybe that makes sense.  But then you shouldn't dwell on how fantastic
open archives are; instead you should steer the discussion to practical
plans.

3) I also can't criticize Elsevier's Chemistry Preprint Server project.
In a way I can't even criticize commercial publishers with high journal
prices, even though I believe that the mathematical literature should
be free.  A for-profit company is entitled to maximize profit.  If it is
publicly traded, it is legally required to do so up to a point.  (By the
same token, the customer, academia, is entitled to minimize expenses.)
I'm against Napster-style copyright infringement and I have mixed
feelings about journal boycotts.  My approach is less confrontational.
My own recent papers lie permanently in the arXiv, I keep the copyright,
and I will publish in any journal that wants the papers on those terms.

From this point of view, I am not sure about the Chemistry Preprint
Server, because I don't see the business model for it.  But then, I
don't see the business model for Google either, and I think that Google
is great.  It is possible that the Chemistry Preprint Server will be
an important gift from Elsevier to the chemistry research community.
Arguably the chemists should have done it for themselves, but maybe they
lack leadership and need Elsevier to do it for them.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-02 Thread Greg Kuperberg
On Thu, Nov 02, 2000 at 03:07:58PM +, Stevan Harnad wrote:
> It is not at all clear why you describe open archiving as "anarchic"!
> It was precisely in order to put order into distributed online digital
> archiving resources through interoperability that the OAI was
> initiated!

I certainly think that a standard for interoperability could be useful,
but it is wishful thinking to suppose that it can tame an anarchy of many
tiny little e-print archives.  In my discipline, when the literature
is excessively decentralized, as it was entirely before 1998 and still
largely is, neither authors nor readers have any confidence that papers
floating around on the Net are permanent.  And they are right, because
no one could promise to keep those papers forever with any credibility.
Any given paper could be erased accidentally if it is in one tiny
archive somewhere.  Or maybe the maintainer of that particular archive
never explicitly promised permanence anyway; if so he could shut down
his archive when he gets tired.  The fact that the arXiv is so large
and so widely used and mirrored is a necessary ingredient for assuring
permanence.

> The only "do-it-yourself" issue is self-archiving itself. And the issue
> is very clear: If researchers want the refereed literature freed, now,
> then they can do it themselves, by self-archiving, now.

The "self" in self-archiving could mean individuals acting for themselves,
or it could mean the research community acting for itself by directly
supporting one or a few archives.  I have the feeling that you don't
see this as an important distinction.  I'll give you an analogy to show
you what I mean.  I use Linux, which is an open, standards-based operating
system.  It would be absurd to call my use of Linux "self-programming",
even though Linux is maintained by some of its users.  I see the arXiv as
highly analogous to Linux.  This is why I am reluctant to use the phrase
"self-archiving".

> Again, it is a question of how long the researcher community is willing
> to wait for the optimal and inevitable: It is now within immediate
> reach to eliminate all the research access/impact-barriers, now,
> through self-archiving.

I can't say that this ambitious goal is "within immediate reach" in
mathematics, because many of us have worked hard to make it happen and
we see a lot of work ahead.  We can't expect all mathematicians to change
their minds in one day.  I have no desire to believe, as I once did,
that the exponential rocket is about to blast off.

If you think that encouraging many small archives to spring up is the
magic step, then I simply disagree.  Because when we glued together
many small archives into the math arXiv, the whole was much more than
the sum of the parts.  Even though the math arXiv has only 5% of new
math papers, and even though it will take years for it to get to even
50%, it is at least growing more quickly than all of the Lilliputian
mathematical archives put together.

> > The Los-Alamos based arXiv works well for the TeX-based e-print culture
> > in mathematics, physics, and parts of computer science. But it is not
> > clear how to extend that particular system to the rest of science.
>
> Why? This formula has been repeated so many times that people are
> actually believing it, without anyone ever having explained why it
> should be thought to be true!

I don't mean to say that other disciplines can't have an open archive
that's *like* the arXiv.  I certainly think that they can.  I mean that
other disciplines are sufficiently different that their open archives
might need separate administration.  And that would lead to fragmentation,
which concerns me more than it does you.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

1999-06-29 Thread Stevan Harnad
On Tue, 29 Jun 1999, J.W.T.Smith wrote:

> I don't see what is 'papyrocentric' about... the idea of an item
> gaining some kudos by being in a certain archive...
> A similar situation occurs when a journal
> gains kudos from being indexed in a specific online bibliographic database.
> No paper involved here.

Forget about indexing in databases. (If the primary journal publishers
need to think about what their new niche will be in the online world of
free, full-text self-archiving by authors, the secondaries and
tertiaries will unfortunately have more serious worries!)

The kudos comes from (P) the prestige (peer-review rigour, quality,
impact factor) of the journal that accepts the paper and (I) the impact
that it makes on research, in the form of further work citing and
building upon it. The potential impact will be made incomparably
greater by free online access for one and all.

Where is the papyrocentric thinking? In the thought that the paper's
"locus" on the Web is the source of the kudos (as the source of a paper
paper's kudos was the paper journal in which it appeared).

The accepting journal's imprimatur will shrink to a quality control
metadata tag, like a brand-name; the locus (virtual or real) of the
bytes will be of no consequence whatsoever.

Papyrocentric too is the idea that there is something to compete for in
being the "locus" of a paper. Nothing to sell, nothing to compete for.

> The subject specific archive seems an unnecessary complication. Applying
> Occam's razor it seems we can chop it off and the system can run happily
> without it.

The Los Alamos Archive has demonstrated that (at least in Physics), the
"centralized" end of the candle managed to free the literature before
the distributed end did. Occam says: Hedge your bets and do both:
Deposit in your local server AND the global one.

Harnad, S. (1998) On-Line Journals and Financial Fire-Walls. Nature
395: 127-128. http://www.ecs.soton.ac.uk/~harnad/nature.html

"All authors should continue to entrust their work to the paper
journals of their choice. But if, in addition, they were to
publicly archive their pre-refereeing preprints and then their
post-refereeing reprints on-line on their Home Servers, for free
for all, then the de facto practises of the reader community would
take care of the rest (irrespective of their reservations about
bed/bath/beach reading); library serial cancellations, the collapse
of the paper cardhouse, publisher perestroika, and a free for all,
e-only serial corpus financed by author-end page charges would soon
follow suit.

"A centralised variant of this subversion scenario,
http://xxx.lanl.gov, has already passed the point of no return in
Physics and some allied disciplines in the form of Paul Ginsparg's
(1994, 1996) U.S. NSF- (National Science Foundation) and DOE-
(Department of Energy) supported Physics Eprint Archive at Los
Alamos National Laboratory; as history will confirm, he
single-handedly set the world Learned Community on its inexorable
course toward the optimal and the inevitable in August 1991."

> js> Why don't you drop the word 'journal' then? Why not use 'validator' or
> js> some other word that indicates the role and doesn't carry over
> js> connotations from the old "papyrocentric" model?
> >
>sh> Suit yourself. But I think "Physical Review Letters" will continue to
>sh> prefer to call itself by its current familiar and trusted brand name --
>sh> and why on earth shouldn't it?
>
> I'm not saying we shouldn't have "Physical Review Letters" (or any other
> title) just that in the new model we should stop calling it a 'journal'.

Suit yourself. Maybe we should stop calling the contents "articles" too.
But what's the point?

> The problem with the word 'journal' is that it carries connotations from
> the "papyrocentric" world. For example - the idea that an item can only be
> in one 'journal'.

And a good connotation too! We have already gone round this one before:

Referees are a scarce and overworked resource. There is no justification
for asking anyone to referee an already-refereed, already-accepted
paper yet again, for acceptance yet again, elsewhere.

See the two reasonable sources of kudos above: (P) is acceptance by a
peer-reviewed Journal; (I) (and more important) is "acceptance" by
one's peers through the paper's impact on their reading, research and
citations. No more need for infinite rounds of peer reviewing and
re-reviewing. Otherwise it's like going back to school for more and more
exams instead of getting on with it! We haven't the time or the manpower
for such an orgy of endless assessment (even in the UK!).

> This does not need to be the case in a net-based model.
> Your descriptions of your model seem to contain a "papyrocentric"
> influence since there still seems to be a close relationship between an
> item and the 'journal' that validates it.

There is nothing papyrocentric about quality

Re: Central vs. Distributed Archives

1999-06-29 Thread J.W.T.Smith
Professor Harnad,

On Tue, 29 Jun 1999, Stevan Harnad wrote:

> On Tue, 29 Jun 1999, J.W.T.Smith wrote:
>
> > A monopoly in the sense that it could become 'the place' where readers
> > look for items relevant to their subject. The non-presence of an article
> > in a recognised subject specific archive could imply it is not relevant to
> > the subject. More on this later.
>
> Papyrocentric thinking. We live in the era of metadata tagging and
> search engines that trawl it all.

I don't see what is 'papyrocentric' about this since the idea of an item
gaining some kudos by being in a certain archive has no necessary
connection to the paper world. A similar situation occurs when a journal
gains kudos from being indexed in a specific online bibliographic database.
No paper involved here.

> > Once an archive (or its mirrors) is seen as 'the place'
> > to search for items of interest and access to that archive can be
> controlled it might be tempting to place some restriction on access like
> > payment of a fee (for purely reasonable reasons like getting enough money
> > to maintain the archive).
>
> A lot of other networked services are likely to get a price tag before
> the tiny refereed literature archive is likely to: It is the flea on the
> tail of the dog, and we will all be best served if it is given a free
> ride. Again, this worry is papyrocentric and misplaced.
>

Again I don't see why this is 'papyrocentric'. It may be paranoid but it
is not 'papyrocentric' :-) .

> > Now I know the actual quality control/validation
> > is provided elsewhere (maybe by the 'old' journals, maybe by other
> > players) but from the point of view of the author they may also need to be
> > in the archive as well as have the validation/stamp of approval of an
> > external organisation.
>
> This sentence was a bit difficult to decode, but from what I can make of
> it, one entity (the established journals -- why on earth not?) can
> continue to do the quality controlling and certification-tagging, and
> another (new, virtual) one, the Archive, can provide free access to the
> texts.
>
> What is the problem?

The subject specific archive seems an unnecessary complication. Applying
Occam's razor it seems we can chop it off and the system can run happily
without it.

> > Why don't you drop the word 'journal' then? Why not use 'validator' or
> > some other word that indicates the role and doesn't carry over
> > connotations from the old "papyrocentric" model?
>
> Suit yourself. But I think "Physical Review Letters" will continue to
> prefer to call itself by its current familiar and trusted brand name --
> and why on earth shouldn't it?

I'm not saying we shouldn't have "Physical Review Letters" (or any other
title) just that in the new model we should stop calling it a 'journal'.
The problem with the word 'journal' is that it carries connotations from
the "papyrocentric" world. For example - the idea that an item can only be
in one 'journal'. This does not need to be the case in a net-based model.
Your descriptions of your model seem to contain a "papyrocentric"
influence since there still seems to be a close relationship between an
item and the 'journal' that validates it. There is no reason why an item
could not be validated by more than one validator - especially if it
crosses current subject boundaries.

John Smith,
University of Kent at Canterbury, UK.


Re: Central vs. Distributed Archives

1999-06-29 Thread Stevan Harnad
On Tue, 29 Jun 1999, J.W.T.Smith wrote:

> A monopoly in the sense that it could become 'the place' where readers
> look for items relevant to their subject. The non-presence of an article
> in a recognised subject specific archive could imply it is not relevant to
> the subject. More on this later.

Papyrocentric thinking. We live in the era of metadata tagging and
search engines that trawl it all.
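
As a rough illustration of that point (a minimal sketch only, not any
archive's actual software), the following merges metadata records from a
"central" and a "local" archive into one index and searches it by subject
tag. The Record fields, the example.org URLs and the journal name
"Hypothetical Journal" are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One metadata record; every field name here is hypothetical."""
    identifier: str       # stable ID for the paper, independent of host
    title: str
    subject: str          # subject tag, e.g. "physics"
    journal: str          # the certifying journal (the quality "brand")
    locations: frozenset  # URLs where copies of the full text live


def merge(records):
    """Build one index keyed by identifier, pooling all known locations."""
    index = {}
    for rec in records:
        seen = index.get(rec.identifier)
        if seen is None:
            index[rec.identifier] = rec
        else:
            # Same paper deposited in more than one place: keep one record
            # and take the union of the locations.
            index[rec.identifier] = Record(
                seen.identifier, seen.title, seen.subject, seen.journal,
                seen.locations | rec.locations,
            )
    return index


def search(index, subject):
    """Return records carrying the subject tag, wherever the bytes are."""
    hits = (r for r in index.values() if r.subject == subject)
    return sorted(hits, key=lambda r: r.identifier)


# Hypothetical sample data: one paper deposited both centrally and locally.
central = [
    Record("paper-1", "A physics paper", "physics", "Physical Review Letters",
           frozenset({"https://central.example.org/paper-1"})),
]
local = [
    Record("paper-1", "A physics paper", "physics", "Physical Review Letters",
           frozenset({"https://university.example.org/~author/paper-1"})),
    Record("paper-2", "A biology paper", "biomed", "Hypothetical Journal",
           frozenset({"https://university.example.org/~other/paper-2"})),
]
index = merge(central + local)
```

In this sketch a subject search finds paper-1 once, with both of its
locations attached; which server a copy sits on plays no part in the
lookup.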

> I am not concerned with its availability, I am
> concerned with the implied validation of the presence of an item in a
> given archive.

Don't be. The validator is the journal, as it always was. The Archive is
only the free cosmic bookshelf in the Sky...

> Even if the archive is mirrored it is a mirror of somewhere
> and the address of that somewhere has value. If this has no value why do
> we need an archive at all? Why don't we all mount our papers on our
> University servers?

We should! That was the gist of my 1994 Subversive Proposal:

http://www.arl.org/sc/subversive/

But there are currently still interoperability problems with
institutional servers, so the colossal success of Los Alamos has shown
that we will reach the optimal and inevitable faster by taking both
routes, the centralised and the distributed one:

http://xxx.lanl.gov/cgi-bin/show_monthly_submissions

> There are two advantages that I can see of a subject
> specific archive:
>
> - It can be properly maintained (it is a true archive)
> - It can be a 'one stop shop' of where to look for items on a specific
>   subject.
>
> I have no problem with the first role. It is the second that carries the
> possibility of monopoly. As long as the archive is maintained by a neutral
> organisation (like a large University) this is OK but what if it should
> become privatised?

EVERYTHING runs the risk of being "privatized": Universities, Los
Alamos, NIH. Fighting against the privatization-frenzy in whose grip the
entire planet seems to be at the moment is a worthy enough mission, but
it is completely irrelevant to the centralization/monopoly red herring
that I believe you are preoccupied with -- for the simple reason that
the menace of privatization is completely nonspecific, and afflicts ALL
options, in principle.

In practice, I would not worry too much about a hostile take-over of NIH
by the private sector in the near future, nor about NSF tossing the
Los Alamos Archive to the Trade Winds. Besides, one of the STRENGTHS of
"centralization" is that the authors that have put their precious eggs
in the collective basket and the users who forage them tend to monitor
them zealously day and night, and are likely to squawk vociferously if
they sense any threat:

Taubes, Gary.  E-mail withdrawal prompts spasm. (temporary
shut-down of Los Alamos Laboratory e-print archives succeeds in
raising funds) Science v262, n5131 (Oct 8, 1993):173 (2 pages).

ABSTRACT: Paul Ginsparg shut down the e-print archives of Los
Alamos National Laboratory, the physicists' pre-publication
bulletin board for a few days.  The closure incited users to
petition the Department of Energy and National Science Foundation
for funds and secured official funding from Los Alamos.

> Once an archive (or its mirrors) is seen as 'the place'
> to search for items of interest and access to that archive can be
> controlled it might be tempting to place some restriction on access like
> payment of a fee (for purely reasonable reasons like getting enough money
> to maintain the archive).

A lot of other networked services are likely to get a price tag before
the tiny refereed literature archive is likely to: It is the flea on the
tail of the dog, and we will all be best served if it is given a free
ride. Again, this worry is papyrocentric and misplaced.

> Now I know the actual quality control/validation
> is provided elsewhere (maybe by the 'old' journals, maybe by other
> players) but from the point of view of the author they may also need to be
> in the archive as well as have the validation/stamp of approval of an
> external organisation.

This sentence was a bit difficult to decode, but from what I can make of
it, one entity (the established journals -- why on earth not?) can
continue to do the quality controlling and certification-tagging, and
another (new, virtual) one, the Archive, can provide free access to the
texts.

What is the problem?

> > As I have noted before, this central/distributed issue is a red
> > herring, based in part on papyrocentric thinking (we are in reality
> > talking about a distributed virtual library where locus has little
> > meaning)
>
> You seem to contradict yourself here. If 'locus' (I don't mean physical
> position) has no meaning why do we need a Physics archive, or a Biomed
> archive, or any other subject archive? Why can't we either have one
> universal archive which simply stores and serves on request (at no cost
> and forever) any item sent to it, or no archive at all with items being
> stored on a user site or a Universit

Re: Central vs. Distributed Archives

1999-06-29 Thread J.W.T.Smith
Professor Harnad,

On Mon, 28 Jun 1999, Stevan Harnad wrote:

> On Mon, 28 Jun 1999, J.W.T.Smith wrote:
>
> > My objection to the Los Alamos Archive model is that it is centralised and
> > such a model can easily degenerate into a monopoly.
>
> A monopoly of what PRODUCT, on behalf of what PROVIDER relative to what
> MARKET? For Los Alamos is in the (government-supported) "business" of
> making it possible for authors to give away reports of their own
> scientific research to one and all for free.

A monopoly in the sense that it could become 'the place' where readers
look for items relevant to their subject. The non-presence of an article
in a recognised subject specific archive could imply it is not relevant to
the subject. More on this later.

> And what do you mean "centralised"? Los Alamos is open to one and all,
> reader and author alike, the world over; it is mirrored in 15
> countries, cached in who knows how many other places and ways,
> incorporated into further Gateways such as NCSTRL and Spires, and there
> integrated with other archives. Anyone else can make copies of the
> archive too (that's part of what making the "product" free entails), and
> the authors who self-archive in it are encouraged to archive their
> papers elsewhere too, if they wish, including in their own
> institutional servers, which can then be gathered together as another
> backup of the "central" archive.

You are missing the point. I am not concerned with its availability, I am
concerned with the implied validation of the presence of an item in a
given archive. Even if the archive is mirrored it is a mirror of somewhere
and the address of that somewhere has value. If this has no value why do
we need an archive at all? Why don't we all mount our papers on our
University servers? There are two advantages that I can see of a subject
specific archive:

- It can be properly maintained (it is a true archive)
- It can be a 'one stop shop' of where to look for items on a specific
  subject.

I have no problem with the first role. It is the second that carries the
possibility of monopoly. As long as the archive is maintained by a neutral
organisation (like a large University) this is OK but what if it should
become privatised? Once an archive (or its mirrors) is seen as 'the place'
to search for items of interest and access to that archive can be
controlled it might be tempting to place some restriction on access like
payment of a fee (for purely reasonable reasons like getting enough money
to maintain the archive). Now I know the actual quality control/validation
is provided elsewhere (maybe by the 'old' journals, maybe by other
players) but from the point of view of the author they may also need to be
in the archive as well as have the validation/stamp of approval of an
external organisation.

> As I have noted before, this central/distributed issue is a red
> herring, based in part on papyrocentric thinking (we are in reality
> talking about a distributed virtual library where locus has little
> meaning)

You seem to contradict yourself here. If 'locus' (I don't mean physical
position) has no meaning why do we need a Physics archive, or a Biomed
archive, or any other subject archive? Why can't we either have one
universal archive which simply stores and serves on request (at no cost
and forever) any item sent to it, or no archive at all with items being
stored on a user site or a University site or a commercial site (or all
three or some other option/permutation)?

> Stop thinking in terms of a reader-end "product," with competition
> among access-blockers, and think instead in terms of a platform for
> author-end "freebies," with collaboration among access-providers, and
> things will come into better focus. This is the refereed journal
> literature, not trade books or magazines.

You are preaching to the converted. I have been aware the trade model is
wrong for academic publishing for many years. There have been proposals to
replace this model going back to the 1920s or before. Nothing new here.

> > Summary: It is possible to escape the problems of the 'trade model' of
> > current academic publishing without running headlong into the possibly
> > equally constraining model of a monopolistic central archive.

Yes. Change the vocabulary.

Why don't you drop the word 'journal' then? Why not use 'validator' or
some other word that indicates the role and doesn't carry over
connotations from the old "papyrocentric" model?

John Smith,
University of Kent at Canterbury, UK.