Re: eprints and authentication

2000-11-09 Thread Rzepa, Henry
Why should a pdf be locked? Getting away from the idea that work is always
on paper says to me that it should not be read-only *at the user end*. The
emerging means of authentication described by Adrian should be an excellent
way forward, but why the need to lock as well?

I ask because for projects such as ours, which involves adding third-party
reference links to pdf documents, locking is not insurmountable but is
against the principle of what we are trying to demonstrate.


I prefer to use the term "signed". This authenticates the document
(or a fragment of it) but does not prevent others from re-using
it (although the original signature is invalidated if they
quote it with changes).
A document can, of course, be signed many times by many people.
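
As a minimal sketch of the difference between signing and locking, the
snippet below uses an HMAC from Python's standard library as a stand-in
for a real public-key signature. That substitution is an assumption made
only to keep the example self-contained, and the signer names, keys and
document text are hypothetical. It shows two parties signing the same
text, and a signature no longer verifying once the text is quoted with
changes.

```python
import hashlib
import hmac


def sign(document: bytes, key: bytes) -> str:
    """Return a hex digest binding `document` to the holder of `key`.

    HMAC-SHA256 is used here as a simplified stand-in for a
    public-key signature scheme (an assumption for illustration).
    """
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def verify(document: bytes, key: bytes, signature: str) -> bool:
    """Check that `signature` matches `document` under `key`."""
    return hmac.compare_digest(sign(document, key), signature)


# Hypothetical document and signers.
original = b"Reference links added to the PDF text."
keys = {"alice": b"alice-secret", "bob": b"bob-secret"}

# The same document can carry many signatures from many people.
signatures = {name: sign(original, key) for name, key in keys.items()}

# Anyone remains free to re-use the text, but an altered copy
# no longer matches the original signature.
quoted_with_changes = original + b" [edited]"

print(verify(original, keys["alice"], signatures["alice"]))
print(verify(quoted_with_changes, keys["alice"], signatures["alice"]))
```

Nothing in the sketch prevents copying or editing the text; it only lets
a reader detect that an edited copy is no longer what was signed.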

I am convinced that as we move into an information anywhere and
from anywhere era, the need to know which bit came from where
and when becomes essential.
--

Henry Rzepa. +44 (0)20 7594 5774 (Office) +44 (0)20 7594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7  2AY, UK.
http://www.ch.ic.ac.uk/rzepa/


Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Wed, 8 Nov 2000, Greg Kuperberg wrote:

 While libraries certainly should help preserve e-prints, I do not trust
 any one library, nor any other sole institution, to archive material
 single-handedly. Any caretaker can lose or destroy a unique copy of
 any document...  That is why it is important to redundantly and
 openly mirror an archive and not just allow third-party searches. The
 arXiv has 18 mirror sites on six continents

Who is disagreeing with this? All requisite redundancy is just as
desirable, and feasible, and inevitable, with institution-based
distributed archiving as with discipline-based archiving.

I think there is an incorrect analogy at the heart of Greg's frequent
use of the term "fragmented" in speaking about the institution-based
approach to self-archiving:

I think Greg continues to equate (1) archiving with publishing, and
(2) institutional digital collections with localized books-on-shelves
(ripe for a Library-of-Alexandria catastrophe; hence his example of the
lost/destroyed unique document). And (3) (unrefereed, unpublished)
PREprints continue to be treated as the paradigm for it all, whereas
it is much more informative and representative to see it in terms of
(refereed, published) POSTprints: We are, after all, aiming at freeing
the REFEREED literature -- with the prepublication embryological stages
merely an added bonus, rather than the focus of it all.

So, to summarize: Whilst our refereed papers are already, as they are,
safely in the hands of journals and libraries, blissfully mirrored
(though unblissfully unfree), we need not fret about Alexandria.
Freeing a postprint (sic) via self-archiving (whether central or
institutional, interoperable or not) is a bonus, a plus, a freebie, a way
to make it accessible to those multitudes worldwide who cannot access
it because of the S/L/P firewalls surrounding the safe, Alexandria
versions.

It is inviting Zeno's Paralysis (again) to say: Keep waiting till you
have an Alexandria-proof centralized, mirrored, redundant arXiv-style
Archive to self-archive them in before you dare to self-archive your
(already safely mirrored) postprints.

Nay! Release them from their hostagehood behind obsolete,
impact-blocking, and completely surmountable access barriers online
today through self-archiving, addict fellow-researchers the world over
to that new, free form of access to it all, and the redundancies and
mirrors will come tomorrow, in plenty of time to keep the freed corpus
aloft in the skies. (And nothing is at risk: the firewalled version
remains as safe -- from catastrophic loss as well as illicit access --
as it ever was.)

If that is now transparent for postprints, it should be equally
transparent that the same applies to preprints: They are destined to
become postprints (hence secure, for the above reasons) anyway. Being
available online early is a bonus; a freebie. Moreover, it is a bonus
that has no prior history of enjoying the safe/secure status of
postprints anyway: access to preprints was always restricted and
evanescent, destined to be superseded by the secure postprint once it
was available.

Now the redundancy and mirroring that will be accorded the freed
postprint corpus, once it is freed, will also be inherited by the
preprint corpus.

So there is nothing to lose, and everything to be gained, by
self-archiving all preprints and postprints now, in either the
centralized OAI-compliant (http://www.openarchives.org) archives like
arXiv (http://arXiv.org), or in institutional OAI-compliant archives,
like Eprints (http://www.eprints.org).

Ignore Cassandras: Preservation problems are eminently soluble, once
the goods are up there: the real problem now is how to get researchers
to put them up there, at long last. Central archives have gone part of
the distance but are proving too slow. Institutional archives are natural
allies in hastening us on the road to the optimal and inevitable.

 As a rule, it is better for web sites to share the same archive than
 to each have fragments. It is better for Oxford and Cambridge to
 each have all of Shakespeare's plays than for Oxford to have only the
 comedies and Cambridge to have only the tragedies. That is why I favor
 shared interoperability, which is in some ways centralized, to fragmented
 interoperability, which is optimistically called decentralized. Massive
 redundancy is one of the few strengths of the existing paper-based system;

I am not an expert on digital storage, coding or preservation, but I am
not at all sure that Greg is technically right above (and I'm certain
that the Oxford/Cambridge hard-copy analogy is fallacious). I would
like to hear from specialists in localized vs. distributed digital
coding, redundancy, etc. -- bearing in mind that in the case of the
refereed literature, this is all moot anyway, because free access now
is infinitely preferable to no access, no matter how short-lived it
risks being. The locus classicus is still safely ensconced behind the
toll 

Re: Exponential growth

2000-11-09 Thread Stevan Harnad
On Wed, 8 Nov 2000, Greg Kuperberg wrote:

 Maybe you want to say more conservatively that new submissions should be
 superlinear, i.e., concave up.

Yes, yes, that's it.

(And that's: new self-archived eprint (whether pre- or post-), NOT
new submission. Submission is for journals. Self-archiving is better
described as a deposit.)

 And maybe instead of asymptotics you are interested in the
 short term.  In that case the right way to say it is that you open
 archiving should grow faster in the near term.

Yes, it should go concave up, steeply, until the entire (finite)
current refereed corpus is up there, online and free.

And I do mean steeply. There is no reason it should not all have been
up there, freed, yesterday, so certainly no reason to drag it out for
another decade.

As to asymptotics: I am referring to the current refereed corpus;
this annual corpus is finite though also itself growing somewhat
annually, but not nearly so fast as to require my refining the shape of
the curve: the sharp concave up covers it all...
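
One way to make "concave up" concrete, offered only as a sketch: for
counts of new self-archived eprints per period, the series is concave up
when every second difference is positive, i.e. each period's increase is
larger than the last. The function names and both sample series below
are hypothetical, not arXiv or Eprints figures.

```python
from typing import List, Sequence


def second_differences(counts: Sequence[float]) -> List[float]:
    """Discrete second derivative: c[i+2] - 2*c[i+1] + c[i]."""
    return [
        counts[i + 2] - 2 * counts[i + 1] + counts[i]
        for i in range(len(counts) - 2)
    ]


def is_concave_up(counts: Sequence[float]) -> bool:
    """True if every second difference is strictly positive."""
    diffs = second_differences(counts)
    return len(diffs) > 0 and all(d > 0 for d in diffs)


# Hypothetical deposits per period.
linear = [100, 200, 300, 400, 500]        # constant increase
accelerating = [100, 210, 340, 500, 700]  # growing increase

print(second_differences(linear))        # [0, 0, 0]
print(second_differences(accelerating))  # [20, 30, 40]
print(is_concave_up(linear), is_concave_up(accelerating))
```

A linear series has second differences of zero, so it fails the test;
only a series whose increments keep growing passes.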



Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
Computer Science                  fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):


http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

american-scientist-open-access-fo...@amsci.org


Re: Central vs. Distributed Archives

2000-11-09 Thread Thomas Krichel
  Greg Kuperberg writes

 But I disagree entirely with the claim that distributed
 interoperability has never been tried before.  It has been tried several
 times, whole-heartedly with these two projects:

 MPRESS - mathnet.preprints.org
 NCSTRL - ncstrl.org

 And it has been a factor in many other projects, including Hypatia
 and the AMS preprint server.  Some of these projects are more
 successful than others, but *all* of them suffer from inconstancy
 of the underlying archives.

  The largest project that has been done with distributed
  interoperability is RePEc. RePEc catalogs 11 items now.
  While there is the occasional case of an archive becoming
  obsolete, out of about 140 archives I think only 5 have become obsolete,
  i.e. have been moved to a place outside the original archive
  maintainer's control. Thus while it is a problem, it is not a major one.
  It is by far outweighed by other advantages, such as distributed costs,
  minimum quality control, and wide community participation.

  Cheers,

  Thomas Krichel http://openlib.org/home/krichel
 RePEc:per:1965-06-05:thomas_krichel

  2000-10-05 to 2001-01-06:
  Institute for Economic Research / Hitotsubashi University
  2-1 Naka / Kunitachi / Tokyo 186-8603 / Japan / +81(0)42 580 8349
  tho...@micro.ier.hit-u.ac.jp


Re: Central vs. Distributed Archives

2000-11-09 Thread David Goodman
Steve, I think you misunderstand Greg's concern (and mine). We do not
disagree with what you want to do; we want to add to it. We are
assuming, I think, that something similar to the plan you advocate will
be the basic process.

I do not think it enough to say distributed=secure. It's only the first
step to security. In addition to being distributed, there also needs to
be a reliable caretaker--not just to do the housekeeping, but to ensure
that the archive is kept compatible with changing technology.

I suggested that the archives be organized redundantly both by
discipline and by university (and possibly by geographic/political
entity, as well as what anyone wants to do).

There are undoubtedly well-organized academic departments that can do this.
There are also academic departments that cannot be relied on to do this right,
because of size, interest, or finances. The same goes for professional
societies. Certainly no individual can be relied on: all humans are mortal.
All of this goes as well for refereed as for unrefereed, preprint as for
reprint, officially published as for unpublished.

As a librarian, I do not assume it is good enough that
 our refereed papers are already, as they are,
 safely in the hands of journals and libraries, ...

There are very few library copies of many journals, and though there is
excellent backup from national libraries, even their collections are
incomplete. The literature published up to now will be much more secure when
it too has been digitized and placed on free publicly available mirrored
servers, with all the additional precautions. Besides security, this will also
make them generally available with all the additional advantages of plans such
as yours.


Re: Central vs. Distributed Archives

2000-11-09 Thread Tim Brody
  Greg:
  As a rule, it is better for web sites to share the same archive than
  to each have fragments. It is better for Oxford and Cambridge to
  each have all of Shakespeare's plays than for Oxford to have only the
  comedies and Cambridge to have only the tragedies. That is why I favor
  shared interoperability, which is in some ways centralized, to fragmented
  interoperability, which is optimistically called decentralized. Massive
  redundancy is one of the few strengths of the existing paper-based system;

 Stevan:
 I am not an expert on digital storage, coding or preservation, but I am
 not at all sure that Greg is technically right above (and I'm certain
 that the Oxford/Cambridge hard-copy analogy is fallacious). I would
 like to hear from specialists in localized vs. distributed digital
 coding, redundancy, etc. -- bearing in mind that in the case of the

If I may separate the political issues from the technical.

Political:

There is a fear that a decentralised system will result in no overall
responsibility for archive continuity. But, equally, a centralised
body can decide that a system is no longer useful or is too expensive
to be free - what happens if XXX goes pay-per-view? What rights do
mirrors have to store XXX if they are told to remove their archive?

Technical:

The fear is that there will be only one copy of a paper, stored in an
institution's department or library, and that if that archive is lost
the paper disappears into digital oblivion.

Data storage is very cheap - there is little difference between storing
1 or 100 copies. Oxford and Cambridge could farm all world physics
archives and store their contents. This is not currently done because
Open Archives include pay-per-view archives, where only the abstract
can be farmed - and hence there is no provision for farming of texts.
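
To illustrate why extra copies matter so much more than they cost, here
is a rough sketch under a strong assumption: each copy is lost
independently with the same probability. Real archives share software,
funding and formats, so their failures are correlated, and the loss
probability used below is purely hypothetical.

```python
def probability_all_copies_lost(p_single_loss: float, copies: int) -> float:
    """Chance that every one of `copies` independent copies is lost.

    Assumes independent, identically likely losses (a simplification).
    """
    if not 0.0 <= p_single_loss <= 1.0:
        raise ValueError("p_single_loss must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single_loss ** copies


# Hypothetical 10% chance of losing any single copy over some period.
for n in (1, 2, 3, 5):
    print(n, probability_all_copies_lost(0.1, n))
```

Under these assumptions each additional mirror multiplies the risk of
total loss by the single-copy loss probability, which is the technical
case for letting full texts, and not only abstracts, be copied.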

I may also point out that there are already archives that perform
distributed mirroring - math arXiv is primarily made up of papers that
have been archived elsewhere (judging by the lack of associated
metadata and updates).

Tim Brody
Computer Science, University of Southampton
email: tdb...@soton.ac.uk
Web: http://www.ecs.soton.ac.uk/~tdb198/


Re: Central vs. Distributed Archives

2000-11-09 Thread Stevan Harnad
On Thu, 9 Nov 2000, David Goodman wrote:

 Steve, I think you misunderstand Greg's concern (and mine) We do not
 disagree with what you want to do; we want to add to it. We are
 assuming, I think, that something similar to the plan you advocate will
 be the basic process.

 I do not think it enough to say distributed = secure. It's only the first
 step to security. In addition to being distributed, there also needs to
 be a reliable caretaker--not just to do the housekeeping, but to ensure
 that the archive is kept compatible with changing technology.

I agree completely.

I didn't say distributed = secure (there's a lot more to security than
that). I said being freely accessible now, in distributed institutional
Eprint archives is a powerful new way to complement being freely
accessible in centralized Eprint archives, which are still growing much
too slowly. It should not be delayed for one moment by security
concerns, not one moment.

 I suggested that the archives be organized redundantly both by
 discipline and by university (and possibly by geographic/political
 entity, as well as what anyone wants to do).

Again, complete agreement.

 There are undoubtedly well-organized academic departments that can do
 this. There are also academic departments that cannot be relied on to
 do this right, because of size, interest, or finances. The same goes
 for professional societies. Certainly no individual can be relied on:
 all humans are mortal. All of this goes as well for refereed as for
 unrefereed, preprint as for reprint, officially published as for
 unpublished.

Agreed, and digital librarians are clearly the pertinent experts.

 As a librarian, I do not assume it is good enough that our refereed
 papers are already, as they are, safely in the hands of journals and
 libraries, ...

Yes, but let us not again mix up agendas. There could have been --
independent of any movement to free the refereed literature online -- a
movement to increase the security of the on-paper corpus (both papers
and books) on-line.

That's fine, desirable, but unrelated to this Forum's agenda, which is
to FREE the refereed corpus online. Concerns about strengthening the
paper literature's current security should not be wrapped into the
freeing (now!) initiative for the refereed literature; nor should
freeing (now!) be made in any way conditional on first meeting a priori
security concerns. Although it is an oversimplification, it is best to
treat the freeing initiative as a pure freebie, a windfall, over and
above what we have already. We are talking about archiving, not
publishing, an extra version of what is already published (on-paper).

This face-valid, immediate goal should be kept as distinct from
preservation concerns as it should be kept from peer-review-reform
concerns (likewise worthy, but orthogonal, and indeed even at
cross-purposes if yoked in any way to the freeing initiative).

 There are very few library copies of many journals, and though there is
 excellent backup from national libraries, even their collections are
 incomplete. The literature published up to now will be much more secure
 when it too has been digitized and placed on free publicly available
 mirrored servers, with all the additional precautions. Besides
 security, this will also make them generally available with all the
 additional advantages of plans such as yours.

David, the securing issue is a separate one from the freeing! The
material on the shelves now is not free; nor is it, let us agree, as
secure as it might be. Increasing its security by distributed digital
back-up is one thing (and need not be freely accessible either);
freeing it online is quite another.

Please, please keep these two separate or you will only encourage more
Zeno's Paralysis!


Stevan Harnad                     har...@cogsci.soton.ac.uk
Professor of Cognitive Science    har...@princeton.edu
Department of Electronics and     phone: +44 23-80 592-582
Computer Science                  fax:   +44 23-80 592-865
University of Southampton         http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM



Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 11:16:11AM +, Stevan Harnad wrote:
 Nay! Release them from their hostagehood behind obsolete,
 impact-blocking, and completely surmountable access barriers online
 today through self-archiving, addict fellow-researchers the world over
 to that new, free form of access to it all, and the redundancies and
 mirrors will come tomorrow, in plenty of time to keep the freed corpus
 aloft in the skies.

Entirely aside from whether your proposals are the best ones, you have
previously described them as being nothing other than the Ginsparg
model.  Well I think of myself as devoted to the Ginsparg model,
but my interpretation of it is significantly different from the one
that you give here.  In 1997 my thinking was much more like yours,
but three years of direct experience with the arXiv has changed it.
My creed is, build a large, integrated, immortal archive now, and the
e-prints will come tomorrow.  I won't insist that this approach is right
for your discipline, because maybe you know your own community better
than I do.  But I do feel strongly that it is right for my discipline.
And I can't speak for Paul Ginsparg either, but I would be surprised
if he contradicted me outright, since he has influenced my thinking a
great deal through direct correspondence.

In general your liberation terminology doesn't sit so well with me.  I do
hint at liberation terminology from time to time; in fact the name of my
front end, Front for the Mathematics arXiv, is a deliberate allusion.
If the math arXiv is revolutionary, I would liken it to the American
revolution.  We are building a new system on new territory and letting
immigrants come.  I see a lot of Alexander Hamilton in our approach, and
somewhat less of Thomas Jefferson.  Your comments have some character
of Jefferson, but very little of Hamilton, and often they sound almost
Marxist.  I might compare your overall vision to the Communards of Paris.
But hey, you could be right in your own society.

You have also correctly picked up that I don't accept the dichotomy
between preprints and postprints.  My view is that the preprint
and the postprint are Tweedledum and Tweedledee.  But that is a topic
for another posting.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 05:58:14PM +, Tim Brody wrote:
 I may also point out that there are already archives that perform
 distributed mirroring - math arXiv is primarily made up of papers that
 have been archived elsewhere (judging by the lack of associated meta
 data and updates).

I don't understand this comment.  Most of the papers in the math arXiv
are eventually published, and many are in preprint series of one sort
or another.  However I conjecture that at least half of the submissions
in the most recent three months are not on any other web site, not
even on a home page.  And for those that are not published or not yet
published, the arXiv is the only project that explicitly promises to
keep them permanently.
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *


Re: Central vs. Distributed Archives

2000-11-09 Thread Greg Kuperberg
On Thu, Nov 09, 2000 at 07:16:47PM +, Stevan Harnad wrote:
 I don't think sublinear or linear growth is right for
 your discipline (maths) either...

Of course more growth is better than less.  Several of us (both the arXiv
staff led by Paul Ginsparg and the math advisory committee chaired by Dave
Morrison, on which I serve) have worked hard to accelerate the growth of
the math arXiv.  I can report a partial victory.  The archives that we
glued together were at best growing linearly with a low slope and were
showing some signs of sublinearity.  After we put them together there was
a discontinuous increase in new submissions, and linear growth commenced
with a higher slope.  I don't have a chart but the numbers are there at

http://front.math.ucdavis.edu/math

After we had changed so much, I was surprised that growth was still
linear.  (Paul Ginsparg wasn't surprised.)  I now believe that linear
growth in e-prints is inherent.  But both the discontinuity and the
one-time change in slope were heartening.  That is a realistic goal when
you change the system.
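
The pattern described above (linear growth, a one-time jump, then linear
growth with a steeper slope) can be sketched with a plain least-squares
slope fitted separately before and after the change. The monthly figures
below are invented for illustration and are not the numbers on the Front
page; the function name is likewise hypothetical.

```python
from typing import Sequence


def least_squares_slope(ys: Sequence[float]) -> float:
    """Slope of the best-fit line through (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical new submissions per month.
before_merge = [50, 52, 54, 56]
after_merge = [90, 95, 100, 105]

slope_before = least_squares_slope(before_merge)
slope_after = least_squares_slope(after_merge)
jump = after_merge[0] - before_merge[-1]

print(slope_before, slope_after, jump)
```

Both segments are still straight lines; what changes is the level (the
jump) and the slope, which matches a one-time improvement and not a
shift to superlinear growth.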
--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *