Re: What percentage of preprints is never accepted for publication?
We continue on the interesting (but alas evidence-poor) question of discipline differences in the preprint/postprint difference (DIFF), and in particular the question of what percentage of submitted papers never gets published by any journal in any form.

On Wed, 6 Dec 2000, George Lundberg wrote:

> large numbers of papers submitted to biomedical journals are of
> insufficient quality to appear (either at all or in the form in which
> they were originally submitted/rejected) in any good journal

Helene was asking about the percentage in the "not at all" category, rather than the revise-and-resubmit category, although both would be of interest (if only anyone had actual data!).

> i am not at all sure that Stephen Lock's frequently quoted 1984 number
> bears any relation to current experiences

Lock reported that in biomedical research just about everything eventually appears somewhere, in some form. So in the end the function of peer review is to determine where (and, equally important, in what form, with what content) a paper should appear: Peer review is not a passive red-light/green-light filter; it is a dynamic, interactive, iterative, corrective filter that actively changes the contents and form of preprints.

    Lock, Stephen. A difficult balance: editorial peer review in
    medicine. Philadelphia: ISI Press, 1986.

So, as a dynamic quality-shaper and certifier, peer review sign-posts the level of quality of a paper at the locus where it eventually appears -- a hierarchy of journals, from those with the highest quality, rigour of refereeing, rejection rate, and impact factor at the top, grading all the way down to journals so unrigorously reviewed as to be little more than a vanity press. (I am describing the standard lore here: I do not have data either.)
The function of this sign-posted hierarchy is to guide the reader and the user, who have finite reading time and research resources, and need to make sure they are reading reliable work, worth taking the risk of building upon and worth citing. Researchers can pick their own level, depending on their time, resources, and the aspired quality level of their own work. They can decide for themselves how low in the hierarchy they wish to go.

> At JAMA for my 17 years we rejected roughly 85% of all articles
> received. Many did appear in other journals, but a huge number seemed
> to simply disappear. We believed that was a good thing. i do not know
> of any recent study that hangs credible numbers on those observations.

Nor do I know of recent studies on this. (Does anyone?) But note that, apart from JAMA's 85% rejection rate (which attests to its being one of the journals at the top of the clinical-medical hierarchy, along with NEJM, Lancet, and BMJ), George is not in a position to provide objective data on what proportion of JAMA's rejected papers never went on to appear anywhere, in any form. That would require a systematic follow-up study (taking into account, among other things, title changes, and possibly stretching across several years after the original rejection). It would be splendid if someone gathered (or already had) such data.

I think we can all agree that in clinical medicine, where erroneous reports can be hazardous to human health, it would be a good thing if they never appeared anywhere, in any form. But in the online age especially (what with child porn and hate literature proving so difficult to suppress), this problem is well beyond the powers of journals and journal editors.

    Harnad, S. (2000) Ingelfinger Over-Ruled: The Role of the Web in
    the Future of Refereed Medical Journal Publishing.
    Lancet (in press)
    http://www.ecs.soton.ac.uk/~harnad/Papers/Harnad/harnad00.lancet.htm

In the vast majority of research that has no bearing on human health and welfare, however, it is not clear how strongly we should believe that it would be a good thing if a huge number of preprints rejected at one level of the hierarchy simply disappeared rather than moved downward till they found their own level (including, at the very bottom, permanent unrefereed status in the preprint sector of the eprint corpus -- the eprint archives' vanity press). Who is to say what would be a good thing here for research, across disciplines, a priori?

This is the problem of the wheat/chaff ratio that inevitably dogs every area of human endeavour: We would like to have only the cream, and not the milk, but alas not only does human performance invariably take the shape of a bell curve, but there is no known way of ensuring that one can filter out the top 15% of that curve without letting it all flow. (Not to mention that peer review, being human too, often misfilters, mistaking [to mix metaphors] wheat for chaff and vice versa. The only protection against this is time, and a retrospective record, for possible second thoughts about a piece of work.)

    Harnad, S. (1986) Policing the Paper Chase. (Review of S. Lock, A difficult
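The wheat/chaff point can be illustrated with a toy simulation (all numbers hypothetical, not drawn from any real refereeing data): if referees see only a noisy version of a paper's true quality, then selecting the "top 15%" by judged quality inevitably both admits some chaff and rejects some wheat.

```python
import random

random.seed(0)

N = 100_000          # hypothetical number of submitted papers
ACCEPT_RATE = 0.15   # "top 15%" of the bell curve
NOISE = 0.5          # hypothetical referee error (std. dev., quality units)

# True quality is bell-curved; referees see a noisy version of it.
true_quality = [random.gauss(0, 1) for _ in range(N)]
judged = [q + random.gauss(0, NOISE) for q in true_quality]

# Accept the top 15% by *judged* quality.
cutoff = sorted(judged, reverse=True)[int(N * ACCEPT_RATE)]
accepted = [i for i in range(N) if judged[i] >= cutoff]

# Compare with the *true* top 15%.
true_cutoff = sorted(true_quality, reverse=True)[int(N * ACCEPT_RATE)]
truly_top = {i for i in range(N) if true_quality[i] >= true_cutoff}

# Fraction of accepted papers that really belong to the top 15%:
# noise guarantees this is strictly less than 1 (wheat lost, chaff kept).
wheat_kept = sum(1 for i in accepted if i in truly_top) / len(accepted)
print(f"fraction of accepted papers that are truly top-15%: {wheat_kept:.2f}")
```

The point of the sketch is only the qualitative one made above: with any nonzero referee noise, the accepted set and the true top 15% cannot coincide.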
Re: Number of pre-prints relative to journal literature?
> I am curious about the statistics concerning the number of e-prints
> relative to the journal literature. Harnad repeatedly mentions that
> the LANL archive contains 40% of the journal literature at present and
> compares it to 20% for the math archive.

Here's one way to estimate it for the physics arXiv: the percentage of current citations, by papers within the arXiv, to papers not within the arXiv (courtesy of Les Carr, Zhuoan Jiao, Tim Brody and Ian Hickmen):

http://www.ecs.soton.ac.uk/~harnad/Tp/Tim/sld003.htm

There are other ways to estimate it too. See:

http://opcit.eprints.org/ijh198/
http://opcit.eprints.org/tdb198/opcit/

For an estimate of what percentage of the current maths literature is in the maths arXiv, I will let Greg Kuperberg reply.

Stevan Harnad                      har...@cogsci.soton.ac.uk
Professor of Cognitive Science     har...@princeton.edu
Department of Electronics and      phone: +44 23-80 592-582
Computer Science                   fax:   +44 23-80 592-865
University of Southampton          http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton             http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free access to the refereed journal literature online is available at the American Scientist September Forum (98 99 00):

http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above. Discussion can be posted to: american-scientist-open-access-fo...@amsci.org
Re: The preprint is the postprint
On Wed, 6 Dec 2000, Greg Kuperberg wrote:

> On Wed, Dec 06, 2000 at 08:42:55PM +, Stevan Harnad wrote:
>
> sh> The analogy with food quality control (let us say, mushrooms),
> sh> is that the inspectors decline to certify a grower's mushrooms
> sh> (preprints) as fit for human consumption until the grower does
> sh> whatever is required to produce mushrooms to that standard
> sh> (postprints).
>
> You still don't rename them. It's not as if they are toadstools before
> certification and mushrooms after.

You are missing the point: It is unfit for human consumption before (preprints) and fit for human consumption after (postprints). The paper's name (title) does not change any more than the mushroom's does. But if the quality control has been substantive, it is NOT THE SAME PAPER ANY MORE, as it has been substantively revised. By the same token, the mushroom-grower is not coming back with the SAME MUSHROOMS that were certified unfit for consumption last week and having them certified as fit for consumption this week; something about the growing practises underlying this week's batch had to change in response to the feedback from the FDA, if they are now certifiably fit.

> And I see a substantive point behind this semantic one. A safety
> measure is not usually so inviolate that it makes sense to rename the
> object of scrutiny. There are people who divide society into people
> and criminals. Surely you would agree that that is belligerent
> terminology.

And irrelevant to the issue at hand, which concerns certification as fit for peer consumption -- or, in keeping with the agricultural analogy and the journal quality hierarchy, an egg analogy this time: fit for use as Grade A, for those who wish to restrict their baking to Grade A eggs.

> I already gave what I consider evidence, although I wouldn't expect it
> to sweep away deep skepticism.

I am afraid all you gave was anecdote and opinion.
What we need to see is the objective data (as Les Carr pointed out) on the size of the preprint/postprint DIFF and all of the other quantitative generalizations you (and I) were making. I, however, have the advantage of being in the default position: The null hypothesis is that the current quality of the peer-reviewed literature (and hence the size of the preprint/postprint DIFF) is causally related to the fact that it is indeed peer reviewed. The burden of evidence is on those who believe there is no preprint/postprint DIFF, or that peer review is not the causal basis of current quality levels.

sh> why [if DIFF = 0, do] mathematicians keep
sh> submitting the vast majority of their work to the journals for
sh> refereeing and certification anyway, for all the world EXACTLY like
sh> all the other disciplines?

> In my case, to get promoted. My own department is qualified to judge
> letters of recommendation, which are an outgrowth of informal peer
> review of my papers. But the higher administration is not. The
> administration has taken ritualized peer review as a standard, even
> though the ritual has sometimes degenerated.

Nolo contendere.

gk> research in mathematics is...
gk> rigorous enough that self-appointed critics
gk> can quickly earn credibility.

sh> Will this sort of anecdotal phenomenon scale, even within
sh> mathematics let alone the rest of the disciplines?

> This is more than an incidental anecdote; this is the daily diet in my
> profession. If you don't believe me, you should take a survey of
> mathematicians to see if they have ever worried that someone might
> find a mistake, or a trivializing shortcut, when they give a talk.
> Maybe not all mathematicians are afraid of that, but if your survey
> wouldn't find many then I must be living on the wrong planet.
Survey in the works (for a preliminary peek, see below; please send suggestions to Cathy Hunt chh...@ecs.soton.ac.uk). We were planning to do it only with Physics arXiv and CogPrints users, but if you'd mediate, Greg, we'd be happy to survey math arXiv authors too:

http://www.ecs.soton.ac.uk/~chh398/arXiv.php3

But do you think other disciplines worry much less, a priori, about a mistake or slip-up? No one wants egg on their face. But that's not enough to guarantee they will keep their noses clean. (Quality control is a Quis Custodiet? problem.)

> One interesting consequence of the [permanence] policy is that you can
> search for all of the withdrawn papers, meaning those in which the
> latest version begs the reader not to read previous versions:
>
> http://front.math.ucdavis.edu/search/withdrawn
>
> One proposed name for this list is The Avenue of Broken Dreams.

sh> Do you consider this to be an incentive toward self-archiving,
sh> in general?

> In mathematics and hard science, absolutely. In other disciplines, I
> don't know, but it could have merit.

Again, all I can reply is that this sounds very unlikely to me. Comments from others would be welcome.

> There is some truth in [the police-in-the-neighborhood] analogy, since
> many people say
Re: UK RAE Evaluations
Following the recent postings to this list concerning RAE 2001, I thought that I would consult the RAE 2001 website (http://www.rae.ac.uk/).

1) Regarding the importance or not of the impact factors of journals, the following is stated (http://www.rae.ac.uk/ASP/GuideFAQ/ShowQ.asp?QID=10):

> Is there a hierarchical list which attributes weight to published
> research according to the place of publication?
>
> No. While panels may take into consideration the degree of peer-review
> an item of research output may have before publication, no panel may
> take the absence of peer-review as meaning a lack of quality within
> any given item of research output. Hierarchical lists of weightings
> are not used in the assessment process. Panel members form judgements
> on all the evidence presented in the round, with a full awareness of
> the wider context in which they are assessing output.

2) The types of 'output' which may be submitted in the RAE 2001 (clearly including online publications, which often do not have page numbers) are listed as: Authored book, Software, Composition, Edited book, Report for external body, Design, Chapter in book, Confidential report for external body, Exhibition, Journal article, Internet publication, Artefact, Conference contribution, Internet publication (via subscription only), Scholarly edition, Patent/published patent application, Performance, and 'Other form of assessable output'.

http://www.rae.ac.uk/Pubs/briefing/note4.htm

Dr Jamie Humphrey CChem MRSC
Managing Editor, Electronic Journals
Royal Society of Chemistry, Thomas Graham House, Milton Road,
Science Park, Cambridge, CB4 0WF, UK
Tel +44 (0)1223 432139, Fax +44 (0)1223 420247
E-mail humphr...@rsc.org
www.rsc.org and www.chemsoc.org
Re: The preprint is the postprint
On Thu, Dec 07, 2000 at 02:04:02PM +, Stevan Harnad wrote:

> > I already gave what I consider evidence, although I wouldn't expect
> > it to sweep away deep skepticism.
>
> I am afraid all you gave was anecdote and opinion.

The quantifiable evidence is that only a fraction of arXiv users, about 20% in math, ever add the journal reference to their own papers in the arXiv. Generally speaking authors would prefer the journal reference to be there, but the reason given is usually "Oh, I haven't gotten around to it." Evidently they have only a weak incentive to add this attribute for the reader's benefit. If the journal title were such a crucial stamp of quality it would be different.

This is consistent with my own perceived incentives as a research mathematician. I do systematically add the journal references to the arXiv, but that is because of my involvement in the project and my librarian tendencies. I have never seen it as a pressing concern. By contrast, when I write a new paper I can't wait to send it to the arXiv (so that everyone will see it) or submit it to a journal (to get credit from my university).

> But do you think other disciplines worry much less, a priori, about a
> mistake or slip-up?

I won't speak for other disciplines, but I do see some difference between a talk in pure math or string theory on the one hand and a talk in computational math or experimental physics on the other. Experimental papers are founded on data, while computational math talks are founded on simulations. The audience is not usually in a position to question the raw data; the most that a listener could do is find a mistake in the interpretation. But for most pure mathematics, all you have is the arguments presented (or at least outlined) in the talk. If you are trying to convince other experts of your results for the first time, that is really the moment of truth. Even very good mathematicians have seen their new results crumble to dust at that moment.
If you're careful you can avoid outright fallacy, but there is no conclusive way to determine whether your hard theorem has a 3-line proof.

I can also say that mathematics, unlike some disciplines, does not normally divide into factions that dismiss each other's theories as wrong. Very occasionally you see that in applied mathematics, but most people see it as something that shouldn't happen and not as the status quo. So if someone alleges a mistake in your work, you don't normally get any protection from your side. (On the other hand, there are factions of mathematicians who allege that each other's work is unimportant. But unimportant is very different from wrong.)

There is a corresponding difference in anonymous refereeing in math versus physics. In math many referees still systematically check the work under review. Physics is much closer to the standard of simply judging whether or not a paper is important, not whether or not it is correct. As a result, refereeing in mathematics takes longer than in physics. A referee sitting on a paper for a full year is almost unheard of in physics; in math it is quite common. I suspect that for the same reason the avenue of broken dreams, i.e. the withdrawn papers, is proportionately longer in the math arXiv than in the physics arXiv.

--
  /\  Greg Kuperberg (UC Davis)
 /  \
 \  / Visit the Math ArXiv Front at http://front.math.ucdavis.edu/
  \/  * All the math that's fit to e-print *
Re: Number of pre-prints relative to journal literature?
On Thu, 7 Dec 2000, Stevan Harnad wrote:

[ Origin of statistics about the coverage of scientific literature by XXX ]

> Here's one way to estimate it for the physics arXiv: the percentage of
> current citations, by papers within the arXiv, to papers not within
> the arXiv (courtesy of Les Carr, Zhuoan Jiao, Tim Brody and Ian
> Hickmen):
>
> http://www.ecs.soton.ac.uk/~harnad/Tp/Tim/sld003.htm

There are two questions:

1) What percentage of the _current_ output of literature is being arXived?
2) When looking for cited work, what percentage could I find in the arXiv?

1) For High Energy Physics (for which statistics covering all published work can be obtained from SPIRES), the percentage of papers arXived is almost 100%. I have no data to cover other areas, but it must be noted that most areas of XXX are seeing increased depositing, whereas HEP is almost static. I would hypothesise that this is because the other areas do not yet have a high percentage of all literature being archived.

2) For the whole of the archive this is around 30-40% (with the HEPs having a larger percentage), with the result that it will be another 10 years before all cited work has been archived (assuming the typical lifespan of a paper is 5-7 years). This length of time could be reduced by authors archiving existing literature.

All the best,
Tim Brody
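The second measure above (what fraction of cited work can be found in the archive) can be sketched in a few lines. This is a toy illustration only: the paper IDs and reference lists below are invented, and a real study would also have to handle title changes, preprint/journal identifier matching, and non-arXiv citation formats.

```python
# Citation-based coverage estimate (toy data): scan the reference lists
# of archived papers and count what fraction of the cited items can be
# resolved to an entry already in the archive.

# Hypothetical archive contents and per-paper reference lists.
archive = {"hep-th/0001", "hep-th/0002", "math/0003"}
references = {
    "hep-th/0001": ["hep-th/0002", "PhysRev.D.61.1", "math/0003"],
    "hep-th/0002": ["PhysLett.B.400.2", "hep-th/0001"],
    "math/0003":   ["Ann.Math.150.3"],
}

# Flatten all outgoing citations, then count in-archive resolutions.
cited = [ref for refs in references.values() for ref in refs]
in_archive = sum(1 for ref in cited if ref in archive)
coverage = in_archive / len(cited)

print(f"{in_archive}/{len(cited)} cited papers resolvable in-archive "
      f"({coverage:.0%})")
# → 3/6 cited papers resolvable in-archive (50%)
```

With real arXiv reference data in place of the toy dictionaries, this ratio is essentially the 30-40% figure Tim reports for the archive as a whole.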