Re: OA advantage = EA + AA + QB + OA + UA
On 20-Oct-04, at 10:28 AM, Stevan Harnad wrote:

> I think the interpretation of this is fairly clear: Once there is 100% OA, research is used far more, and although the overall number of references per article may not increase, their *selectivity* does, because authors can cite what is most important and relevant, rather than just what their institutions happen to be able to afford to access (as is the case before there is 100% OA, which is the prevailing condition in all fields other than astro currently).

As Stevan points out, the citation advantage for OA as opposed to non-OA materials is only demonstrable in traditional numeric terms for fields that are only partially OA. Once 100% OA is achieved, then all articles and authors have an equal research impact advantage.

Here is my perspective, as someone who is no expert on bibliometrics:

There are other obvious impact advantages of OA in a 100% OA field, which may need different types of measurements. One example is the increased quality of research that can proceed once researchers have ready access to all the scholarly knowledge in their fields.

Another advantage that might not be picked up by traditional measurements is increased citations in journals that are not covered by western-based citation indexes. That is, researchers in developing countries will have access, and are likely citing articles, but citations in publications based in their home countries are not covered by current indexes.

Another clue to increasing impact in terms of usage is the increase in downloads or readership that Stevan refers to. This may be the beginnings of evidence of impact beyond academe, that is, usage by professionals, teachers, students, etc.

Thoughts?

Heather G. Morrison
Project Coordinator
BC Electronic Library Network
Phone: 604-268-7001
Fax: 604-291-3023
Email: heath...@eln.bc.ca
Web: http://www.eln.bc.ca
Re: Do Open-Access Articles Have a Greater Research Impact?
Below is the latest evidence that the Open Access Impact Advantage is unique neither to the Physical Sciences and Mathematics:
http://citebase.eprints.org/isi_study/
nor to the Biological Sciences:
http://www.crsc.uqam.ca/lab/chawki/OA_NOA_biologie.gif

The impact advantage is there in the Social Sciences too:
http://www.crsc.uqam.ca/lab/chawki/sociologie.htm

The explanation for http://www.crsc.uqam.ca/lab/chawki/sociologie.htm is so far only in French (it will be translated shortly), but the English explanation for http://citebase.eprints.org/isi_study/ applies to the Social Science data too.

Note that one significant difference between the Physical Sciences and the Social Sciences is that the rate of self-archiving is not yet increasing in the Social Sciences (the correlation between number of OA articles and year is positive for Physics/Mathematics, negative for Sociology/Anthropology).

The OA impact effect is always positive except in the most recent year (2003), probably because the ISI citation counts are not yet up to date for 2003.

Chawki Hajjem
Doctoral Candidate
Informatique cognitive
Centre de neuroscience de la cognition (CNC)
Université du Québec à Montréal
Montréal, Québec, Canada H3C 3P8
tel: 1-514-987-3000 2297#
fax: 1-514-987-8952
Re: OA advantage = EA + AA + QB + OA + UA
Prior AmSci Topic Thread:
"OA advantage = EA + AA + QB + OA + UA"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/3978.html

A forthcoming article by Michael Kurtz (Harvard-Smithsonian Center for Astrophysics) and co-workers reports that in astrophysics -- which (with its small, closed circle of journals and with all active astrophysicists worldwide being at institutions that can afford toll-access to all of them) has had de-facto 100% OA for several years now -- the total number of citations (hence the average number per article) has not risen; in fact it may even have diminished a little. There is instead a threefold increase in usage (readership, downloads).

Kurtz et al. (2004) "The Effect of Use and Access on Citations" Information Processing and Management (submitted)
http://cfa-www.harvard.edu/~kurtz/IPM-abstract.html

I think the interpretation of this is fairly clear: Once there is 100% OA, research is used far more, and although the overall number of references per article may not increase, their *selectivity* does, because authors can cite what is most important and relevant, rather than just what their institutions happen to be able to afford to access (as is the case before there is 100% OA, which is the prevailing condition in all fields other than astro currently).

One certainly cannot take the absence of an overall increase of citations in a field that already has 100% OA, as evidence against the need for 100% OA in other fields, where OA is far less than 100%!

Michael's interesting finding is probably unique to astro, which was even 100% OA before the online era (i.e., 100% of astrophysicists were at institutions that could afford 100% of the astro journals in paper), but his pattern of findings has suggested that there are several components contributing to the OA Advantage:

(1) What Michael calls the "EA" or "Early Access" advantage: Papers that are self-archived as preprints, even in astro, get more citations than those that are not.
If I understand Michael's data correctly, however, the EA is in fact a permanent increment in a paper's total cumulative citation count and not just a phase shift that reaches its peak earlier, without increasing the cumulative total of citations. This is probably because of a paper's autocatalytic usage/citation/usage/citation cycle, which Tim Brody has also detected, and is illustrated in Tim's forthcoming usage/citation correlation paper:

Brody, T. and Harnad, S. (2004) Using Web Statistics as a Predictor of Citation Impact.
http://www.ecs.soton.ac.uk/~harnad/Temp/timcorr.doc

(2) The "AA" or "Arxiv advantage," which applies to both preprints and postprints: Even though they are all already 100% OA through institutional subscriptions/licenses, papers that are also self-archived in ArXiv get more citations. (In fields with distributed institutional self-archiving, AA would of course not be an ArXiv effect but an OAIster effect.) This advantage would no doubt vanish if toll-access and open-access were fully integrated, but it is interesting that it is present, even in a 100% OA field.

http://www.arxiv.org/
http://oaister.umdl.umich.edu/o/oaister/

(3) The Quality Bias, "QB," which is the fact that the higher-quality, higher-impact authors tend to self-archive more overall, and that it is particularly their higher-quality (hence higher-impact) papers that authors tend selectively to self-archive more. This self-selection bias is definitely one of the factors underlying the positive correlation between OA and citation counts, but it is certainly not the only factor. It will be interesting to estimate the size of QB, relative to the other 3 factors, especially as OA grows from 0% to 100%. (The QB component obviously has to shrink as the proportion of self-archiving authors grows, since QB is based on self-selective differential self-archiving of only the higher-quality work.)
(4) The true OA Advantage, OAA, which is probably by far the strongest in fields that are nearer to 0% OA than to 100% OA because OAA is a *relative* advantage (and a *competitive* one): In a non-OA field (unlike astro, which is 100% OA), *all* factors give the advantage to the self-archived article over the non-self-archived one (e.g., even postprints have the "Early Advantage").

So even if the pure OAA is destined to shrink to zero once 100% OA is reached, it is a *huge* advantage today, when OA is far from 100%. It means that authors have a great deal of competitive incentive to make their own articles OA now, before their competitors do. In other words, it's really a Prisoner's Dilemma, hence a horse race, once the odds and the causality are clearly understood!

That is why we are so busily generating the OA advantage data across all disciplines in our collaborative ISI study in Southampton, Quebec and Oldenburg:

http://citebase.eprints.org/isi_study/
http://www.crsc.uqam.ca/lab/chawki/sociologie.htm
http://www.crsc.u
Re: A Search Engine for Searching Across Distributed Eprint Archives
On Wed, 20 Oct 2004, Donat Agosti wrote:

> Something, which bothers me and doesn't show up in most of the
> discussion of open access, is the construction of search tools across
> digital publications (and potentially millions of pages of legacy
> information). In the end, this will be the real issue, not just reading
> another publication face to face.

The real issue -- and the 1st, 2nd, 3rd and Nth priority today -- is Open Access (OA) *content*: The full-texts of the 2.5 million annual articles published in the world's 24,000 peer-reviewed journals are still not openly accessible online (only about 20% of them are). It is merely distraction and dreaming to worry about search tools when the OA content is not yet there for them to search!

Having said that, cross-archive search tools (for the little OA content we have so far) already *do* exist (and they are already far more powerful than their sparse content yet deserves!):

http://oaister.umdl.umich.edu/o/oaister/
http://citebase.eprints.org/
http://www.scirus.com/srsapp/

And (I promise you), providing more OA content is guaranteed to inspire the creation of more and more such tools, with more and more powerful capacities. So please, don't worry about more powerful search tools when the cupboards are still bare: Fill the cupboards and the search tools will come, hungrily!

> What do you think about that? It seems, that the big publishing houses
> are already thinking about that, and that they developed such facilities.

The big publishing houses' cupboards are *not* bare: They have the 100% Toll Access content on which to provide ever more powerful search tools. Let's provide 100% Open Access content and then watch what happens!

> This of course is one of the most important tools, for data
> mining, extraction, or just finding the right piece of information. It
> also means, that we look beyond self-archived pdf documents to searchable
> documents with some mark up of their logic content included. Any ideas?
Two ideas:

(1) Provide the full-text Open Access content, and the tools for finding, mining and extracting from it will come with the territory.

(2) The primary target is journal articles, which consist primarily of text. The most powerful means of text-processing today is full-text inversion. (This is part of the magic that Google does.) Enhance this with citation-linking (in place of Google's ordinary linking), plus some hub/authority analysis, citation and download ranking, co-citation analysis, co-text (semantic/similarity) analysis, and full-text boolean search, and I think you will have search capabilities to surpass your wildest dreams.

The only missing element is the content. Please let's not forget that, and lapse into Oneirology instead of Open Access Provision!

Stevan Harnad
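For readers unfamiliar with the "full-text inversion" and "full-text boolean search" mentioned in the reply above, here is a minimal Python sketch of the idea. It is only an illustration under stated assumptions: the three-document corpus, the document ids and all function names are hypothetical, and nothing here describes how OAIster, Citebase, Scirus or Google are actually implemented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each token to the set of document ids whose full text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_all(index, query):
    """Boolean AND: ids of documents containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


def search_any(index, query):
    """Boolean OR: ids of documents containing at least one query token."""
    result = set()
    for token in tokenize(query):
        result |= index.get(token, set())
    return result


# Hypothetical toy corpus standing in for harvested full texts.
DOCS = {
    "a1": "Open access increases citation impact",
    "a2": "Citation linking across distributed eprint archives",
    "a3": "Open archives metadata harvesting",
}
INDEX = build_inverted_index(DOCS)
```

With this toy corpus, `search_all(INDEX, "open access")` matches only the first document, while `search_any(INDEX, "open citation")` matches all three. Citation-linking, hub/authority analysis and the other rankings mentioned above would be further layers on top of such an index; they are not sketched here.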
Re: A Search Engine for Searching Across Distributed Eprint Archives
Dear Stevan

Attached is a short report that appeared in today's Science section of the Neue Zuercher Zeitung:
http://www.nzz.ch/2004/10/20/ft/page-article9XKLV.html
about
http://www.oai.unizh.ch/symposium/program.html

I am sorry I couldn't make it. There was a second meeting in Bern on Biodiversity Issues, which in fact has a lot to do with the open access initiative. That meeting, though, was organized by life science, and not medical science, two branches of the Swiss Academy of Sciences.

Something, which bothers me and doesn't show up in most of the discussion of open access, is the construction of search tools across digital publications (and potentially millions of pages of legacy information). In the end, this will be the real issue, not just reading another publication face to face.

What do you think about that? It seems, that the big publishing houses are already thinking about that, and that they developed such facilities. This of course is one of the most important tools, for data mining, extraction, or just finding the right piece of information. It also means, that we look beyond self-archived pdf documents to searchable documents with some mark up of their logic content included. Any ideas?

All the best, and thanks for all your efforts re open access

Donat

Dr. Donat Agosti
Research Associate, American Museum of Natural History and Smithsonian Institution
Email: ago...@amnh.org
Web: http://anbase.org
CV: http://research.amnh.org/entomology/social_insects/agosticv_2003.html
Dalmaziquai 45
3005 Bern
Switzerland
+41-31-351 7152
Re: Eprints, Dspace, or Espace?
On Wed, 20 Oct 2004, Philip Hunter wrote:

> The focus of each of the OAI-compliant archive-creating softwares is
> different, as you acknowledge, since some are designed to archive digital
> objects in general, not just eprints. The functionality of the different
> softwares differs on this account, and therefore there is a choice to
> be made between softwares.

There is indeed. But Philip seems to have missed the point: This is an Open Access Forum, not an "Institutional Digital Asset Management Forum." Institutional Digital Asset Management is indeed an important and worthy issue. So are Research Funding, Public Health and World Hunger. But those are not what the Open Access Initiative is about!

The Open Access Initiative is about providing toll-free, online, full-text access to the 2.5 million articles that appear annually in the world's 24,000 peer-reviewed journals in order to make them accessible to all their would-be users worldwide -- irrespective of whether their institutions can afford to subscribe to the journal in which each article appears -- and thereby maximising the research impact of each article, its author, its author's institution, and its author's research funder. It is not about Institutional Digital Asset Management.

Budapest Open Access Initiative
http://www.soros.org/openaccess/read.shtml

"The literature that should be freely accessible online is that which scholars give to the world without expectation of payment. Primarily, this category encompasses their peer-reviewed journal articles...

"An old tradition and a new technology have converged to make possible an unprecedented public good. The old tradition is the willingness of scientists and scholars to publish the fruits of their research in scholarly journals without payment, for the sake of inquiry and knowledge. The new technology is the internet.
The public good they make possible is the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it by all scientists, scholars, teachers, students, and other curious minds. Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be..."

My reply to the student's inquiry about which OAI archive-creating software to use was based entirely on the fact that the inquiry was addressed to me (and in the context of the American Scientist Open Access Forum). I am not, and never have been, a spokesman for Institutional Digital Asset Management (though I of course have nothing against that project, only the highest admiration for it).

Nor was the GNU Eprints OAI-archive-creating software -- the first and most widely used of the OAI archive-creating softwares -- written for the sake of institutional digital asset management (although it can certainly be used for that purpose too). It was written for the sake of institutional Open Access self-archiving. And it was with respect to that objective that I told the student that all the softwares he listed were equivalent, and that what really mattered was the institution's adopting an effective policy for the self-archiving of all of its authors' journal articles, so as to provide Open Access to them.

http://www.arl.org/sparc/pubs/enews/aug01.html#6

I would add only -- though it is but a hypothesis -- that an institutional self-archiving policy that successfully generates Open Access to 100% of institutional journal article output is probably the single most important step an institution can take toward an eventual successful Institutional Digital Asset Management policy too, but I make no strong claims about this, as it is not my area of expertise, experience or interest.
http://software.eprints.org/handbook/departments.php

So, to repeat, although any of the OAI archive-creating softwares can indeed also be used for Institutional Digital Asset Management, it is not their functional equivalence with respect to that application on which I was commenting, particularly, but their functional equivalence with respect to institutional Open Access content-provision, which is the theme of this Forum, and the goal of the Open Access Initiative.

> All deposited papers have the same metadata tags? Your definition of an
> eprint is not up to speed. The Open Archives site FAQ reminds us that
> "the metadata harvesting protocol supports the notion of multiple
> metadata sets, allowing communities to expose metadata in formats that
> are specific to their applications and domains. The technical framework
> places no limitations on the nature of such parallel sets, other than
> that the metadata records be structured as XML data, which have a
> corresponding XML schema for validation."
>
> http://www.openarchives.org/documents/FAQ.html

The Open *Archives* In
Re: Eprints, Dspace, or Espace?
Stevan, you wrote:

> All the main OAI-compliant archive-creating softwares are functionally
> equivalent, because after all, what they do is quite simple: They make
> sure that all deposited papers have the same metadata tags, the obvious
> ones: author-name, article-title, date, journal-name, etc., so that they
> are interoperable as well as harvestable by OAI service providers:

The focus of each of the OAI-compliant archive-creating softwares is different, as you acknowledge, since some are designed to archive digital objects in general, not just eprints. The functionality of the different softwares differs on this account, and therefore there is a choice to be made between softwares.

All deposited papers have the same metadata tags? Your definition of an eprint is not up to speed. The Open Archives site FAQ reminds us that "the metadata harvesting protocol supports the notion of multiple metadata sets, allowing communities to expose metadata in formats that are specific to their applications and domains. The technical framework places no limitations on the nature of such parallel sets, other than that the metadata records be structured as XML data, which have a corresponding XML schema for validation."

http://www.openarchives.org/documents/FAQ.html

> With DSpace (and SPARC) grew the "institutional repository" movement, and
> many more archive softwares, most of which have only loose ties with the
> OA movement, and are really intended for the showcasing and management
> of all of a university's digital holdings, not only, or especially,
> research journal articles and OA. As a consequence, "institutional
> repositories" (IRs) are (slowly) filling today with all kinds of material,
> very little of it being OA articles! And IRs tend to be focused more on
> the preservation and curation of university digital holdings than on
> providing immediate OA to all university research output so as to maximise its
> research impact, which is what OA is for.
Well, perhaps the range of available softwares reflects what the user community actually wants. Always a valid point to consider. :-)

Philip

Philip Hunter, UKOLN Research Officer.
UKOLN, University of Bath, Bath, BA2 7AY
Tel: +44 (0) 1225 323 668
Fax: +44 (0) 1225 826838
Email: p.j.hun...@ukoln.ac.uk
UKOLN: http://www.ukoln.ac.uk/
http://www.rdn.ac.uk/projects/eprints-uk/
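To make the exchange above about "the same metadata tags" versus "multiple metadata sets" concrete, here is a small Python sketch of what an OAI harvester does with one record. It assumes the OAI-PMH 2.0 conventions as I understand them (the `ListRecords` verb, the `metadataPrefix` argument, and the `oai_dc` unqualified Dublin Core format). The base URL, the sample record and the function names are hypothetical, and the code parses a canned response rather than contacting any real archive.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH 2.0 responses carrying oai_dc metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request; metadata_prefix selects the metadata set."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_records(xml_text):
    """Return each record's identifier and its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is not None:
            for child in dc:
                name = child.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
                fields.setdefault(name, []).append((child.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


# Hypothetical response from a hypothetical archive, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Eprint</dc:title>
          <dc:creator>Author, A.</dc:creator>
          <dc:date>2004-10-20</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

RECORDS = extract_records(SAMPLE_RESPONSE)
```

Under these assumptions, an archive exposing an additional, community-specific metadata set would be harvested by passing a different `metadata_prefix` to `list_records_url`, while a shared baseline format is what lets a service provider treat records from different archive softwares uniformly.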