On 2011-02-10, at 5:26 PM, Philip Davis wrote:

> "After reading the original post regarding our paper, and the 
> subsequent comments, I thought it would be appropriate to address 
> the issue that is generating some heat here, namely whether our 
> results can be extrapolated to the OA environment..."
> 
> read full response here:
> http://j.mp/hGSY6Z

MCCABE: …I thought it would be appropriate to address the issue that is 
generating some heat here, namely whether our results can be extrapolated to 
the OA environment…. (1) Selection bias and other empirical modeling errors 
are likely to have generated overinflated estimates of the benefits of online 
access (whether free or paid) on journal article citations in most if not all 
of the recent literature.

If "selection bias" refers to author bias toward selectively making their 
better (hence more citeable) articles OA, then this was controlled for in the 
comparison of self-selected vs. mandated OA, by Gargouri et al (uncited in the 
M & S article, but known to the authors -- indeed the first author requested, 
and received, the entire dataset for further analysis: we are all eager to hear 
the results).
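
To make the logic of that control concrete, here is a minimal sketch (mine, not 
Gargouri et al.'s actual analysis) of the comparison, assuming a hypothetical 
per-article table with columns journal, year, oa_status ("non_oa", 
"self_selected_oa", "mandated_oa") and citations:

    # Illustrative sketch only, with assumed column names; not the
    # published analysis. If self-selection explained the OA citation
    # advantage, the mandated group's advantage should collapse while
    # the self-selected group's stays high.
    import pandas as pd

    articles = pd.read_csv("articles.csv")  # hypothetical input file

    # Baseline: mean citations of non-OA articles in the same journal and year.
    baseline = (articles[articles.oa_status == "non_oa"]
                .groupby(["journal", "year"])["citations"]
                .mean()
                .rename("non_oa_mean"))

    articles = articles.join(baseline, on=["journal", "year"])
    articles["advantage"] = articles["citations"] / articles["non_oa_mean"]

    # Compare the within-journal/year citation ratio across the three groups.
    print(articles.groupby("oa_status")["advantage"].mean())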

If "selection bias" refers to the selection of the journals for analysis, I 
cannot speak for studies that compare OA journals with non-OA journals, since 
we only compare OA articles with non-OA articles within the same journal. And 
it is only a few studies like Evans and Reimer's, that compare citation rates 
for journals before and after they are made accessible online (or, in some 
cases, freely accessible online). Our principal interest is in the effects of 
immediate OA rather than delayed or embargoed OA (although the latter may be of 
interest to the publishing community).

MCCABE: 2. There are at least 2 “flavors” found in this literature: 1. 
papers that use cross-section type data or a single observation for each 
article (see for example, Lawrence (2001), Harnad and Brody (2004), Gargouri, 
et. al. (2010)) and 2. papers that use panel data or multiple observations over 
time for each article (e.g. Evans (2008), Evans and Reimer (2009)).

We cannot detect any mention or analysis of the Gargouri et al. paper in the M 
& S paper…

MCCABE: 3. In our paper we reproduce the results for both of these approaches 
and then, using panel data and a robust econometric specification (that 
accounts for selection bias, important secular trends in the data, etc.), we 
show that these results vanish.

We do not see our results cited or reproduced. Does "reproduced" mean 
"simulated according to an econometric model"? If so, that is regrettably too 
far from actual empirical findings to be anything but speculation about what 
would be found if one were actually to do the empirical studies.

MCCABE: 4. Yes, we “only” test online versus print, and not OA versus 
online for example, but the empirical flaws in the online versus print and the 
OA versus online literatures are fundamentally the same: the failure to 
properly account for selection bias. So, using the same technique in both cases 
should produce similar results.

Unfortunately this is not very convincing. Flaws there may well be in the 
methodology of studies comparing citation counts before and after the year in 
which a journal goes online. But these are not the flaws of studies comparing 
citation counts of articles that are and are not made OA within the same 
journal and year.

Nor is the vague attribution of "failure to properly account for selection 
bias" very convincing, particularly when the most recent study controlling for 
selection bias (by comparing self-selected OA with mandated OA) has not even 
been taken into consideration.
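
For concreteness, the selection-bias worry can be written out in standard 
potential-outcomes notation (my notation, not M & S's): the raw OA/non-OA 
citation gap mixes the genuine OA effect with a self-selection term, and the 
self-selected vs. mandated comparison is aimed precisely at that second term.

    % Illustrative decomposition; C_1 and C_0 are an article's citations
    % with and without OA, respectively.
    \[
    \underbrace{E[C \mid \text{OA}] - E[C \mid \text{non-OA}]}_{\text{observed OA advantage}}
      = \underbrace{E[C_1 - C_0 \mid \text{OA}]}_{\text{genuine OA effect}}
      + \underbrace{E[C_0 \mid \text{OA}] - E[C_0 \mid \text{non-OA}]}_{\text{self-selection term}}
    \]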

Conceptually, the question of whether online access increases citations over 
offline access is entirely different from the question of whether OA increases 
citations over non-OA because (as the authors note) the online/offline effect 
concerns *ease* of access: institutional users have either offline access or 
online access, and, according to M & S's results in economics, the increased 
ease of accessing articles online does not increase citations.

This could be true (although the growth, across those same years, of the 
tendency in economics to make prepublication preprints OA through author 
self-archiving [harvested by RePEc], much as physicists had started doing a 
decade earlier in arXiv, and computer scientists even earlier [later harvested 
by CiteSeerX], could be producing a huge background effect not taken into 
account at all in M & S's painstaking temporal analysis).

But any way one looks at it, comparing easy vs. hard access (online vs. 
offline) is hardly the same thing as comparing access with no access -- which 
is what we are comparing when we compare OA vs. non-OA for all those potential 
users at institutions that cannot afford subscriptions (whether offline or 
online) to the journal in which an article appears. The barrier, in other words 
(though one should hardly have to point this out to economists), is not an ease 
barrier but a price barrier: non-OA articles are not just harder to access for 
users at nonsubscribing institutions; they are *impossible* to access unless a 
price is paid.

(I certainly hope M & S will not reply with "let them use interlibrary loan 
(ILL)"! A study analogous to M & S's online/offline study, comparing citations 
for offline vs. online vs. ILL access in the click-through age, would not only 
strain belief if it too found no difference; it would also fail to address OA, 
since OA is about access once one has reached the limits of one's institution's 
subscription/license/pay-per-view budget. Hence it would again miss all the 
citations that an article would have gained had it been accessible to all its 
potential users and not just those whose institutions could afford access, by 
whatever means.)

It is ironic that M & S draw their conclusions about OA (predictably, as their 
interest is in modelling publication economics) in terms of the costs and 
benefits, for an author, of paying to publish in an OA journal, concluding that 
since they have shown it will not generate more citations, it is not worth the 
money.

But the most compelling findings on the OA citation advantage come from OA 
author self-archiving (of articles published in non-OA journals), not from OA 
journal publishing. Those are the studies that show the OA citation advantage, 
and the advantage does not cost the author a penny!

And the extra citations are almost certainly coming from users for whom access 
to the article would otherwise have been financially prohibitive. (Perhaps it's 
time for econometric modeling from the user's point of view too…)

I recommend that M & S look at the studies of Michael Kurtz in astrophysics. 
Those too were sophisticated long-term studies of the effect of the wholesale 
switch from offline to online, and Kurtz found that total citations were in 
fact slightly reduced, overall! But astrophysics, too, is a field in which OA 
self-archiving is widespread. Hence whether and when journals go online is 
moot, insofar as citations are concerned. (The likely hypothesis for the 
reduced citations -- compatible also with our own findings in Gargouri et al -- 
is that OA levels the playing field for users: OA articles are accessible to 
everyone, not just those whose institutions can afford toll access. As a 
result, users can *self-selectively* decide to cite only the best and most 
relevant articles of all, rather than having to make do with a selection among 
only the ones to which their institutions can afford toll access. A corollary of 
this [though probably also a spinoff of the Seglen/Pareto effect] is that the 
biggest beneficiaries of the OA citation advantage will be the best articles.)

MCCABE: 5. At least in the case of economics and business titles, it is not 
even possible to properly test for an independent OA effect by specifically 
looking at OA journals in these fields since there are almost no titles that 
*switched* from print/online to OA (I can think of only one such title in our 
sample that actually permitted backfiles to be placed in an OA repository). 
Indeed, almost all of the OA titles in econ/business have always been OA and so 
no statistically meaningful before and after comparisons can be performed.

The multiple conflation here is so flagrant that it is almost laughable. Online 
≠ OA and OA ≠ OA journal. 

First, the method of comparing the effect on citations before vs. after the 
offline/online *switch* will have to make do with its limitations. (We don't 
think it's of much use for studying OA effects at all.) The method of comparing 
the effect on citations of OA vs. non-OA within the same (economics/business, 
toll-access) journals can certainly proceed apace in those disciplines; the 
studies have been done, and the results are much the same as in other 
disciplines. 

M & S have our latest dataset: Perhaps they would care to test whether the 
economics/business subset of it is an exception to our finding that (a) there 
is a significant OA advantage in all disciplines, and (b) it's just as big when 
the OA is mandated as when it is self-selected.
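
The kind of check we mean could be as simple as the following sketch (column 
names are my assumptions about a hypothetical export of the dataset, not the 
actual file M & S received):

    # Hypothetical sketch: does the economics/business subset still show
    # (a) an OA citation advantage and (b) a comparable advantage for
    # mandated vs. self-selected OA? Not M & S's code or data format.
    import pandas as pd
    from scipy import stats

    data = pd.read_csv("gargouri_dataset.csv")  # assumed export
    econ = data[data.discipline.isin(["economics", "business"])]

    # (a) OA vs. non-OA citations in the subset
    oa = econ[econ.is_oa]["citations"]
    non_oa = econ[~econ.is_oa]["citations"]
    print(stats.mannwhitneyu(oa, non_oa, alternative="greater"))

    # (b) mandated vs. self-selected OA citations in the subset
    mandated = econ[econ.is_oa & econ.mandated]["citations"]
    self_selected = econ[econ.is_oa & ~econ.mandated]["citations"]
    print(stats.mannwhitneyu(mandated, self_selected, alternative="two-sided"))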

MCCABE: 6. One alternative, in the case of cross-section type data, is to 
construct field experiments in which articles are randomly assigned OA status 
(e.g. Davis (2008) employs this approach and reports no OA benefit).

And another one -- based on an incomparably larger N, across far more fields -- 
is the Gargouri et al study that M & S fail to mention in their article, and 
for which they have the full dataset in hand, as requested. 

MCCABE: 7. Another option is to examine articles before and after they were 
placed in OA repositories, so that the likely selection bias effects, important 
secular trends, etc. can be accounted for (or in economics jargon, 
“differenced out”). Evans and Reimer’s attempt to do this in their 2009 
paper but only meet part of the econometric challenge.

M & S are rather too wedded to their before/after method and thinking! The 
sensible time for authors to self-archive their papers is immediately upon 
acceptance for publication. That's before the published version has even 
appeared. Otherwise one is not studying OA but OA embargo effects. (But let me 
agree on one point: Unlike journal publication dates, OA self-archiving dates 
are not always known or taken into account; so there may be some drift there, 
depending on when the author self-archives. The solution is not to study the 
before/after watershed, but to focus on the articles that are self-archived 
immediately rather than later.)
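
Operationally, "immediately" can be approximated by restricting the sample on 
deposit date, as in this hypothetical sketch (field names and the grace period 
are my assumptions, not a prescription):

    # Keep only articles deposited in a repository on or shortly after
    # acceptance, so the measured effect is immediate OA rather than
    # delayed/embargoed OA. All field names are assumed.
    import pandas as pd

    records = pd.read_csv("deposits.csv", parse_dates=["accepted", "deposited"])
    grace = pd.Timedelta(days=90)  # arbitrary illustrative threshold

    immediate = records[records.deposited <= records.accepted + grace]
    delayed = records[records.deposited > records.accepted + grace]

    print("immediate OA articles:", len(immediate))
    print("delayed OA articles:", len(delayed))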

Stevan Harnad

Gargouri, Y., Hajjem, C., Lariviere, V., Gingras, Y., Brody, T., Carr, L. and 
Harnad, S. (2010) Self-Selected or Mandated, Open Access Increases Citation 
Impact for Higher Quality Research. PLOS ONE. 
http://eprints.ecs.soton.ac.uk/18493/
