** Cross-Posted **

On 9-Aug-07, at 7:22 AM, [identity deleted] wrote:
> I have been commissioned to write a news story for [publication name deleted] inspired by your post of 12 July on the American Scientist Forum regarding the HEFCE's RAE being out of touch.
> http://users.ecs.soton.ac.uk/harnad/Hypermail/Amsci/6542.html
>
> I would welcome your comments on this, especially on how you consider the RAE may be out of touch on the wider issue of OA as well as the CD/PDF issue.

It is not that the RAE is altogether out of touch. First let me count the things that they are doing right:

(1) It is a good idea to have a national research performance evaluation to monitor and reward research productivity and progress. Other countries will be following and eventually emulating the UK's lead. (Australia is already emulating it.)
http://openaccess.eprints.org/index.php?/archives/226-guid.html

(2) It is also a good idea to convert the costly, time-consuming, wasteful (and potentially biased) panel-based RAE of past years into an efficient, unbiased metric RAE, using objective measures that can be submitted automatically online, with the panels' role reduced to monitoring and fine-tuning. That way the RAE will no longer take UK researchers' precious time away from actually doing UK research in order to resubmit and locally "re-peer-review" work that has already been submitted, published and peer-reviewed in national and international scholarly and scientific journals.
http://www.ariadne.ac.uk/issue35/harnad/

But, as with all policies shaped collectively by disparate (and sometimes under-informed) policy-making bodies, a few very simple and remediable flaws in the reformed RAE system have gone undetected and hence uncorrected. They can still be corrected, and I hope they will be, for they are small, easily fixed flaws; but if left unfixed they will have huge negative consequences, compromising the RAE as well as the RAE reforms:

(a) The biggest flaw concerns the metrics that will be used.
Metrics first have to be tested and validated, discipline by discipline, to ensure that they are valid indicators of research performance. Since the UK has relied on the RAE panel evaluations for two decades, and since the last RAE (2008) before the conversion to metrics is to be a parallel panel/metrics exercise, the natural thing to do is to test as many candidate metrics as possible in this exercise, and to cross-validate them against the rankings given by the panels, separately, in each discipline. (Which metrics are valid performance indicators will differ from discipline to discipline.)

All indications so far are that this cross-validation exercise is *not* what RAE 2008 and HEFCE are planning to do. Instead, there is a focus on a few pre-selected metrics, rather than on the very rich spectrum of potential metrics that could be tested. The two main pre-selected metrics are (i) prior research funding and (ii) citation counts.

(i) Prior research funding has already been shown to be extremely highly correlated with the RAE panel rankings in a few (mainly scientific) disciplines, but this was undoubtedly because the panels, in making their rankings, already had those metrics in hand, and hence could themselves have been explicitly counting them in making their judgments!
Now, although a correlation between metrics and panel rankings is desirable initially, because that is the way to launch and validate the choice of metrics, in the case of this particular metric there is not only a potential interaction, indeed a bias, that makes the two (the metric and the panel ranking) non-independent, and hence invalidates the test of this metric's validity; there is also another, even deeper reason for not putting a lot of weight on the prior-funding metric:

The UK has a Dual System for research funding: (A) competitive individual researcher project proposals and (B) the RAE panel rankings (which award top-sliced research funding to university departments based on their research performance). The prior-funding metric is determined largely by (A). If it is also given a heavy weight in (B), then that is not improving the RAE [i.e., (B)]: it is merely collapsing the UK's Dual System into (A) alone, and doing away with the RAE altogether. As if this were not bad enough, the prior-funding metric is not even a valid metric for many of the RAE disciplines.

(ii) Citation counts are a much better potential candidate metric. Indeed, in many of the RAE disciplines, citation counts have already been tested and shown to be correlated with the panel rankings, although not nearly as highly correlated as prior funding (in those few disciplines where prior funding is indeed highly correlated). The somewhat weaker correlation in the case of the citation metric is a good thing, because it leaves room for other metrics to contribute to the assessment outcome too. It is neither likely nor desirable that performance evaluation should be based on a single metric. But citation counts are certainly a strong candidate for serving as a particularly important one among the array of many metrics to be validated and used in future RAEs.
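To make the cross-validation step concrete: what is being proposed is, at bottom, a rank correlation between each candidate metric and the panel rankings, computed separately within each discipline. Here is a minimal, self-contained sketch of that computation (all department names and numbers are invented for illustration; this is not RAE data or an RAE procedure, just the generic statistic):

```python
# Hypothetical sketch: cross-validating candidate metrics against
# panel rankings, one discipline at a time, via Spearman rank
# correlation. All data below are invented for illustration.

def rankdata(values):
    """Assign 1-based ranks to values, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # find the block of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = rankdata(xs), rankdata(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented example: one discipline's panel rankings (1 = best) against
# two candidate metrics (higher = better, hence negated for comparison).
panel_rank = [1, 2, 3, 4, 5]
citations  = [90, 70, 65, 40, 10]
funding    = [5, 50, 20, 80, 30]

print(spearman(panel_rank, [-c for c in citations]))  # close to +1 here
print(spearman(panel_rank, [-f for f in funding]))    # much weaker here
```

A metric that correlates strongly with the panel rankings in one discipline may correlate weakly or not at all in another, which is exactly why the exercise has to be run discipline by discipline rather than with one pre-selected metric for all.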
Citation counts also have the virtue that they were not explicitly available to the RAE panels when they made their rankings (indeed, it was explicitly forbidden to submit or count citations). So their correlation with the RAE panel rankings is a genuine empirical correlation rather than an explicit bias.

So the prior-funding metric (i) needs to be used cautiously, to avoid bias and self-fulfilling prophecy, and the citation-count metric (ii) is a good candidate, but only one of many potential metrics that can and should be tested in the parallel RAE 2008 metric/panel exercise. (Other metrics include co-citation counts, download counts, download and citation growth and longevity counts, hub/authority scores, interdisciplinarity scores, and many other rich measures for which RAE 2008 is the ideal time to do the testing and validation, discipline by discipline -- as it is virtually certain that disciplines will differ in which metrics are predictive for them, and in what the weighting of each metric should be.)

Harnad, S. (2007) Open Access Scientometrics and the UK Research Assessment Exercise. In: Proceedings of the 11th Annual Meeting of the International Society for Scientometrics and Informetrics 11(1), pp. 27-33, Madrid, Spain. Torres-Salinas, D. and Moed, H. F., Eds.
http://eprints.ecs.soton.ac.uk/13804/

Shadbolt, N., Brody, T., Carr, L. and Harnad, S. (2006) The Open Research Web: A Preview of the Optimal and the Inevitable. In: Jacobs, N., Ed., Open Access: Key Strategic, Technical and Economic Aspects. Chandos.
http://eprints.ecs.soton.ac.uk/12453/

Brody, T., Carr, L., Gingras, Y., Hajjem, C., Harnad, S. and Swan, A. (2007) Incentivizing the Open Access Research Web: Publication-Archiving, Data-Archiving and Scientometrics. CTWatch Quarterly 3(3).
http://eprints.ecs.soton.ac.uk/14418/01/ctwatch.html

Yet it looks as if RAE 2008 and HEFCE are not currently planning to commission this all-important validation exercise of a rich array of candidate metrics against the panel rankings. This is a huge flaw and oversight, though it could still easily be remedied by going ahead and doing such a systematic cross-validation study after all.

For such a systematic metric/panel cross-validation study in RAE 2008, however, the array of candidate metrics has to be made as rich and diverse as possible. The RAE is not currently making any effort to collect as many potential metrics as possible in RAE 2008, and this is partly because it is overlooking the growing importance of online, Open Access metrics -- and indeed overlooking the growing importance of Open Access itself, both for research productivity and progress and for evaluating them. This brings us to the second flaw in HEFCE's RAE 2008 plans:

(b) For no logical or defensible reason at all, RAE 2008 is insisting that researchers submit the publishers' PDFs for the 2008 exercise. Now it is progress that the RAE is accepting electronic drafts rather than requiring hard copy, as in past years. But in insisting that the electronic drafts must be the publisher's PDF, it creates two unnecessary problems.

One unnecessary problem, a minor one, is that the RAE imagines that in order to have the publisher's PDF for evaluation, it needs to seek (or even pay for) permission from the publisher. This is complete nonsense! *Researchers* (i.e., the authors) submit their own published work to the RAE for evaluation. For the researchers, this is Fair Dealing (Fair Use), and no publisher permission or payment whatsoever is needed. (As it happens, I believe HEFCE has worked out a "special arrangement" whereby publishers "grant permission" and "waive payment."
But the completely incorrect notion that permission or payment were even at issue, in principle, has an important negative consequence, which I will now describe.)

What HEFCE should have done -- instead of mistakenly imagining that it needed permission to access the papers of UK researchers for research evaluation -- was to require researchers to deposit their peer-reviewed, revised, accepted final drafts in their own university's Institutional Repository (IR) for research assessment. The HEFCE panels could then have accessed them directly in the IRs for evaluation. This would have ensured that all UK research output was deposited in each UK researcher's university IR.

There is no publisher permission issue for the RAE: the deposits can, if desired, be made Closed Access rather than Open Access, so that only the author, the employer and the RAE panels can access the full text of the deposit. That is Fair Dealing and requires absolutely no permission from anyone.

But, as a bonus, requiring the deposit of all UK research output (or even just the four "best papers" that are currently the arbitrary limit for RAE submissions) in the researcher's IR for RAE evaluation would have ensured that 62% of those papers could immediately have been made OA (because 62% of journals already endorse immediate OA self-archiving):
http://romeo.eprints.org/stats.php

And for the remaining 38%, it would have allowed each IR's "Fair Use" button to be used by researchers webwide to request an individual email copy semi-automatically (with these "eprint requests" providing a further potential metric, along with download counts).
http://openaccess.eprints.org/index.php?/archives/274-guid.html

Instead, HEFCE needlessly insisted on the publisher's PDF (which, by the way, could likewise have been deposited by all authors in their IRs, as Closed Access, without needing any permission from their publishers) being submitted to the RAE directly.
This effectively cut off not only a rich potential source of RAE metrics but also a powerful incentive for providing OA, which has been shown, in itself, to increase downloads and citations directly in all disciplines.
http://opcit.eprints.org/oacitation-biblio.html

In summary: two good things -- (1) national research performance evaluation itself, and (2) the conversion to metrics -- plus two bad things -- (3) the failure to explicitly provide for the systematic evaluation of a rich candidate spectrum of metrics against the RAE 2008 panel rankings, and (4) the failure to require deposit of the authors' papers in their own IRs, which would generate more OA metrics, more OA, and more UK research impact.

The good news is that there is still time to fully remedy (3) and (4), if only policy-makers take a moment to listen, think it through, and do the little that needs to be done to fix it. I am hoping that this will still happen -- and even your article could help make it happen!

Stevan Harnad

PS To allay a potential misunderstanding: it is definitely *not* the case that the RAE panel rankings are themselves infallible or face-valid! The panelists are potentially biased in many ways. And RAE panel review was never really "peer review," because peer review means consulting the most qualified specialists in the world for each specific paper, whereas the panels are just generic UK panels, evaluating all the UK papers in their discipline: it is the journals that already conducted the peer review. So metrics are needed not just to put an end to the waste and cost of the existing RAE, but also to try to put the outcome on a more reliable, objective, valid and equitable basis. The idea is not to *duplicate* the outcome of the panels, but to improve it. Nevertheless -- and this is the critical point -- the metrics *do* have to be validated, and, as an essential first step, they have to be cross-validated against the panel rankings, discipline by discipline.
For even though those panel rankings are and always were flawed, they are what the RAE has relied upon, completely, for two decades. So the first step is to make sure that the metrics are chosen and weighted so as to approximate the panel rankings as closely as possible, discipline by discipline. Then, and only then, can the "ladder" of the panel rankings -- which got us where we are -- be tossed away, allowing us to rely on the metrics alone. These can then be calibrated and optimised in future years, with feedback from future meta-panels that monitor the rankings generated by the metrics and, if necessary, adjust and fine-tune the metric weights, or even add new, still-to-be-discovered-and-tested metrics to them.

In sum: despite its warts, the current RAE panel rankings need to be used to bootstrap the new metrics into usability. Without that prior validation against what has been used until now, the metrics are just hanging from a skyhook, and no one can say whether or not they measure what the RAE panels have been measuring until now. Without validation there is no continuity in the RAE, and it is not really a "conversion" to metrics, but simply an abrupt switch to another, untested assessment tool. (Citation counts have been tested elsewhere, in other fields; but as there has never been anything of the scope and scale of the UK RAE, across all disciplines in an entire country's research output, the prior patchwork testing of citation counts as research performance indicators is nowhere near providing the evidence that would be needed to make a reliable, valid choice of metrics for the UK RAE: only cross-validation within the RAE parallel metric/panel exercise itself can provide that kind of evidence, and the requisite continuity for a smooth, rational transition from panel rankings to metrics.)
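The bootstrapping step described above can be sketched very simply: fit weights for the candidate metrics so that the weighted combination reproduces the existing panel scores as closely as possible, then use those weights to score departments from metrics alone. The following is a minimal illustration under invented assumptions (toy data, two hypothetical metrics, ordinary least squares via the normal equations); it stands in for what would in practice be a far richer, per-discipline calibration:

```python
# Hypothetical sketch: derive metric weights by fitting them to the
# existing panel scores (the "ladder"), then score departments from
# metrics alone. All data and names are invented for illustration.

def fit_weights(metrics, panel_scores):
    """Least-squares weights for: score ~ w0 + w1*m1 + w2*m2 + ...
    Solves the normal equations (X^T X) w = X^T y by Gaussian elimination."""
    n = len(panel_scores)
    X = [[1.0] + list(row) for row in metrics]  # prepend intercept column
    k = len(X[0])
    XtX = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    Xty = [sum(X[r][i] * panel_scores[r] for r in range(n)) for i in range(k)]
    for col in range(k):  # forward elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(XtX[r][col]))
        XtX[col], XtX[pivot] = XtX[pivot], XtX[col]
        Xty[col], Xty[pivot] = Xty[pivot], Xty[col]
        for r in range(col + 1, k):
            f = XtX[r][col] / XtX[col][col]
            for c in range(col, k):
                XtX[r][c] -= f * XtX[col][c]
            Xty[r] -= f * Xty[col]
    w = [0.0] * k  # back substitution
    for r in range(k - 1, -1, -1):
        w[r] = (Xty[r] - sum(XtX[r][c] * w[c] for c in range(r + 1, k))) / XtX[r][r]
    return w

def score(weights, metric_row):
    """Score a department from its metrics using the fitted weights."""
    return weights[0] + sum(w * m for w, m in zip(weights[1:], metric_row))

# Invented calibration data: (citations, downloads) per department, and
# the panel score each received in the parallel panel/metrics exercise.
metrics = [(90, 300), (70, 250), (65, 180), (40, 120), (10, 50)]
panel   = [5.0, 4.1, 3.8, 2.5, 1.0]

w = fit_weights(metrics, panel)
# Once validated, a new department can be scored from metrics alone:
print(round(score(w, (80, 280)), 2))
```

The weights (and the choice of which metrics to include at all) would differ by discipline, and the meta-panels described above would then adjust them over time; the point of the sketch is only that the panel rankings supply the target the initial weights are fitted against.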