Re: [GOAL] [SCHOLCOMM] Evaluation and metrics: why not (a critical perspective)

2019-10-23 Thread Heather Morrison
Thank you for your contribution, Lizzie!

I love this quote from your piece "The blind and the elephant":
“When using any indicator for purposes that have rewards attached – especially 
when the entity is small – you should use metrics with extreme care.”

I agree. This is a key point that I am trying to make. Metrics based on 
substantive collective knowledge make sense. Let's aim to achieve the CO2 
emissions reductions that are needed to avoid catastrophic climate change (and 
wouldn't it be nice if newspapers decided to report our collective progress on 
this prominently on a daily basis?). Similarly, we can assess our collective 
progress in preventing and treating cancer through epidemiological data.

However, evaluating an individual scholar or scholarly article on the basis of 
citations is problematic because there are rewards attached for the small 
entity, the individual scholar - ranging from job loss to promotion, prestige, 
and grant funding. This creates an incentive to overstate positive findings, 
understate limitations, see patterns in data that aren't really there, and even 
to commit fraud. Less (or no) reliance on metrics in this case would be in the 
best interests of advancing our collective knowledge.

Metrics (like most things) are neither good nor bad in and of themselves. 
Whether metrics are beneficial or otherwise depends on who is using them, how, 
and for what purpose. Beneficial uses of metrics (from my perspective) include 
understanding and ameliorating bias in hiring, salaries, grant funding, etc.

best,


Dr. Heather Morrison

Associate Professor, School of Information Studies, University of Ottawa

Professeur Agrégé, École des Sciences de l'Information, Université d'Ottawa

Principal Investigator, Sustaining the Knowledge Commons, a SSHRC Insight 
Project

sustainingknowledgecommons.org

heather.morri...@uottawa.ca

https://uniweb.uottawa.ca/?lang=en#/members/706

[On research sabbatical July 1, 2019 - June 30, 2020]



Re: [GOAL] [SCHOLCOMM] Evaluation and metrics: why not (a critical perspective)

2019-10-23 Thread Elizabeth Gadd
Hi Heather

Thanks for your email!  A few thoughts:

1) The UK has given the notion of measuring impact quite a lot of thought, 
having had this measured as part of their national research assessments since 
2009. I would refer you to the brilliant work by Julie Bayley 
(https://juliebayley.blog/) and Gemma Derrick 
(https://www.palgrave.com/gp/book/9783319636269) (copied in) in this space for 
a full exploration of all the issues, including the negative impacts you 
describe, something Gemma’s group have termed ‘grimpacts’.
2) With regard to your statement that assessing scholarly work does not 
require metrics, I would refer you to a piece I’ve written called ‘The blind 
and the elephant: bringing clarity to our conversations about responsible 
metrics’ 
(https://thebibliomagician.wordpress.com/2019/05/15/the-blind-and-the-elephant-bringing-clarity-to-our-conversations-about-responsible-metrics/). 
In it I argue that we need to be a bit careful about sweeping statements 
about metrics, because there are many reasons we evaluate research (I name six) 
and many different levels of granularity (individual, group, country, etc.). 
In some settings the use of metrics can be helpful, in others not. I would 
generally argue that metrics + peer review give us the best chance of 
responsible assessment, as metrics can help mitigate unconscious bias in 
peer review.
3) For further examples of best practice in research evaluation, DORA are 
compiling these on their website 
(https://sfdora.org/good-practices/research-institutes/).
4) For more discussion of these issues there are now a number of dedicated 
discussion lists. In the US, there is the RESMETIG list; in Canada, the 
BRICS group; in the UK, the LIS-Bibliometrics list; and finally there is an 
international working group looking at research evaluation, the INORMS 
Research Evaluation Working Group.

I hope this is helpful?

All best
Lizzie

Dr Elizabeth Gadd
Research Policy Manager (Publications)
Research Office
Loughborough University
Loughborough, Leicestershire, UK

T: +44 (0)1509 228594
S: lizziegadd
E: e.a.g...@lboro.ac.uk


On 22 Oct 2019, at 22:04, Heather Morrison  wrote:



Rigorous scholarly work requires periodic assessment of our underlying 
assumptions. If these are found to be incorrect, then any logical arguments or 
empirical work based on these assumptions should be questioned.


Assumptions underlying metrics-based evaluation include:

  1.  impact is a quality of good scholarship at the level of individual works
  2.  aiming for impact is desirable in scholarly work

Let's consider the logic and an example.


  1.  Is impact a good thing? Consider what "impact" means in other contexts. 
Hurricanes and other natural disasters have impact; when we seek to work in 
harmony with the environment, we try to avoid impact. "Impact" is not 
inherently tied to the quality of being "good".
  2.  Is aiming for impact at the level of individual scholarly works 
desirable? According to Retraction Watch, the list of the top 10 most highly 
cited retracted papers includes "the infamous Lancet paper by Andrew Wakefield 
that originally suggested a link between autism and childhood vaccines" (from: 
https://retractionwatch.com/the-retraction-watch-leaderboard/top-10-most-highly-cited-retracted-papers/). 
This article has been highly cited in academic papers both before and after 
retraction, widely quoted in traditional and social media, and, I argue, 
demonstrates real-world impact (in the form of the return of childhood diseases 
that were on track to worldwide eradication) that is truly exceptional. Any way 
you measure impact, this article had it. Could this be a fluke? I argue that 
there are logical reasons why this would not be a fluke. When researchers 
are rewarded for impact, there is an incentive to overstate conclusions, to see 
positive and interesting results beyond what the data show, and even to commit 
outright fraud.

It is important to distinguish between the consequences of impact at the level 
of an individual research work and scholarly consensus based on a substantial 
body of evidence (such as on climate change).

It is also important to consider some of the implications of metrics-based 
evaluation for individual scholars. Social biases such as those based on gender, 
ethnic origin, and Western centrism are common in our society, including in 
academia. There is some recognition of this in traditional academic work and 
some effort to counter bias (such as blind review); however, this cannot be 
controlled in the downstream academic environment, and it seems obvious that 
metrics that go beyond academic citations will tend to amplify such biases.

Evaluation of the quality of scholarly work does not require metrics. Anyone 
who is a researcher needs to do a great deal of reading and assessment of