Hi Peter,

If many knowledge projects are advancing our knowledge through the means that 
you have described, surely there are others than the one you started yesterday? 
Can you provide a list or literature review of such studies?

My OA APC study uses data from different sources that do not have a common set 
of terms:
dataverse.scholarsportal.info/dataverse

If we had to restrict data collection to CC-BY licensed works this research 
could not be done, and to the extent it could be done, publishers who do not 
want us to study them could easily opt out by not using CC-BY licenses on the 
pages where this information is found. In other words, CC-BY licenses raise 
issues for data collection and analysis.

I would like to note some methodological concerns with the approach 
described by PMC (automatically gathering data from tables). Taking data from 
different studies without fully accounting for differences in methods (eg 
definition or measurement) could easily lead to false conclusions. Worse, such 
false conclusions would be highly replicable leading to false confidence in 
results, ie anyone could repeat the same mistakes and come to the same 
conclusion of unknown external validity.

For the 2016/17 OA APC dataset I am adding a "provenance" column because the 
data in the 2016 APC column comes from different researchers with some 
differences in data collection. Even in a single dataset, to analyze it one 
needs to understand whether one is comparing apples with apples or McIntoshes 
with Spartans. Automating data analysis without full comprehension of the data 
strikes me as problematic.
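
As a minimal sketch of what a provenance column makes possible: the field 
names, source labels and figures below are invented for illustration and are 
not taken from the actual dataset. The idea is to summarize each source 
separately, and to refuse to pool sources unless that is requested explicitly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: journals, amounts and source labels are made up.
rows = [
    {"journal": "Journal A", "apc_2016": 1500.0, "provenance": "team_1"},
    {"journal": "Journal B", "apc_2016": 2000.0, "provenance": "team_1"},
    {"journal": "Journal C", "apc_2016": 900.0, "provenance": "team_2"},
]


def mean_apc_by_provenance(rows):
    """Average the APC column separately for each provenance label."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["provenance"]].append(row["apc_2016"])
    return {source: mean(values) for source, values in groups.items()}


def pooled_mean_apc(rows, allow_mixed=False):
    """Average across all rows, refusing to mix sources unless asked to."""
    sources = {row["provenance"] for row in rows}
    if len(sources) > 1 and not allow_mixed:
        raise ValueError(
            f"rows come from {len(sources)} sources; compare methods first"
        )
    return mean(row["apc_2016"] for row in rows)


by_source = mean_apc_by_provenance(rows)
```

Here by_source holds one average per source, and pooled_mean_apc(rows) raises 
an error until the analyst passes allow_mixed=True, which is the point at 
which someone has to decide that the collection methods are comparable.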

best,

Heather Morrison



-------- Original message --------
From: Peter Murray-Rust <pm...@cam.ac.uk>
Date: 2017-01-24 4:27 AM (GMT-05:00)
To: "Global Open Access List (Successor of AmSci)" <goal@eprints.org>
Subject: Re: [GOAL] How much of the content in open repositories is able to 
meet the definition of open access?

There are many activities where CC BY or a more liberal licence (CC 0) is the 
only way that modern science can be done.

Many knowledge-based projects in science, technology, and medicine use 
thousands of documents a day to extract and publish science. (We started one 
yesterday at https://github.com/ContentMine/cm-ucl/ to extract data from tables 
in PDFs. This will aim to analyse 1000 papers per day, a limit set by the 
licences; if we were allowed, we could index 10,000 papers/day in all 
disciplines.)

To do reproducible science it is critical that the raw data (in this case 
scientific articles) are made publicly available so that others can reproduce 
the work. Any friction such as writing to the author, reading a non-standard 
licence, etc. makes the project impossible. We are often limited to using the 
Open subset (CC BY) in EuropePMC. We cannot afford to put a single CC NC, CC 
ND, "unlicensed freely available" manuscript in the repository in case we are 
sent a take-down notice. That would destroy the whole experiment.

These experiments are part of the science of the future. If we had been allowed 
to use them it is likely that the Ebola outbreak in Liberia would have been 
predicted (The Liberian government's assessment, not mine). Whether it would 
have been prevented we don't know, but at least it would not have been impeded 
by copyright and paywalls.

Put simply: unless the scientific material is CC BY or CC 0 we cannot use it 
for knowledge-driven STM. I have estimated that the opportunity cost of this 
can run into billions of dollars.

Repositories do not work for science. They are fragmented, non-interoperable 
and covered with prohibitions on automatic re-use. I have not met scientists 
who are systematically using institutional repositories for data mining.

It seems that the desires of the arts and humanities are in direct conflict 
with the needs of STM. I note that there are few scientists posting on this 
list. Maybe this division should be recognised and the STM community should 
continue with its own policies of CC BY and the rest use whatever commonality 
they can 
achieve.

There are no simple solutions where the law is concerned. Only CC BY gives 
certainty. CC NC and CC ND may be valuable for A+H but they are very difficult 
to operate in any area of endeavour.

I was told 12 years ago on this list that I should be patient and the Green 
program would deliver universal access and then I could start mining the 
literature. I have been patient but it hasn't happened. I am told that OpenAIRE 
still doesn't expose full-text.  We should recognize it and look for 
alternative solutions.




On Mon, Jan 23, 2017 at 7:55 PM, Heather Morrison 
<heather.morri...@uottawa.ca> wrote:
With all due respect to the people who created and shared the "how open is it" 
spectrum tool, I find some of the underlying assumptions to be problematic.

For example the extreme of closed access assumes that having to pay 
subscriptions, membership, pay per view etc. is the far end of closed. My 
perspective is that the opposite of open is closure of knowledge. Climate 
change denied, climate scientists muzzled, fired or harassed, climate change 
science defunded, climate data taken down and destroyed, deliberate spread of 
misinformation.

This is not a moot point. This end of the spectrum is a reality today, one that 
is far more concerning for many researchers than pay walls (not that I support 
paywalls).

Fair use is listed in a row named closed access. I argue that fair use / fair 
dealing is essential to academic work and journalism, and must apply to all 
works, not just those that can be subject to academic OA policy.

There is an underlying assumption about the importance and value of re-use / 
remix that omits any discussion of the pros, cons, or desirability of re-use / 
remix that I argue we should be having. Earlier today I mentioned some of the 
potential pitfalls. Now I would like to add two potential pitfalls: mistranslation 
and errors in instructions for dangerous procedures.

Poor published translations pose dangers to knowledge per se (ie they 
introduce errors) and to the author's reputation, ie an author could easily be 
indirectly misquoted due to a poor translation. There are good reasons why some 
authors and journals hesitate to grant downstream translation permissions. 
Reader side translations (eg automated translation tools) are not the same as 
downstream published translations, although readers should be made aware of the 
current limitations of automated translation.

If people are copying instructions for potentially dangerous procedures  
(surgery, chemicals, engineering techniques), and they are not at least as 
expert as the original author, it might be in everyone's best interests if 
downstream readers are not invited and encouraged to manipulate the text, 
images, etc.

In creative works, eg to prepare a horror flick, by all means take this and 
that, mix it together and create something new and intriguing. I am not 
convinced that the same arguments ought to apply to works that might guide 
procedures in a real hospital operating room.

I suggest the "how open is it" spectrum is a useful exercise that has served a 
purpose for some, but it is not a canon for all to adhere to.

best,

Heather Morrison



-------- Original message --------
From: David Prosser <david.pros...@rluk.ac.uk>
Date: 2017-01-23 2:16 PM (GMT-05:00)
To: "Global Open Access List (Successor of AmSci)" <goal@eprints.org>
Subject: Re: [GOAL] How much of the content in open repositories is able to 
meet the definition of open access?

I rather like the ‘How open is it?’ tool that approaches this as a spectrum:

http://sparcopen.org/our-work/howopenisit/


I may be quite ‘hard line’, but I acknowledge that by moving along the spectrum 
a paper, monograph, piece of data (or whatever) becomes more open - and more 
open is better than less open.

If the funders have gone to the far end of the spectrum it is perhaps because 
they feel that the greatest benefits are there, not because they have been 
convinced that they have to follow the strict, ‘hard line’ definition of open 
access.

David



On 23 Jan 2017, at 18:30, Richard Poynder 
<richard.poyn...@gmail.com> wrote:

Hi Marc,

You say:

"I certainly qualify as an OA advocate, and as such:

I don’t equate OA with CC BY (or any CC license); in fact, I’m a little bit 
tired of discussions about what 'being OA' means."

I hear you, but I think the key point here is that OA advocates (perhaps not 
you, but OA advocates) are successfully convincing a growing number of research 
funders (e.g. Wellcome Trust, RCUK, Ford Foundation, Hewlett Foundation, Gates 
Foundation etc.) that CC BY is the only acceptable form of open access.

So however tired you and Stevan might be of discussing it, I believe there are 
important implications and consequences flowing from that.

Richard Poynder



On 23 January 2017 at 16:31, Couture Marc 
<marc.cout...@teluq.ca> wrote:
Hi all,

Just to be clear, my position on the basic issue here.

I certainly qualify as an OA advocate, and as such:

- I don’t equate OA with CC BY (or any CC license); in fact, I’m a little bit 
tired of discussions about what “being OA” means.

- I work to help increase the proportion of gratis OA, still much too low.

- I try to convince my colleagues that CC BY is the best way to disseminate 
scientific/scholarly works and make them useful.

I favour CC BY over the restricted versions (mainly -NC) because I find the 
arguments about potentially unwanted or devious uses far less compelling than 
those about the advantages of unrestricted uses and the drawbacks of 
restrictions that can be much more stringent than they seem at first glance.

Like Stevan said, OA advocates are indeed a plurality. The opposite would 
bother me.

Marc Couture



_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal




--
Richard Poynder
www.richardpoynder.co.uk




--
Peter Murray-Rust
Reader Emeritus in Molecular Informatics
Unilever Centre, Dept. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069