Question and challenge

Question: Didn't the UK recently change its legislation explicitly to allow for 
data and text mining?

Challenge: My research blog and dataverse are both fully open with no CC 
license at all. They are All Rights Reserved, and yet posted on the web, in the 
case of the dataverse deliberately so that people can go ahead and download and 
manipulate the data. I challenge anyone to go ahead and try some text and data 
mining. If you think there are legalities preventing you from doing this, 
please explain what they are.

Blog: http://sustainingknowledgecommons.org
OA APCs: http://dataverse.scholarsportal.info/dvn/dv/oaapc/

There probably are some barriers to text and data mining, but these have 
nothing to do with legalities. For example, this morning I was looking for Walt 
Crawford's comment on one of my posts. It didn't come up, but that's likely 
just because WordPress is not set up to search comments.

I think we need to understand what barriers exist to data and text mining and 
resolve them, rather than assuming that pushing everyone to make their work 
CC-BY is the answer. For example, if my blog were CC-BY licensed, this wouldn't 
help with WordPress not being set up to search the comments. Another example: 
there is nothing to stop the Licensor (as opposed to the downstream user) from 
putting TPMs (technological protection measures) on a CC-BY or CC-0 work that 
would effectively prevent people from data and text mining.

If you believe you are legally prevented from data and text mining works that 
are in the open, then no doubt, as a law-abiding citizen, you are not using any 
internet search engine.

In my field, metadata is far more critical than legalities. I am sure that this 
is the case for other researchers. If others are doing work on journals, please 
include the title and ISSN - especially the ISSN as the key piece of data to 
facilitate remix in this particular area. A dataset that is CC-BY or CC-0 
without this information is of little to no use. This is the kind of discussion 
I think we need to have with respect to re-use.
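
To illustrate why the ISSN matters as a key, here is a minimal sketch in 
Python, not a description of how any particular dataset is actually built. It 
assumes each dataset is a list of records with an "issn" field; the function 
names, field names, titles and values are all hypothetical. It uses the 
standard ISSN check digit (weights 8 down to 2, modulo 11) to normalize the 
identifier, then joins two datasets on it.

```python
import re


def normalize_issn(raw):
    """Return an ISSN as 'NNNN-NNNC', or None if malformed or the check digit fails."""
    s = re.sub(r"[\s-]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", s):
        return None
    # Standard ISSN check: weight the first seven digits 8..2, then take modulo 11.
    total = sum(int(d) * w for d, w in zip(s[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    if s[7] != expected:
        return None
    return s[:4] + "-" + s[4:]


def merge_by_issn(left, right):
    """Join two lists of records on their normalized ISSN (hypothetical helper)."""
    index = {}
    for row in right:
        key = normalize_issn(row.get("issn", ""))
        if key is not None:
            index[key] = row
    merged = []
    for row in left:
        key = normalize_issn(row.get("issn", ""))
        if key is not None and key in index:
            # Fields from the right-hand record are added to the left-hand one.
            merged.append({**row, **index[key], "issn": key})
    return merged


# Placeholder records: titles and values are invented for illustration only.
titles = [
    {"title": "Example Journal A", "issn": "03785955"},
    {"title": "Example Journal B", "issn": ""},  # no ISSN, so it cannot be matched
]
fees = [{"issn": "0378-5955", "fee_listed": True}]

combined = merge_by_issn(titles, fees)
```

The record with no ISSN simply drops out of the combined result, whatever 
license it carries, which is the point: without the identifier there is 
nothing reliable to match on.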

best,

Heather


On 2015-06-01, at 10:59 AM, Peter Murray-Rust <pm...@cam.ac.uk> wrote:


We are now at the point where anything less than full BOAI-compliance is 
seriously holding science and medicine back. We must have immediate

"free availability on the public internet, permitting any users to read, 
download, copy, distribute, print, search, or link to the full texts of these 
articles, crawl them for indexing, pass them as data to software,..."

We've just run a workshop in Edinburgh with the Neuroscience group who are, 
inter alia, looking at systematic review of animal experiments. One senior 
postdoc has spent the last year reading 30,000 papers (sic) - that's one every 
3 minutes - classifying them into properly reported and badly reported tests. 
Our (Open) Text and Data Mining software (http://contentmine.org/) can do this 
in a few seconds per paper. But ONLY if we are legally allowed to do this; and 
the only licences that allow this explicitly are CC-BY or CC0. (I have spent a 
considerable time on the legal aspects.)

The main STM publishers are challenging the right to Mine Content and throwing 
money at lobbying MEPs and the European Commission to have restrictive clauses 
added to potential legislation. The primary defence against this in almost all 
countries is to have science and medicine published as BOAI-compliant CC-BY or 
CC0. Calling anything else "Open Access" is simply giving huge political 
support to the STM publishing industry and preventing scientists from using 
modern tools.

P.




--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
