Question and challenge

Question: Didn't the UK recently change its legislation explicitly to allow for data and text mining?
Challenge: My research blog and dataverse are both fully open with no CC license at all. They are All Rights Reserved, and yet posted on the web, in the case of the dataverse deliberately so that people can go ahead and download and manipulate the data. I challenge anyone to go ahead and try some text and data mining. If you think there are legalities preventing you from doing this, please explain what they are.

Blog: http://sustainingknowledgecommons.org
OA APCs: http://dataverse.scholarsportal.info/dvn/dv/oaapc/

There probably are some barriers to text and data mining, but these have nothing to do with legalities. For example, this morning I was looking for Walt Crawford's comment on one of my posts. It didn't come up, but that's likely just because WordPress is not set up to search comments.

I think we need to understand what barriers exist to data and text mining and resolve them, rather than assuming that pushing everyone to make their work CC-BY is the answer. For example, if my blog were CC-BY licensed, this wouldn't help with WordPress not being set up to search the comments. Another example: there is nothing to stop the Licensor (as opposed to the downstream user) from putting TPMs (technological protection measures) on a CC-BY or CC0 work that would effectively prevent people from data and text mining.

If you believe you are legally prevented from text and data mining works that are in the open, then no doubt, as a law-abiding citizen, you are not using any internet search engine.

In my field, metadata is far more critical than legalities. I am sure that this is the case for other researchers. If others are doing work on journals, please include the title and ISSN - especially the ISSN, as the key piece of data to facilitate remix in this particular area. A dataset that is CC-BY or CC0 without this information is of little to no use.

This is the kind of discussion I think we need to have with respect to re-use.

best,

Heather

On 2015-06-01, at 10:59 AM, Peter Murray-Rust <pm...@cam.ac.uk> wrote:

We are now at the point where anything less than full BOAI-compliance is seriously holding science and medicine back. We must have immediate "free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software,..."

We've just run a workshop in Edinburgh in the Neuroscience group, who are, inter alia, looking at systematic review of animal experiments. One senior postdoc has spent the last year reading 30,000 papers (sic) - that's one every 3 minutes - classifying them into properly reported and badly reported tests. Our (Open) contentmine.org Text and Data Mining software can do this in a few seconds per paper.

But ONLY if we are legally allowed to do this; and the only licences that allow this explicitly are CC-BY or CC0. (I have spent a considerable time on the legal aspects.)

The main STM publishers are challenging the right to Mine Content and throwing money at lobbying MEPs and the European Commission to have restrictive clauses added to potential legislation. The primary defence against this in almost all countries is to have science and medicine published as BOAI-compliant CC-BY or CC0. Calling anything else "Open Access" is simply giving huge political support to the STM publishing industry and preventing scientists using modern tools.

P.

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069

_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal