> > > 1. CC-BY is not necessary for data and text-mining.
In some sense true, it is not *strictly* necessary - but it sure does alleviate concerns over being sued! Google can 'get away with it' because they don't need to document the in-between steps, i.e. they have no need for transparency. Researchers and academics *do* need to be able to demonstrate reproducible literature-mining techniques and thus (in my understanding) will need to reproduce some published content in order to show that their methods work as described. Thus there is an easily explainable difference between Google's needs (no need for transparency, just present the results of the mining analyses without republishing the analysed content) and the needs of academic research (reproducibility/transparency demonstrated by reproducing some annotated/analysed content AND results).

I'm sure there are other reasons too, but AFAIK CC-BY is 'best' for mining (well, CC0 would be better, but that's not realistic for OA). As you well know, other licences like CC-BY-NC leave one uncomfortably open to legal action if one posts such material on, say, an ad-supported blog. I do not believe Open Access should prevent the sharing of materials on blogs and other popular places/uses, and thus CC-BY is the 'safest' licence from the re-user POV.

Digital content placed publicly on the internet needs *a* licence, and for OA research works, CC-BY looks like the best of those available to me. You are free to suggest an alternative licence, and I think it would help your argument if you actually did, rather than just criticizing one option and seemingly providing no alternative.

> 2. CC-BY is not sufficient for data and text-mining. The Creative Commons
> licenses are designed as a means for creators to waive rights that they
> would otherwise have under copyright; they do not place any obligations on
> the Licensor. There is nothing to stop a creator from using a CC-BY license
> with a locked-down PDF with extra DRM designed to prevent data and
> text-mining.
>
> I also see the problem described here.

But licencing and CC-BY have nothing to do with this problem! The problem described here, in my words, is obfuscation. This kind of thing is commonly encountered when publishers publish tables of data as non-machine-interpretable *images* in academic works rather than as copy-pasteable numbers or data, as they should. It doesn't matter what the licence is, CC-BY or even All Rights Reserved(!) - it's very difficult to mine usable, correct information out of such tables/content. As a further example, they could provide all the text as a 'screenshot'-style image to further hamper mining efforts.

Thus I'm afraid point 2 bears no relevance to Open Access & CC-BY.

Ross

--
-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
Ross Mounce
PhD Student & Open Knowledge Foundation Panton Fellow
Fossils, Phylogeny and Macroevolution Research Group
University of Bath, 4 South Building, Lab 1.07
http://about.me/rossmounce
-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal