Hi all,

Some points re this discussion:

Helen wrote:
> 1. CC-BY is not necessary for data and text-mining. Internet search
> engines such as google and social media companies do extensive data and text
> mining, and they do not limit themselves to CC-BY material. This is true even
> in the EU, so is not prevented by the EU's support for copyright of data. To
> illustrate: if data and text-mining is not permissible without CC-BY, then
> Google must shut down, immediately.

This point is a bit weird.

Firstly, just because Google is doing something and getting away with it, it doesn't follow that a lone academic can be confident of doing something similar and getting away with it. I was always amazed by how brazenly YouTube set up its service *before* making agreements with the major media companies, when I would have assumed they would have been sued out of existence.

Secondly, some sort of licensing IS generally necessary for data and text mining. Just because it's not CC doesn't mean it's not a licence. For example, Google Books reuses content on the basis of explicit agreements which were apparently made with deposit libraries and publishers (I don't know the detail of that one). Facebook uses explicit licensing that its users sign up to. Twitter does the same, and third parties who mine Twitter more than a tiny amount have to agree to specific terms. Etc etc. Some sort of enabling licence is clearly necessary, and of course for data-mining we wish for a licence that "pre-approves" our actions, so that we don't have to conduct a million negotiations before we analyse an aggregated dataset.

Ross wrote:
> WRT to your point 2 "CC-BY is not sufficient for data and text-mining" (nor
> is *any* applicable licence AFAIK - I know of no licence that asserts that
> digital material must be made available in a readily machine-interpretable
> form in the licence)

Actually the GPL is a very good example.
It is for software, and the GPL authors don't recommend it be used for texts, but it offers a delightfully clear requirement that "the preferred form of the work for making modifications" is made available. In the world of software, this is the source code, but if applied to data it's clear that it would militate against providing data tables as images.

When I first heard of CC licences I was surprised that they didn't use some form of words like this. They don't seem to "care" whether downstream users get the perfect original or a low-quality JPEG. Since then, I've come to think that this relatively slack aspect of CC licences is very good for cultural works and so forth. But for the purposes of academic data reuse, perhaps this is the more pertinent part of Helen's criticism.

The Open Database Licence also appears to assert "that digital material must be made available in a readily machine-interpretable form" (see the "Keep open" part of the summary at <http://opendatacommons.org/licenses/odbl/summary/>), though I'm less familiar with that one.

Best
Dan

P.S. One very minor additional point - Ross wrote:
> practically the SA clause means that other content that doesn't
> have that *exact* licence (CC-BY-NC-SA) cannot be remixed with content under this licence

Be careful: the way you phrased it is not quite true. You can combine CC-BY or CC-BY-NC content into a CC-BY-NC-SA work, for example. The resulting work must be CC-BY-NC-SA in that case.

-- 
Dan Stowell
Postdoctoral Research Assistant
Centre for Digital Music
Queen Mary, University of London
Mile End Road, London E1 4NS
http://www.elec.qmul.ac.uk/digitalmusic/people/dans.htm
http://www.mcld.co.uk/
_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal