Hi all,

Some points re this discussion:

Helen wrote:
> 1.    CC-BY is not necessary for data and text-mining. Internet search 
> engines such as google and social media companies do extensive data and text 
> mining, and they do not limit themselves to CC-BY material. This is true even 
> in the EU, so is not prevented by the EU's support for copyright of data. To 
> illustrate: if data and text-mining is not permissible without CC-BY, then 
> Google must shut down, immediately.

This point is a bit weird. Firstly, just because Google is doing 
something and getting away with it, doesn't mean a lone academic can be 
confident of doing something similar and getting away with it. I was 
always amazed by how brazenly YouTube set up its service *before* making 
agreements with the major media companies, when I would have assumed 
they would have been sued out of existence.

Secondly, some sort of licensing IS generally necessary for data and 
text mining. Just because it's not CC doesn't mean it's not a licence. 
For example Google Books reuses content, on the basis of explicit 
agreements which were apparently made with deposit libraries and 
publishers (I don't know the detail of that one). Facebook uses explicit 
licensing that its users sign up to. Twitter does the same, and third 
parties who mine more than a tiny amount of Twitter data have to agree to 
specific terms. And so on.

Some sort of enabling licence is clearly necessary, and of course for 
data-mining we wish for a licence that "pre-approves" our actions so 
that we don't have to conduct a million negotiations before we analyse 
an aggregated dataset.


Ross wrote:
> WRT to your point 2 "CC-BY is not sufficient for data and text-mining" (nor
> is *any* applicable licence AFAIK - I know of no licence that asserts that
> digital material must be made available in a readily machine-interpretable
> form in the licence)

Actually the GPL is a very good example. It is for software, and the GPL 
authors don't recommend it be used for texts, but it offers a 
delightfully clear requirement that "the preferred form of the work for 
making modifications" is made available. In the world of software, this 
is the source code, but if applied to data it's clear that it would 
militate against providing data tables as images.

When I first heard of CC licences I was surprised that they didn't use 
some form of words like this. They don't seem to "care" whether 
downstream users get the perfect original or a low-quality JPEG. Since 
then, I've come to think that this relatively slack aspect of CC 
licences is very good for cultural works and so forth.

But for the purposes of academic data reuse, perhaps this is the more 
pertinent part of Helen's criticism.

The Open Database Licence also appears to assert "that digital material 
must be made available in a readily machine-interpretable form" 
<http://opendatacommons.org/licenses/odbl/summary/> though I'm less 
familiar with that (see the "Keep open" part of the summary).

Best
Dan


P.S. One very minor additional point - Ross wrote:
> practically the SA clause means that other content that doesn't
> have that *exact* licence  (CC-BY-NC-SA) cannot be remixed with content
> under this licence

Be careful: the way you phrased it is not quite true. You can combine 
CC-BY or CC-BY-NC content into a CC-BY-NC-SA work, for example. The 
resulting work must be CC-BY-NC-SA in that case.


-- 
Dan Stowell
Postdoctoral Research Assistant
Centre for Digital Music
Queen Mary, University of London
Mile End Road, London E1 4NS
http://www.elec.qmul.ac.uk/digitalmusic/people/dans.htm
http://www.mcld.co.uk/
_______________________________________________
GOAL mailing list
GOAL@eprints.org
http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
