Ian Davis wrote:
On Tue, Jun 23, 2009 at 11:11 PM, Kingsley Idehen
<kide...@openlinksw.com> wrote:
Ian Davis wrote:
Hi all,
On Tue, Jun 23, 2009 at 9:36 PM, Kingsley Idehen
<kide...@openlinksw.com> wrote:
All,
As you may have noticed, AWS still haven't made the LOD cloud data
sets -- that I submitted eons ago -- public. Basically, the
hold-up comes down to discomfort with the lack of license clarity
re. some of the data sets.
Action items for all data set publishers:
1. Integrate your data set licensing into your data set
(for LOD I would expect CC-BY-SA to be the norm)
Please do not use CC-BY-SA for LOD - it is not an appropriate
licence and it is making the problem worse. That licence uses
copyright which does not hold for factual information.
Please use an Open Data Commons license or CC-0
http://www.opendatacommons.org/licenses/
http://wiki.creativecommons.org/CC0
If your dataset contains copyrighted material too (e.g.
reviews) and you hold the rights over that content then you
should also apply a standard copyright licence. So for
completeness you need a licence for your data and one for your
content. If you use CC-0 you can apply it to both at the same
time. Obviously if you aren't the rightsholder (e.g. it is
scraped data/content from someone else) then you can't just
slap any licence you like on it - you have to abide by the
original rightsholder's wishes.
Personally I would try and select a public domain waiver or
dedication, not one that requires attribution. The reason can
be seen at
http://en.wikipedia.org/wiki/BSD_license#UC_Berkeley_advertising_clause
where stacking of attributions becomes a huge burden. Having
datasets require attribution will negate one of the linked
data web's greatest strengths: the simplicity of remixing and
reusing data.
Ian,
Using licensing to ensure the data providers' URIs are always
preserved delivers low cost and implicit attribution. This is what
I believe CC-BY-SA delivers. There is nothing wrong with granular
attribution if compliance is low cost. Personally, I think we are
on the verge of an "Attribution Economy", and said economy will
encourage contributions from a plethora of high quality data
providers (esp. from the traditional media realm).
I don't think usage of a URI is enough for attribution because a URI
is not information bearing.
Of course I could dereference it and perhaps obtain some triples that
use it, but that URI does not denote those triples or that document.
An HTTP URI (as used re. Linked Data meme) carries implicit attribution
prowess by implicitly binding the thing it identifies to its metadata
(very data bearing). This is what makes this URI type so potent when
dealing with data publishing and data access.
There will be dozens or hundreds of other documents that use the same
URI and the owners of those datasets would like attribution for their
work. For example, I can make some unique assertions about you that
no-one else has and I would like those attributed to me - using your
URI would not provide that attribution.
But your URIs convey your point of view. The important thing here is
that there is a route back to your data space; the place from which your
point of view originates.
If the pathways to the origins of data are obscured we are recreating
yesterday's economy (imho), one in which original creators of work are
easily dislocated by middlemen. An economy in which incentives for data
publishing are minimal for those who have invested time and money in
quality data curation and maintenance.
Anyway, each data set provider should pick the license that works
for them :-)
Yes I agree. The above paragraph was my personal preference, but I'd
like to convince others to think like me :)
Ditto :-)
Ian
--
Regards,
Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO
OpenLink Software Web: http://www.openlinksw.com