Alan Ruttenberg wrote:
On Tue, Jun 23, 2009 at 4:36 PM, Kingsley Idehen <kide...@openlinksw.com> wrote:
All,
As you may have noticed, AWS still haven't made the LOD cloud data sets --
that I submitted eons ago -- public. Basically, the hold-up comes down to
discomfort with the lack of license clarity re. some of the data sets.
Action items for all data set publishers:
1. Integrate your data set licensing into your data set (for LOD I would expect
CC-BY-SA to be the norm)
First off, I am not a lawyer, and neither I nor Science Commons gives
legal advice. I can pass along the results of our research and policy
work in this space, and connect you with others at Science Commons if
need be.
Data is tricky, since it's not always clear whether copyright licenses
can be applied. Copyright law at its core applies when there is
"creative expression" and does not protect facts, which most data
arguably is. It's very difficult to discern where copyright protection
ends and when the data is naturally in the public domain, and so we do
not advocate applying a copyright license to data (CC-BY-SA being an
example of such).
Here are some links if you are interested in understanding more about
the problem.
http://sciencecommons.org/resources/faq/database-protocol/
http://sciencecommons.org/projects/publishing/open-access-data-protocol/
http://www.slideshare.net/kaythaney/sharing-scientific-data-legal-normative-and-social-issues
http://sciencecommons.org/wp-content/uploads/freedom-to-research.pdf
A further issue is that any *license* applied to data constrains the
ability to integrate it on a large scale because any requirement on
the licensee gets magnified as more and more data sources become
available, each with a separate requirement. Instead it is suggested
that providers effectively commit the data to the public domain. In
order to do that, Science Commons defined a protocol for implementing
open access data. It is intended that various licenses and public
domain dedications might follow this protocol, and there are two thus
far that we have certified as truly open.
The Public Domain Dedication and License
http://www.opendatacommons.org/licenses/pddl/
and
CC Zero
http://creativecommons.org/publicdomain/zero/1.0/
We recommend that you use one of these approaches when releasing your
data, to ensure maximum freedom to integrate.
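As a rough sketch of how such a dedication could be stated inside the
data set itself (action item 1 above), the snippet below emits a single
N-Triples statement that links a data set to CC Zero. The data set URI
is hypothetical, and the use of the Dublin Core Terms "license" property
is an assumption made for illustration; neither message prescribes a
particular vocabulary.

```python
# Sketch: state a data set's license inside the data as one N-Triples line.
# Assumption: the Dublin Core Terms "license" property is used as the predicate.
DCTERMS_LICENSE = "http://purl.org/dc/terms/license"

# The two dedications named in the message above.
CC_ZERO = "http://creativecommons.org/publicdomain/zero/1.0/"
PDDL = "http://www.opendatacommons.org/licenses/pddl/"


def license_triple(dataset_uri: str, license_uri: str) -> str:
    """Return the statement: <dataset> dcterms:license <license> ."""
    for uri in (dataset_uri, license_uri):
        # These characters may not appear unescaped inside an N-Triples IRI.
        if any(ch in uri for ch in '<>" {}|\\^`'):
            raise ValueError(f"not a plain URI: {uri!r}")
    return f"<{dataset_uri}> <{DCTERMS_LICENSE}> <{license_uri}> ."


# Hypothetical data set URI, for illustration only.
statement = license_triple("http://example.org/dataset/my-lod-set", CC_ZERO)
print(statement)
```

Appending a statement like this to an RDF dump would let a consumer
read the terms from the data rather than from a separate web page.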
Alan,
Which license simply allows me to assert that I want to be attributed by
data source URI? Example (using DBpedia even though it isn't currently
CC-BY-SA):
I have the URI: <http://dbpedia.org/resource/Linked_Data>. If you use
this URI as a data source in a Linked Data meshup or Web 2.0 mashup, I
would like any user agent to be able to discover
<http://dbpedia.org/resource/Linked_Data>. Thus, "Data provided by
DBpedia" isn't good enough because the path to the actual data source
isn't reflected in the literal and generic attribution.
The point above is the crux of the matter for traditional media
companies (today) and smaller curators of high quality data (in the near
future). Nobody wants to invest time in making high quality data spaces
that are easily usurped by crawling and reconstitution via completely
different URIs that dislocate the originals; or even worse, produce
pretty presentations that completely obscure paths to the original data
provider (what you see in a lot of Ajax and RIA style apps today).
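To make the request concrete, here is a minimal sketch of what
attribution by data source URI might look like inside a mashup's own
data: each derived resource carries statements that point back at the
exact URIs it drew on, so a user agent can follow them. The mashup URI
is hypothetical, and the Dublin Core Terms "source" property is an
assumption chosen for illustration. This is not a license term, only
one possible way to carry the link.

```python
# Sketch: keep the path to the original data discoverable by recording
# each source URI as a statement about the derived resource.
# Assumption: the Dublin Core Terms "source" property is used as the predicate.
DCTERMS_SOURCE = "http://purl.org/dc/terms/source"


def source_triples(derived_uri: str, source_uris: list[str]) -> list[str]:
    """Return one N-Triples line per source: <derived> dcterms:source <source> ."""
    return [f"<{derived_uri}> <{DCTERMS_SOURCE}> <{src}> ." for src in source_uris]


# Hypothetical mashup item built from the DBpedia URI used in the example above.
lines = source_triples(
    "http://example.org/mashup/item/1",
    ["http://dbpedia.org/resource/Linked_Data"],
)
for line in lines:
    print(line)
```

Unlike the literal "Data provided by DBpedia", each statement names the
specific resource that was used, which is the distinction drawn above.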
Kingsley
With regards,
Alan Ruttenberg
http://sciencecommons.org/about/whoweare/ruttenberg/
2. Indicate license terms in the appropriate column at:
http://esw.w3.org/topic/DataSetRDFDumps
If licenses aren't clear I will have to exclude offending data sets from the
AWS publication effort.
--
Regards,
Kingsley Idehen Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO OpenLink Software Web: http://www.openlinksw.com