Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?

2009-12-10 Thread Kai Krueger
Hi,
...

 I guess numbers 3) and 4) are problematic. I know they do much more
 than the current feature set of OSMdoc but I implemented them because
 those were two fairly often requested features. I guess their priority
 just dropped a lot :)

 So I guess my current solution would be to offer access to the
 aggregated statistical data and then offer two views into the other
 data. One for ODbL data and one for CC by-SA data. How would those two
 datasets have to be separated?
 Or to phrase it in another way: When can I combine ODbL data with CC
 by-SA data, how do I do that and what do I have to do to comply with
 (both) licenses?

Independent of the fact that this remains an interesting discussion 
about the effects of both the licenses, I am wondering if some of the 
practical issues here for OSMdoc come from a different understanding of 
what is going to happen with the changeover than what I thought was planned.

My understanding is the following (assuming everything goes ahead): At 
some point there will be a cut off when the actual switch happens. At 
that point there will be a planet dump with full history of everything 
in OSM licensed under CC-BY-SA. Afterwards, all the data that cannot be 
relicensed will be taken out of the main OSM database, after which 
another dump of the planet, including the full history of everything 
that remains, will be produced and licensed under the ODbL.

It is not that the history data will only be available under CC-BY-SA 
and that ODbL only covers current and future data.

So the entire history of all OSM data still available at that point will 
be licensed under the ODbL and can be used by OSMdoc. The only history 
missing from the ODbL dump is that of the data that has been removed from 
the ODbL version of OSM. If that amount of data was deemed small enough 
that losing it won't harm OSM as a project, and thus the relicensing can 
go ahead, I would have thought it is also small enough not to affect 
OSMdoc particularly and can be dropped from OSMdoc statistics. At that 
point you have no worries about mixing the two licenses anymore.

Not sure if that was in any way unclear to start with, but I thought it 
was worth pointing out, just in case.

Kai


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?

2009-12-09 Thread Richard Fairhurst

Lars Francke wrote:
 At the moment I'm displaying statistical data about a snapshot of
 the OSM data. If it'd stay that way it would be very easy for me to
 switch from one license to the other as the data wouldn't depend on
 data from the CC by-SA set. But I'm currently rewriting the tool to
 account for historical statistics. One example would be a feature
 that has been requested quite often: How many users have used a tag.
 This means I have to incorporate the history of all elements into my
 numbers. I wouldn't want to lose the data if we switch but the number
 is clearly derived from both databases (the ODbL database and the CC
 by-SA dump). This is only one example. The new version uses historic
 data all over the place and I've been working very hard these last
 few weeks/months to get this far and to get the data so I wouldn't
 want to throw everything pre ODbL away as it would alter the
 meaning of the statistics.

Hooee, this is probably the single trickiest question I've seen.

I can't find any precedents for whether aggregate statistics such as
OSMdoc's are considered derivative works. My gut feeling, and no more than
that, is that they probably aren't. You are not really deriving any
information that's in the OSM database and offering it up for reuse.

The only meaningful content that you're reproducing is the actual text of
the tags and values - yet even then, these are divorced from the objects to
which they apply.

If your statistics aren't a derivative work, they don't inherit the
share-alike provisions of either licence, so no conflict arises.

It isn't black and white, of course. If you extract a list of "Most popular
tags in the UK", ordered by popularity, I doubt it's derivative. If you
extract a list of "Most popular values for the name= tag applied to streets
in Charlbury", ordered by popularity, it would be. There's clearly a
spectrum.

I believe OSMF has received legal advice that community guidelines
(informal Terms of Use) can help to influence edge cases such as this. I
would therefore suggest that, as a community supported by OSMF, we add an
explicit clarification to the relevant pages on the wiki that we do not
consider aggregate statistics of this sort to form a derivative work.

cheers
Richard




Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?

2009-12-09 Thread Richard Weait
On Wed, Dec 9, 2009 at 6:47 AM, Richard Fairhurst rich...@systemed.net wrote:

 Lars Francke wrote:
 At the moment I'm displaying statistical data about a snapshot of
 the OSM data.

 Hooee, this is probably the single trickiest question I've seen.

 I can't find any precedents for whether aggregate statistics such as
 OSMdoc's are considered derivative works. My gut feeling, and no more than
 that, is that they probably aren't. You are not really deriving any
 information that's in the OSM database and offering it up for reuse.

Lars' use case is a produced work[*].  How do we make this clear to
the community and clearly permitted in the license?

Our draft community guidelines on Produced Work[1] say:

"If it was intended for the extraction of the original data, then it
is a database and not a Produced Work. Otherwise it is a Produced
Work."

Lars' Produced Work is a database, but his database is a list of keys
and values with use totals.  The Produced Work drops the geo location
data and the connectivity data from OSM.  This irreversibly prevents
recreating or extracting the original data.  So, by the community
guideline above, OSMdoc is a Produced Work.

Best regards,
Richard

[*] I've made some assumptions about how Lars makes OSMdoc.
Corrections welcome.
[1] 
http://wiki.openstreetmap.org/wiki/Open_Data_License/Produced_Work_-_Guideline



Re: [OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?

2009-12-09 Thread Lars Francke
On Wed, Dec 9, 2009 at 15:43, Richard Weait rich...@weait.com wrote:
 On Wed, Dec 9, 2009 at 6:47 AM, Richard Fairhurst rich...@systemed.net 
 wrote:

 Lars Francke wrote:
 At the moment I'm displaying statistical data about a snapshot of
 the OSM data.

 Hooee, this is probably the single trickiest question I've seen.

 I can't find any precedents for whether aggregate statistics such as
 OSMdoc's are considered derivative works. My gut feeling, and no more than
 that, is that they probably aren't. You are not really deriving any
 information that's in the OSM database and offering it up for reuse.

 Lars' use case is a produced work[*].  How do we make this clear to
 the community and clearly permitted in the license?

Thanks for the answer!
I think it'd be great if something like this could be added to the Use
cases page (or other documents).

If you need any more details on what I'm doing I'll gladly provide the
information. I could make the source code available but that'll
probably do more harm than good at this time.

 Lars' Produced Work is a database, but his database is a list of keys
 and values with use totals.  The Produced Work drops the geo location
 data and the connectivity data from OSM.  This irreversibly prevents
 recreating or extracting the original data.  So by the community
 guideline above OSMDoc is a Produced Work.

I don't drop all the geo data and I don't drop the connectivity data.
In fact I'm producing more of this data than there was in the original
database.

Four examples:
1) For every key and value I record which elements this key and value
are _currently_ used on.

2) Every key and value has a bounding box so one can see where the tag
has been used.

3) Way 1 in version 1 has four nodes. Two of those nodes are moved.
The original OSM data doesn't reflect this move in a new version of
the way. I record those minor version changes in nodes, ways and
relations and make them available on a page similar to OpenStreetMap's
current interface [1].

4) I have an experimental historical API that answers timestamp and
timerange queries: How did London look on 23.12.2006 at 12:33? Show me
all changes for Hamburg in the year 2007.
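
To make example 4) a little more concrete, here is a minimal sketch of a point-in-time lookup for a single element. It is not OSMdoc's actual code: the function name `version_at`, the tuple layout and the sample coordinates are all made up for illustration, and deleted elements are not handled.

```python
from bisect import bisect_right
from datetime import datetime


def version_at(versions, when):
    """Return the version of one element that was current at `when`.

    `versions` is a list of (timestamp, data) tuples sorted by timestamp
    (a hypothetical layout, not the OSM planet format). Returns None if
    the element did not exist yet at `when`.
    """
    timestamps = [ts for ts, _ in versions]
    # Index of the first version created strictly after `when`.
    idx = bisect_right(timestamps, when)
    return versions[idx - 1][1] if idx else None


# Made-up history of a single node.
node_history = [
    (datetime(2006, 5, 1, 9, 0), {"version": 1, "lat": 51.5000, "lon": -0.1200}),
    (datetime(2006, 12, 23, 12, 0), {"version": 2, "lat": 51.5007, "lon": -0.1246}),
    (datetime(2007, 3, 14, 8, 30), {"version": 3, "lat": 51.5010, "lon": -0.1250}),
]

current = version_at(node_history, datetime(2006, 12, 23, 12, 33))
too_early = version_at(node_history, datetime(2005, 1, 1))
```

With this sample history, `current` is version 2 and `too_early` is `None`. A query such as "all changes for Hamburg in 2007" would additionally need a spatial filter and a time-range scan, which this sketch leaves out.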

1) and 2) don't pose a problem, I hope. For 1) I can just drop all
references to pre-ODbL elements, as this information is useless anyway
(those referenced elements can't be retrieved via the main API), but 2)
is geo data aggregated from both data sets.

I guess numbers 3) and 4) are problematic. I know they do much more
than the current feature set of OSMdoc but I implemented them because
those were two fairly often requested features. I guess their priority
just dropped a lot :)

So I guess my current solution would be to offer access to the
aggregated statistical data and then offer two views into the other
data. One for ODbL data and one for CC by-SA data. How would those two
datasets have to be separated?
Or to phrase it in another way: When can I combine ODbL data with CC
by-SA data, how do I do that and what do I have to do to comply with
(both) licenses?

Cheers,
Lars

[1] http://www.openstreetmap.org/browse/way/45724946



[OSM-talk] Implications of using aggregated/statistical data from both licenses (ODbL and CC-by-SA) for OSMdoc?

2009-12-08 Thread Lars Francke
I've just listened to the podcast, I've read a lot of the mails on the
mailing lists in the last few days, I've read quite a few discussions
about it on IRC, the proposal document, parts of the license, the
human readable form of the license and a lot of the Wiki pages.

The more I read the less I know. So I'd like to take the LWG up on the
offer (the one I just heard in the Podcast) and ask what the
implications would be for me as the developer of OSMdoc if OSM were to
switch to ODbL (I'm assuming that at least parts of the OSM data would
have to be made unavailable from the ODbL dataset after the switch).

At the moment I'm displaying statistical data about a snapshot of the
OSM data. If it'd stay that way it would be very easy for me to switch
from one license to the other as the data wouldn't depend on data from
the CC by-SA set. But I'm currently rewriting the tool to account for
historical statistics. One example would be a feature that has been
requested quite often: How many users have used a tag? This means I
have to incorporate the history of all elements into my numbers. I
wouldn't want to lose the data if we switch but the number is clearly
derived from both databases (the ODbL database and the CC by-SA dump).
This is only one example. The new version uses historic data all over
the place and I've been working very hard these last few weeks/months
to get this far and to get the data so I wouldn't want to throw
everything pre ODbL away as it would alter the meaning of the
statistics.
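
As a rough illustration of why the number depends on both data sets, here is a minimal sketch of the "users per tag" count. The record layout and the `relicensed` flag are assumptions invented for this example (not OSMdoc's schema), and "used a tag" is taken to mean "uploaded any element version that carries the tag".

```python
from collections import defaultdict


def users_per_tag(history, keep=lambda version: True):
    """Count the distinct users behind each (key, value) tag.

    `history` is an iterable of dicts, one per element version, with the
    hypothetical fields "user_id" and "tags" (a dict of key -> value).
    `keep` is an optional filter deciding which versions are counted.
    """
    users = defaultdict(set)
    for version in history:
        if not keep(version):
            continue
        for tag in version["tags"].items():
            users[tag].add(version["user_id"])
    return {tag: len(ids) for tag, ids in users.items()}


# Made-up history; "relicensed" marks versions assumed to survive the switch.
history = [
    {"element": "way/1", "version": 1, "user_id": 10, "relicensed": True,
     "tags": {"highway": "residential"}},
    {"element": "way/1", "version": 2, "user_id": 11, "relicensed": False,
     "tags": {"highway": "residential", "name": "High Street"}},
    {"element": "node/5", "version": 1, "user_id": 12, "relicensed": True,
     "tags": {"highway": "residential"}},
]

all_counts = users_per_tag(history)
odbl_counts = users_per_tag(history, keep=lambda v: v["relicensed"])
```

On this sample, `highway=residential` has three users over the full history but only two once the non-relicensed version is filtered out, and `name=High Street` disappears entirely. That is the change in meaning I am worried about.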

What would the license change mean for me? What do I have to do to comply?

I don't even know which of these categories I belong to (taken from
the ODbL text):
“Collective Database” – Means this Database in unmodified form as part
of a collection of independent databases in themselves that together
are assembled into a collective whole. A work that constitutes a
Collective Database will not be considered a Derivative Database.
“Produced Work” – a work (such as an image, audiovisual material,
text, or sounds) resulting from using the whole or a Substantial part
of the Contents (via a search or other query) from this Database, a
Derivative Database, or this Database as part of a Collective
Database.
“Derivative Database” – Means a database based upon the Database, and
includes any translation, adaptation, arrangement, modification, or
any other alteration of the Database or of a Substantial part of the
Contents. This includes, but is not limited to, Extracting or
Re-utilising the whole or a Substantial part of the Contents in a new
Database.

1) Collective Database?
What does modify mean?
Again from the ODbL: “Database” – A collection of material (the
Contents) arranged in a systematic or methodical way and individually
accessible by electronic or other means offered under the terms of
this License.

I don't change any of the content of the database. I just parse the
provided XML and write the content into my own database (but I parse
the timestamp strings to longs, leave out usernames, etc. -
modification?).

2) Produced Work? Certainly. At least I think so, but... I don't
know. I provide a viewable version of the derivative database I
produced and in all probability there will be charts/graphs/etc. based
on this database.

3) Derivative Database? I think so.

As a short personal opinion about the license debate I'd like to add
that I've pretty much given up on understanding the license (and its
implications) despite the continued efforts by all those involved.
Please understand that this is not criticism about ODbL, CC by-SA, the
LWG or anyone else involved in this license change. I know that a lot
of people are working hard on this (on the "Yes" and on the "No"
side). But I have the feeling that the normal user can't really
understand or follow the details of the discussion anymore. This is
even more true for those of us who don't speak English as a native
language.

I know that mine is probably an uncommon case but I couldn't find
anything on the Use Cases site that deals with the combination of CC
by-SA and ODbL data that'd be applicable to my use case. So any help
or insights would be greatly appreciated.

Cheers,
Lars
