On 11 July 2014 03:52, Alex Barth <a...@mapbox.com> wrote:
> I just updated the Wiki with a proposed community guideline on geocoding.
>
> Please review:
> https://wiki.openstreetmap.org/wiki/Open_Data_License/Geocoding_-_Guideline

The whole point of the share-alike aspect of our licence is to stop people taking OSM, using it to 'improve' their own proprietary geo-data, and then not sharing the results back with the community. Whether you are for or against share-alike, that's what the community has decided to adopt, and that is what is required by the licences under which various source data has been used. So I think share-alike is here to stay.

The loophole for produced works is to allow people to take proprietary style instructions / algorithms and run the geo-data through them to create something artistic, which they then don't have to share. But the ODbL ensures that the underlying geo-data (and any additions / modifications made to it) does have to be shared. So the way I see it, if there's any (substantial) addition of external geo-data along the way, then that addition creates a derivative database, before the produced work is created. So if you want to publicly use this database (or any produced work based on it) then either the derivative database must be shared-alike, or the algorithm used to produce it and any additional input data must be shared.

In the case of any substantial amount of geocoding, you are clearly having to add additional geographic data to the OSM data in order to do the geocoding. I would therefore argue that the result must be seen as a derivative database, and not as a produced work. (In fact I'd go slightly further, and say that in order to do the geocoding, you have to create a derivative database comprising the relevant data from OSM and the relevant address data you want to match against. You then run a query on that derivative database to produce your geocoded results.)

And I think treating substantial amounts of geocoded results as a derivative database is certainly within the spirit of the licence, and something we would want share-alike to apply to.
If people are enriching their own geographic data using OSM, we would like to be able to use their data to help improve OSM. I don't see why the specific case of geocoding should be any different to other uses of OSM data with data that companies would like to keep private.

In any case, even without these arguments, I think it would be impossible to argue that a "substantial" database of geocoded data that's been generated using OSM data is anything but a derivative database of OSM. So I don't think there's any getting around share-alike if your geocoded results are "substantial".

So for those wanting to geocode proprietary datasets using OSM, I think there are three main options:

1/ Make sure your geocoding only amounts to insubstantial use of OSM. Then share-alike never kicks in, and it's irrelevant whether the results are produced works or derivative databases.

2/ Make sure your geocoded database (and any produced works or derivatives thereof) is kept internal to the company. Hence it is never "publicly used", and share-alike does not apply.

3/ Release the smallest possible derivative database under the ODbL. As far as I can see this would need to include whatever input data is necessary for you to do the geocoding, as you need to include the address data and the OSM data in the same derivative database in order to run your query on them to do the geocoding.

As an example for 3, if you have a database of company offices with addresses and other meta-data that you want to obtain lat/lons for, you'd need to release the address data and the lat/lons you've obtained. You needn't release the other meta-data, since that could be kept independently as part of a collective database, with the two linked by some unique ID field.

As an aside, I've yet to actually read a use case where this interpretation would be particularly problematic for a third party (at least no more so than any other proprietary data vs share-alike use case).
The only thing I've seen where it might cause difficulties would be where individual user privacy is at risk. But I would have thought that either the users can keep their locations private (so not publicly used) or the locations can be linked to the private meta-data via an opaque key, with the meta-data kept private in another part of a collective database. A company could always additionally geocode fictitious points to hide individual users, to further increase privacy if they wanted.

Given what I've written above, my view on https://wiki.openstreetmap.org/wiki/Open_Data_License/Geocoding_-_Guideline is that those proposing it need to go back to the drawing board. At the very least, it needs to start off with proper definitions / explanations of "geocoding", and each example needs to say whether or not it involves "substantial" use of OSM. It also needs to acknowledge that databases of substantial geocoded data will be derivative databases under ODbL, regardless of whether an individual result may be a produced work. Finally, the examples also need to be much more detailed, to explain exactly what input data is used in what way, and what the form of the outputs is.

Hope that helps,

Robert.

-- 
Robert Whittaker

_______________________________________________
legal-talk mailing list
legal-talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/legal-talk