On 19/05/2017 10:11, Christoph Lingg wrote:
  TagInfo states around 1 Million wikipedia/wikidata tags which is a great 
start.


(not directly related to the question, but relevant to the accuracy of the data links)

I'd certainly take some of those added tags with a pinch of salt. A number of "place" objects near me have been linked to wikidata items by a well-meaning wikipedian, but unfortunately they don't actually match. What tends to happen is something like:

o OSM has a place object for a village and an admin entity

o An OSM user adds a wikipedia tag to the admin entity. The wikipedia entry describes itself as covering both the village and the admin entity, so that's OK.

o A wikipedian writes a bot that creates a wikidata item from the wikipedia article. The bot creates wikidata entries for villages, not admin entities. That's not entirely wrong, because the wikipedia article actually covers both.

o A different wikipedian spots that there is an OSM admin entity and a wikidata item with the same name in a similar location and links them via a wikidata tag. This results in the wrong OSM entity being linked to a wikidata item.

If you're consuming this data downstream you may want to add some processing that drops "dubious" links. Deciding what counts as "dubious" is difficult, but you may be able to look at the OSM account that added the wikidata link and exclude links added by a user who has added links worldwide (i.e. who clearly doesn't have local knowledge), by a user who has added links unfeasibly quickly (no manual checking possible), or by a user whose changesets have received a lot of discussion comments saying "that link you added is wrong". That last check is the most difficult because many changeset discussion comments are positive (e.g. "thanks for adding that wikidata link"), so simply counting comments isn't enough.
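
As a rough illustration of the first two checks (geographic spread and editing speed), something like the Python sketch below might do as a starting point. Everything in it is an assumption for illustration: the LinkEdit record layout, the function names and both thresholds are made up, and it expects you to have already extracted "who added which wikidata tag, where and when" from the history data yourself. It doesn't attempt the changeset discussion check at all.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class LinkEdit:
    """One edit that added a wikidata tag (hypothetical record layout)."""
    user: str
    lat: float
    lon: float
    timestamp: float  # seconds since the epoch


def dubious_users(edits, max_span_deg=20.0, min_median_gap_s=5.0):
    """Return the set of users whose link-adding pattern looks suspicious.

    Both thresholds are arbitrary guesses and would need tuning.
    """
    by_user = defaultdict(list)
    for edit in edits:
        by_user[edit.user].append(edit)

    flagged = set()
    for user, items in by_user.items():
        lats = [e.lat for e in items]
        lons = [e.lon for e in items]
        # Crude bounding-box test for "worldwide" editing; it ignores
        # wrap-around at the antimeridian.
        worldwide = (max(lats) - min(lats) > max_span_deg
                     or max(lons) - min(lons) > max_span_deg)

        # Median gap between consecutive link additions; a very small
        # value suggests there was no time for manual checking.
        times = sorted(e.timestamp for e in items)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        too_fast = bool(gaps) and median(gaps) < min_median_gap_s

        if worldwide or too_fast:
            flagged.add(user)
    return flagged


def keep_trusted(edits, flagged):
    """Drop every link that was added by a flagged user."""
    return [e for e in edits if e.user not in flagged]
```

You'd call dubious_users() over the whole set of link additions and then pass the result to keep_trusted(). Note that this throws away everything a flagged user added, including any links they got right, so it errs on the side of dropping data.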

Best Regards,

Andy


_______________________________________________
dev mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/dev
