Hi Martin,

On the first pass, I am not checking individual Q-ID numbers, mostly because the existing tooling is very poor for that, and the rate of error is very low. JOSM simply looks up the ID and adds it. BUT, once it is added, I run a query (OT) for the tags, match them against the Wikidata query results, and check in a spreadsheet that the names and other tags match, which lets me quickly catch the very few non-matching or broken items. This method has shown much more value than simply letting people copy/paste IDs, as humans tend to make quite a few mistakes - I saw incorrect language codes for Wikipedia links (probably typed by hand), as well as stale or non-existent WP links. None of the approaches is perfect, but I hope mine will result in much higher quality and better completeness.
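As a rough illustration of the second-pass check described above, here is a minimal Python sketch. It assumes the OSM tags and the Wikidata results have already been downloaded into plain dictionaries. The function name `find_mismatches`, the data layout, and the sample objects are all hypothetical; the actual comparison was done in a spreadsheet, not with this code.

```python
def find_mismatches(osm_objects, wikidata_items):
    """Return (osm_id, reason) pairs for objects whose tags disagree with Wikidata.

    osm_objects:    list of {"id": str, "tags": {...}} (hypothetical layout)
    wikidata_items: {QID: {"labels": {lang: label}, "sitelinks": {lang: title}}}
    """
    problems = []
    for obj in osm_objects:
        tags = obj["tags"]
        item = wikidata_items.get(tags.get("wikidata"))
        if item is None:
            # Missing tag, or an ID that the Wikidata query did not return.
            problems.append((obj["id"], "wikidata ID not found"))
            continue

        # The wikipedia tag has the form "<lang>:<article title>".
        lang, sep, title = tags.get("wikipedia", "").partition(":")
        if sep:
            sitelink = item["sitelinks"].get(lang)
            if sitelink is None:
                problems.append((obj["id"], "no sitelink for language " + lang))
            elif sitelink.replace("_", " ") != title.replace("_", " "):
                problems.append((obj["id"], "wikipedia title differs from sitelink"))

        # The OSM name should match at least one Wikidata label.
        name = tags.get("name")
        labels = {label.casefold() for label in item["labels"].values()}
        if name and name.casefold() not in labels:
            problems.append((obj["id"], "name matches no Wikidata label"))
    return problems


# Illustrative sample data only; the OSM ids are made up.
osm_objects = [
    {"id": "relation/1",
     "tags": {"name": "Berlin", "wikipedia": "de:Berlin", "wikidata": "Q64"}},
    {"id": "relation/2",
     "tags": {"name": "Paris", "wikipedia": "fr:Paris", "wikidata": "Q64"}},
    {"id": "node/3", "tags": {"name": "Nowhere", "wikidata": "Q0"}},
]
wikidata_items = {
    "Q64": {"labels": {"en": "Berlin", "de": "Berlin"},
            "sitelinks": {"en": "Berlin", "de": "Berlin"}},
}

for osm_id, reason in find_mismatches(osm_objects, wikidata_items):
    print(osm_id, "-", reason)
```

The point of the sketch is only that a Q-ID gives you something structured to compare against, so the few broken items show up as a short list instead of having to be spotted by eye.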
You are correct that Wikidata sometimes interlinks unrelated articles (this usually gets fixed right away), or articles with a different scope (somewhat more common and more permanent). In rare cases, that would mean the Wikidata ID is wrong or not specific enough, but that is very easy to catch and correct on the second pass, when the actual data is compared. The more common case is the one I mentioned before - when the Wikipedia articles (in all of the linked languages) are about multiple concepts (e.g. administrative and ceremonial district together), but there exists another Wikidata ID, not linked to any articles, just for the admin district. That means the linked one is more for the ceremonial district, and should be fixed (usually by some additional searching).

So yes, blindly adding tags would work fine for >99%, and will not be good for the other <1% (guessing). Yet it would still be good to have that 1%, because the IDs allow much better further validation and correction, whereas a Wikipedia link is just a string of text that is much harder to work with when cross-verifying with other sources.

And BTW, Wikidata is far from perfect either - most of England frequently has an incorrect admin tree-structure, which should also be fixed - something that this work will also help with - win-win for everyone :)

On Fri, Nov 25, 2016 at 6:24 PM Martin Koppenhoefer <dieterdre...@gmail.com> wrote:

sent from a phone

> On 25 Nov 2016, at 22:55, Yuri Astrakhan <yuriastrak...@gmail.com> wrote:
>
> . I am simply converting existing Wikipedia tag into the Wikidata tags, because there is always a 1 to 1 matching between them,

are you checking individually and critically whether the osm objects fit the wikidata object definitions, or are you just adding wikidata tags for wikipedia articles that are already linked from osm? Afaik many wikidata objects are linked to several wikipedia articles (because of wp articles being written in different languages).
Using wikipedia quite a bit in 3 languages, I have found that inconsistencies aren't that rare ("wrong" articles interlinked). Partly this is because wp articles in different languages are mostly not translations but articles with varying coverage, level of detail and focus (i.e. a wikidata object that fits an English article does not necessarily fit the German article that is linked to the English one). Some linked articles are also simply wrong.

One example: in the field of geographic places and settlements, it can occur that socio-geographic places and political territorial entities are either mixed in the same article or split over different articles, and this might also differ between languages (some languages might have 1 article dealing with both, others might have 2 or more). Wikidata seems to have a preference for administrative entities and related statements (not sure, it is just a first impression) in all cases I have seen so far (even when there's a different object that also deals with the administrative entity).

Misguided wikipedia tags are not very frequent in osm, but they do occur of course. Blindly adding corresponding wikidata tags might make it look more consistent even if the tag is wrong, because both tags seem to confirm each other.

cheers,
Martin
_______________________________________________
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk