[OSM-talk] Fixing wikipedia/wikidata tags

2017-02-06 Thread Yuri Astrakhan
TLDR: researching ways to validate wikipedia and wikidata tags; wrote a
script to cross-check OSM and Wikidata; found many incorrect disambiguation
references; would love to start a community discussion on best guidelines
going forward.


I have been analyzing the quality of OSM's wikipedia and wikidata tags by
cross-checking data using both OSM tags and Wikidata.  My first goal is to
fix "disambiguation" references - cases where an OSM object links to a
Wikipedia disambiguation page instead of the page for the actual location.
I have already fixed about 200 objects, but 800+ relations are left, and I
could really use some help.  I don't think it's possible to add them to
MapRoulette just yet.
https://www.mediawiki.org/wiki/User:Yurik/OSM_disambigs

While fixing wd/wp tagging issues, I have been putting together a list of
open questions on how we want to improve wikipedia and wikidata tags in
general, and create some guidelines. Let's discuss them on the talk page?
https://www.mediawiki.org/wiki/User:Yurik/Wikidata_OSM_questions

Lastly, if you have any suggestions on other ways to validate data using
the combination of Wikidata and OSM, let me know.  At the moment I have a
list of all the "instance of" types of OSM objects' Wikidata items, and I
mark the bad ones with a value. If an OSM object's Wikidata item is an
"instance of" one of the bad types, my script puts that object into a
separate list that I can analyze.
The list of types is here - sort by the second column:
https://commons.wikimedia.org/wiki/Data:Sandbox/Yurik/OSM_object_instanceofs.tab
Feel free to modify the second value of any row to indicate that those
objects should be fixed.
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Fixing wikipedia/wikidata tags

2017-02-07 Thread Yuri Astrakhan
Oleksiy, we should continue doing the on-the-ground scouting, but I am not
talking about that; I am talking about the tens of thousands of errors that
OSM already contains, and a way to find them. Having good cameras with GPS
does not help with this.

I already found 1000+ OSM objects with incorrect wikipedia tags. For
example, http://www.openstreetmap.org/relation/3715384 :
wikipedia = https://en.wikipedia.org/wiki/Oron -- a disambiguation page
that lists all the meanings of the word Oron
wikidata = https://www.wikidata.org/wiki/Q407935

Thanks to the wikidata tag, I can catch these errors. The Wikidata item
shows instance of = disambiguation page, meaning that the page is NOT about
the place in Nigeria.  I could of course simply delete the wrong tags, but
I would prefer to fix them one by one. Also, disambiguation pages are
just the tip of the iceberg. There are many other types of imprecise
information, such as lists.
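The "instance of = disambiguation page" test can be made against the claims format returned by the Wikidata API. A hedged sketch: the inlined JSON is a trimmed example mimicking the shape of a `wbgetclaims` response, reduced to only the fields this check reads.

```python
import json

# Does an item's P31 ("instance of") include the disambiguation-page
# class Q4167410?  SAMPLE mimics (in trimmed form) the structure returned
# by https://www.wikidata.org/w/api.php?action=wbgetclaims&entity=Q...

SAMPLE = json.loads("""
{"claims": {"P31": [
  {"mainsnak": {"snaktype": "value",
                "datavalue": {"value": {"id": "Q4167410"}}}}
]}}
""")

def is_disambiguation(claims_response):
    for claim in claims_response.get("claims", {}).get("P31", []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "novalue"/"somevalue" snaks
        if snak["datavalue"]["value"].get("id") == "Q4167410":
            return True
    return False

print(is_disambiguation(SAMPLE))  # True
```

An object flagged this way is exactly the Oron case: the wikipedia tag looks plausible as a string, but the linked item is a disambiguation page, not a place.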

Let's discuss our tagging approaches and guidelines as listed in
https://www.mediawiki.org/wiki/User:Yurik/Wikidata_OSM_questions

On Tue, Feb 7, 2017 at 3:24 AM Oleksiy Muzalyev <oleksiy.muzal...@bluewin.ch>
wrote:

> Good morning Yuri,
>
> On Saturday I added the Wikidata tag to the monument [1] of Mikhail
> Bakunin [2] in Bern. In fact, I had also added the monument itself to the
> map. I searched for it for quite some time at Bremgartenfriedhof, as there
> was a typing error in the English Wikipedia article concerning the box
> number (it has already been corrected).
>
> I also added some ground and aerial photos of the monument with GPS
> coordinates to the Wikimedia category, published the GPS trace to OSM,
> and filmed a short video in English:
>
> https://commons.wikimedia.org/wiki/File:Bakunin_Monument_Bern_EN.webm
> https://youtu.be/GCGdnFf8BDY
>
> and the same video in Russian:
>
> https://commons.wikimedia.org/wiki/File:Bakunin_Monument_Bern_RU.webm
> https://youtu.be/REjGTkJYKwU
>
> Quality on YouTube is better, as I have not yet figured out how to convert
> a video to the WEBM format without some quality loss.
>
> I mean that in addition to validation by scripts, legwork also has
> potential. In this respect, it would be helpful if we had a Wikipedia &
> Wikidata layer on the OSM map, with an option to see Wikidata items without
> an image and Wikipedia articles in different languages, so a human may see,
> analyze, and visit an object on the ground to clarify the situation. At
> this point, I would not dare to correct an OSM-Wikipedia inconsistency
> without first visiting, recording a GPS trace, and filming it. So in my
> opinion it should be on a map, in addition to a list.
>
> Some new hardware tools have become affordable by now: precise GPS/GLONASS
> trackers, video cameras with stabilized gimbals for ground and aerial
> filming, directional microphones. The photo cameras themselves have also
> become better. A human armed with these new tools can do a lot of useful
> work at a location, though it may take some time until we learn how to
> employ these tools effectively.
>
> [1] http://www.openstreetmap.org/node/4665613556#map=19/46.95039/7.42234
> [2] https://en.wikipedia.org/wiki/Mikhail_Bakunin
>
> With best regards,
> Oleksiy

[OSM-talk] RU Wikipedia now uses OSM by Wikidata ID

2017-01-20 Thread Yuri Astrakhan
Russian Wikipedia just replaced all of their map links in the upper right
corner (GeoHack) with the Kartographer extension!  Moreover, when
clicking the link, it also shows the location outline, if that object
exists in OpenStreetMap with a corresponding Wikidata ID (ways and
relations only, no nodes).  My deepest respect to my former Interactive
Team colleagues and the volunteers who made it possible!  (This was
community wishlist #21.)

Example - city of Salzburg (click coordinates in the upper right corner, or
in the infobox on the side):
https://ru.wikipedia.org/wiki/%D0%97%D0%B0%D0%BB%D1%8C%D1%86%D0%B1%D1%83%D1%80%D0%B3

P.S. I am still working on improving Wikidata linking, and will be very
happy to collaborate with anyone on improving OSM data quality.


Re: [OSM-talk] RU Wikipedia now uses OSM by Wikidata ID

2017-01-23 Thread Yuri Astrakhan
Oleksiy, Stefano,

WMF is still working on improving the map style, but I'm sure they will
benefit if you create a specific comment that describes what should be
added, at what level, and what types of articles it would benefit. Please
file requests using this form (you can log in using your Wikipedia user
account):
https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?tags=map-styles,maps

Please keep in mind that this is a "generic base" map - something useful
for many different types of Wikipedia articles. This map should be good for
overlays - so that article-specific information stands out. These
requirements are very different from the default OSM map, whose main goal
is to help OSM editors check what features are there or missing.

Lastly, Stefano, it would be great for WMF to add nearby articles too - this
exists in the Android Wikipedia app (possibly iPhone too), but not on
mobile or desktop web. This functionality has been discussed, but hasn't
been implemented yet.

On Mon, Jan 23, 2017 at 5:52 AM Oleksiy Muzalyev <
oleksiy.muzal...@bluewin.ch> wrote:

> Yes. It works on my end too. I got it now. I was clicking on the link "O"
> to display the OSM map with the Standard layer, but not on the
> coordinates in an article.
>
> The map with the outline opens; however, this map is very simplified. It
> does not render many objects useful to a traveler, such as museums,
> libraries, supermarkets, universities, hospitals, etc., which are shown
> on the OSM map, the MAPS.ME map, etc. What did we map these objects for? I
> do not see so far how I can practically use such a map.
>
> With best regards,
> Oleksiy
>
>
> On 23.01.2017 10:07, Max wrote:
> > Works for me (Firefox 53)
> >
> > On 2017년 01월 23일 07:31, Oleksiy Muzalyev wrote:
> >> Dear Yuri,
> >>
> >> Could you, please, provide an example with the location outline?
> >>
> >> I tried Salzburg, New York, Odessa, Moscow, etc. in the Russian
> >> Wikipedia, but I always got just a marker on the map, not an
> >> outline, i.e. a line indicating the outer contours or boundaries of an
> >> object or figure. Perhaps it is a thin line and I do not notice it on
> >> the map? Or I misunderstood something.
> >>
> >> With best regards,
> >> Oleksiy


Re: [OSM-talk] RU Wikipedia now uses OSM by Wikidata ID

2017-01-23 Thread Yuri Astrakhan
Dave, I'm not sure what you mean.

On Mon, Jan 23, 2017 at 12:52 PM Dave F <davefoxfa...@btinternet.com> wrote:

Could the attribution be put on one line so 'OpenStreetMap' is visible?

DaveF.









[OSM-talk] Semi-auto converting Wikipedia -> Wikidata tags

2016-11-25 Thread Yuri Astrakhan
Hi, I am exploring ways to make more educational maps in Wikipedia. For
example, this graph shows all US state governors. It works by querying
Wikidata for the governors' info and drawing state overlays using OSM
relations tagged with Wikidata IDs.

https://www.mediawiki.org/wiki/Help:Extension:Kartographer#GeoShapes_via_Wikidata_Query

This new technology should (hopefully) enhance location- and
politics-related articles. To work, it relies on Wikidata-tagged objects
in OSM, so the more objects are tagged, the more interesting maps the
community can create. While top-level areas (countries, states) are
already tagged, smaller areas tend to have just the wikipedia tag.
I have been adding the matching wikidata tag to many admin-level relations
using JOSM's "Fetch Wikidata ID" command (Wikipedia plugin).  This works
great most of the time, but on occasion it is not perfect. For example, in
England there are administrative and ceremonial (historical) parishes. Both
would be tagged with the same wikipedia tag because both concepts are
described in the same article, yet the matching Wikidata ID usually
covers just one aspect (usually the ceremonial one), not the
administrative one.  I plan to do the following:

* Going from admin_level 1..10+, for all locations that have a wikipedia
tag but no wikidata tag, add the matching Wikidata IDs using the Wikipedia
plugin's "Fetch Wikidata ID" command. At the moment, the Wikipedia plugin
does not automatically resolve Wikipedia page redirects (if a page was
renamed), so I often have to do it by hand.
* Once all areas are tagged, I would like to ensure that Wikidata and OSM
are in sync, by checking that wikidata tags actually point to admin
areas, and that the tree structures in OSM and in Wikidata match. E.g. this
query shows the tree structure in Wikidata. If anyone has any CC0 sources
for the admin structure of countries, please message me.

https://www.wikidata.org/w/index.php?title=User:Yurik/Admin_regions

To clarify - I am NOT adding wikidata IDs by some magical GPS coordinate
resolution or name matching.  I am simply converting existing wikipedia
tags into wikidata tags, because there is always a 1:1 match between
them, and adding a wikidata tag ensures that even if the WP article is
renamed or deleted, at least the wikidata tag stays valid.  Adding a WD tag
that describes the ceremonial parish rather than the admin district is
"incrementally beneficial", in the sense that it is still relevant - it
points to the right Wikipedia article, and it also makes it easier to
further improve the tag to point to the admin district via semi-automated
(spreadsheet/text checks) validation, or checking for duplicates.

Thanks!
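The wikipedia-to-wikidata conversion above relies on JOSM's plugin, but the same lookup can be done against the MediaWiki API: `prop=pageprops` exposes the linked item as `wikibase_item`, and `redirects=1` follows renamed pages (the case the plugin currently misses). A sketch under those assumptions — the helper names are mine, not the plugin's:

```python
import urllib.parse

# Resolve a "lang:Title" wikipedia tag to a Wikidata QID via the MediaWiki
# API.  prop=pageprops returns the linked Wikidata item; redirects=1
# transparently follows page renames.

def resolve_url(wikipedia_tag):
    """Build the API URL for a wikipedia tag such as 'de:Berlin'."""
    lang, title = wikipedia_tag.split(":", 1)
    params = urllib.parse.urlencode({
        "action": "query", "prop": "pageprops", "ppprop": "wikibase_item",
        "redirects": "1", "format": "json", "titles": title})
    return f"https://{lang}.wikipedia.org/w/api.php?{params}"

def extract_qid(api_response):
    """Pull the QID out of a decoded JSON response, or None if the page
    has no linked Wikidata item (the 'needs to be created by hand' case)."""
    for page in api_response.get("query", {}).get("pages", {}).values():
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid:
            return qid
    return None

# Trimmed sample of the response shape for de:Berlin.
sample = {"query": {"pages": {"736": {"pageprops": {"wikibase_item": "Q64"}}}}}
print(resolve_url("de:Berlin"))
print(extract_qid(sample))  # Q64
```

A `None` result corresponds to the wikipedia tags that "do not resolve to wikidata tags" mentioned in the cleanup thread below.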


Re: [OSM-talk] Semi-auto converting Wikipedia -> Wikidata tags

2016-11-25 Thread Yuri Astrakhan
Hi Martin,

On the first pass, I am not checking individual Q-ID numbers, mostly
because the existing tooling is very poor for that and the rate of error
is very low. JOSM simply looks up the ID and adds it. BUT, once it is
added, I run an Overpass Turbo query for the tags, match them with the
Wikidata query results, and check that the names and other tags match (in a
spreadsheet), allowing me to quickly catch the very few non-matching or
broken items.  This method has shown much more value than simply letting
people copy/paste IDs, as humans tend to make quite a few mistakes - I saw
incorrect language codes for Wikipedia links (probably typed by hand), or
simply stale or non-existent WP links.  None of the approaches are perfect,
but I hope mine will result in much higher quality and more complete
coverage.
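The second-pass cross-check can be sketched as a name comparison between OSM tags and Wikidata labels. The data shapes and field names here are my assumptions for illustration, not the actual spreadsheet workflow (Q64 is Berlin; Q90 is Paris, deliberately mismatched against "Lyon"):

```python
# Second pass: compare an OSM object's name tags against the labels of
# its linked Wikidata item; objects with no matching name in any shared
# language go to a review list.

def cross_check(osm_objects, wd_labels):
    """osm_objects: [{'id': ..., 'wikidata': QID, 'names': {lang: name}}]
    wd_labels: {QID: {lang: label}}.  Returns ids needing manual review."""
    suspicious = []
    for obj in osm_objects:
        labels = wd_labels.get(obj["wikidata"], {})
        shared = set(obj["names"]) & set(labels)
        # matched if any shared language has an identical name/label
        if not any(obj["names"][lang] == labels[lang] for lang in shared):
            suspicious.append(obj["id"])
    return suspicious

objs = [{"id": 1, "wikidata": "Q64", "names": {"de": "Berlin"}},
        {"id": 2, "wikidata": "Q90", "names": {"fr": "Lyon"}}]
labels = {"Q64": {"de": "Berlin"}, "Q90": {"fr": "Paris"}}
print(cross_check(objs, labels))  # [2]
```

Object 2 is flagged because its OSM name disagrees with every shared-language Wikidata label — the "very few non-matching or broken items" the pass is designed to surface.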

You are correct that sometimes Wikidata interlinks unrelated articles
(usually this gets fixed right away), or articles with different scope
(somewhat more common and permanent). In rare cases, that means the
Wikidata ID is wrong or not specific enough, but that is very easy to
catch and correct on the second pass, when the actual data is compared.
The more common case is the one I mentioned before - when the Wikipedia
articles (all of the linked languages) are about multiple concepts (e.g.
an administrative and a ceremonial district together), but there exists
another Wikidata ID, not linked to any articles, just for the admin
district. That means the linked one is more about the ceremonial district
and should be fixed (usually with some additional searching).

So yes, blindly adding tags would work fine for >99%, and would not be good
for the other <1% (guessing). Yet it would still be good to have that 1%,
because such tags allow much better further validation and correction,
whereas a Wikipedia link is just a string of text that is much harder to
work with when cross-verifying against other sources.  And by the way,
Wikidata is far from perfect either - much of England has an incorrect
admin tree structure there, which should also be fixed - something this
work will also help with. Win-win for everyone :)

On Fri, Nov 25, 2016 at 6:24 PM Martin Koppenhoefer <dieterdre...@gmail.com>
wrote:



sent from a phone

> On 25 Nov 2016, at 22:55, Yuri Astrakhan <yuriastrak...@gmail.com> wrote:
>
> .  I am simply converting existing Wikipedia tag into the Wikidata tags,
because there is always a 1 to 1 matching between them,


Are you checking individually and critically whether the OSM objects fit
the Wikidata object definitions, or are you just adding wikidata tags for
Wikipedia articles that are already linked from OSM?

AFAIK many Wikidata objects are linked to several Wikipedia articles
(because WP articles are written in different languages). Using
Wikipedia quite a bit in three languages, I have found that inconsistencies
aren't that rare ("wrong" articles interlinked). Partly this is because WP
articles in different languages are mostly not translations but articles
with varying coverage, levels of detail, and focus (i.e. a Wikidata object
that fits an English article does not necessarily fit the German article
linked to it). Some linked articles are also simply wrong.

One example: in the field of geographic places and settlements, it can
occur that socio-geographic places and political territorial entities are
either mixed in the same article or split over different articles, and this
may also differ between languages (some languages might have one article
dealing with both, others might have two or more). Wikidata seems to have a
preference for administrative entities (not sure, it is just a first
impression) and related statements in all cases I have seen so far (even
when there is a different object that also deals with the administrative
entity).

Misguided wikipedia tags are not very frequent in OSM, but they do occur of
course. Blindly adding corresponding wikidata tags might make things look
more consistent even when the tag is wrong, because the two tags seem to
confirm each other.

cheers,
Martin


[OSM-talk] Wikipedia/Wikidata admins cleanup

2017-01-03 Thread Yuri Astrakhan
I have been steadily cleaning up some (many) broken Wikipedia and Wikidata
tags, and would like to solicit some help :)

To my knowledge, there is no site where one could post a set of OSM IDs
that need attention (something like a bug tracker lite, where one could
come and randomly pick a few IDs to fix), so I made a few tables:

A list of wikipedia tags that do not resolve to wikidata tags. Most of the
time the WP tag is incorrect, sometimes the article was deleted, and very
rarely there is no matching Wikidata item (one needs to be created by hand).
* https://www.mediawiki.org/wiki/User:Yurik/OSM_NoWD

List of duplicate Wikidata tags:
* https://www.mediawiki.org/wiki/User:Yurik/OSM_duplicates2

Thanks!


Re: [OSM-talk] Wikipedia/Wikidata admins cleanup

2017-01-04 Thread Yuri Astrakhan
* I think MapRoulette is actually the tool we should use to fix these
issues. I am not yet sure how to build an Overpass Turbo query that gets
relations for the challenge, but this approach should automate the whole
process. Any ideas?

* Wikidata tags are already being auto-added. Adding a wikipedia tag in the
iD editor automatically inserts the matching wikidata tag - and I seriously
doubt there are many editors who verify the Wikidata ID, or even notice
that it got auto-added. Relying on the "trickling effect" to ensure quality
seems bad -- quality is much better ensured by automated rule-based
validations, which either autofix obvious cases or add them to MapRoulette
for fixing by humans.  Humans are a VERY expensive resource; let's use them
only when necessary.

* A Wikidata ID does not have to be 1:1 with OSM.  A Wikidata ID can be
thought of as a more permanent ID for a Wikipedia title, so adding it
simply locks the WP title in place, nothing more.  Yet Wikidata could, in
theory, be more precise. Instead of a Wikidata item that describes a "civil
and historical parish" (matching the WP article), there could be a more
precise "civil only" Wikidata item. Most of the time there is no such
additional item, and more work is needed to possibly create those in WD
and to decide how to link between them. In the meantime, having an ID
pointing to the "mixed" item is OK - it already provides a huge benefit for
analysis, cross-linking, and quality assessment. And the mixed item links
to the proper Wikipedia articles, which was the original goal.

* By far the biggest issue I discovered was the numerous "auto-injected"
Wikipedia titles. It seems people mindlessly copied an object's title into
the wikipedia tag, without even checking whether the article exists, or
whether it is about a place at all. Adding a wikidata tag is a huge
advantage, as we can now quickly evaluate the relevance (the item needs to
be an "instance of/subclass of" a location). This was never possible with
just a text title.  It would be amazing to clone/integrate the Wikidata
database into Overpass Turbo, allowing much more complex validations.

On Wed, Jan 4, 2017 at 11:12 AM Christoph Hormann 
wrote:

> On Wednesday 04 January 2017, Andy Mabbett wrote:
> > >
> > > Then you'd need to change the tag definition on the wiki to reflect
> > > that (and to explain what these circumstances are).
> >
> > You - and thus I - were talking about "Wikidata values", now you're
> > talking about a single tag. Which is it, please?
>
> I am sorry - OSM terminology can be somewhat ambiguous here.  With tag
> definition I mean the definition of the meaning of a certain tag, i.e.
> what it is supposed to mean when an OSM feature has the tag wikidata=x.
> This should be documented on the OSM wiki in a way that reflects actual
> mapping practice and is verifiable for mappers.
>
> With 'wikidata values' I was referring to the actual values that exist
> for the wikidata key in our database.
>
> > And where is the link I requested?
>
> I considered that a rhetorical question.  Taginfo gives you usage
> statistics for the values used:
>
> http://taginfo.openstreetmap.org/keys/?key=wikidata#values
>
> --
> Christoph Hormann
> http://www.imagico.de/
>


Re: [OSM-talk] Looking for "primary language" map

2017-04-10 Thread Yuri Astrakhan
I simply need to determine the most likely language of the "name" tag (not
the "name:xx" tag). Does not have to be 100% correct - even 80% is great.

On Mon, Apr 10, 2017 at 8:59 PM john whelan <jwhelan0...@gmail.com> wrote:

Orleans is part of Ottawa, and all street name signs are bilingual or in
the process of being replaced by bilingual ones.  Certainly the street I
live on in Orleans has a bilingual street name sign.  The English/French
question is very much political in Canada, and I suspect in much of the
world.

Montreal has quite a large English-speaking community, which is rare in
Quebec.

You could try looking at the street names to see if they are in English and
also have a second-language name, name:fr for example.

Cheerio John

On 10 April 2017 at 20:47, James <james2...@gmail.com> wrote:

Well, it might not be as simple as you say... take for instance Ottawa.
It's in Ontario and pretty English. There is a suburb called Orléans which
is pretty much "the French part of town", as most street signs there are in
French, but the rest of Ottawa is pretty English (in terms of street signs).

So generalizing won't help you much...

On Apr 10, 2017 8:27 PM, "Yuri Astrakhan" <yuriastrak...@gmail.com> wrote:

Exactly, and that's the map I need -- a set of shapes that define this
region mapping: Quebec + New Brunswick => fr, the rest of the USA/Canada =>
en, ...
The shapes may overlap, because that makes the GeoJSON smaller - I will
simply use the first matching one.

Having this map will allow me to determine the likely language of the
"name" tag for any location, which in turn makes for a better multilingual
map.
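The lookup described here — overlapping region shapes, first match wins — can be sketched with a standard ray-casting point-in-polygon test. The region data below is invented for illustration (two crude bounding boxes, not real linguistic boundaries):

```python
# First-match language lookup over (possibly overlapping) regions.
# Polygons are lists of (lon, lat) vertices; containment uses the
# standard even-odd ray-casting test.

def contains(polygon, lon, lat):
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # count crossings of a horizontal ray going left from the point
        if (y1 > lat) != (y2 > lat):
            if lon < (x2 - x1) * (lat - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

def guess_language(regions, lon, lat, default="en"):
    for lang, polygon in regions:       # order matters: first match wins
        if contains(polygon, lon, lat):
            return lang
    return default

# Invented boxes: roughly Quebec first, then all of North America.
regions = [("fr", [(-80, 45), (-57, 45), (-57, 63), (-80, 63)]),
           ("en", [(-170, 15), (-50, 15), (-50, 75), (-170, 75)])]
print(guess_language(regions, -71.2, 46.8))  # Quebec City area -> fr
print(guess_language(regions, -79.4, 43.7))  # Toronto area -> en
```

Because the broad "en" shape sits underneath the "fr" shape, no holes need to be cut out of it — which is why overlapping shapes keep the GeoJSON smaller.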

On Mon, Apr 10, 2017 at 8:20 PM James <james2...@gmail.com> wrote:

Well, many countries have multiple official languages. Canada is French and
English, but in practice that is mostly Quebec and New Brunswick... with
small patches of French throughout the rest.

On Apr 10, 2017 8:12 PM, "Yuri Astrakhan" <yuriastrak...@gmail.com> wrote:

James, thanks, but I was hoping for a language-regions shapefile, e.g. in
GeoJSON form.  The list of official languages would require a lot of
work to convert into merged shapes, and it is still not very good, as many
countries have several official languages, e.g. Switzerland.

On Mon, Apr 10, 2017 at 7:55 PM James <james2...@gmail.com> wrote:

Also have you checked:
https://en.wikipedia.org/wiki/List_of_official_languages_by_country_and_territory

On Apr 10, 2017 7:50 PM, "James" <james2...@gmail.com> wrote:

More like French for the entirety of the province of Quebec

On Apr 10, 2017 7:38 PM, "Yuri Astrakhan" <yuriastrak...@gmail.com> wrote:

Does anyone know of an open source language map - basically a set of
geoshapes with the corresponding language code?  Country boundaries are not
needed - e.g. Canada and USA would be English with the exception of French
for Montreal area.

This is needed to guesstimate what language the "name" tag is in.

Does not have to be very precise (10-20 MB is more than enough)





[OSM-talk] Looking for "primary language" map

2017-04-10 Thread Yuri Astrakhan
Does anyone know of an open-source language map - basically a set of
geoshapes with the corresponding language code?  Country boundaries are not
needed - e.g. Canada and the USA would be English, with the exception of
French for the Montreal area.

This is needed to guesstimate what language the "name" tag is in.

It does not have to be very precise (10-20 MB is more than enough).




Re: [OSM-talk] OSM Wikidata SPARQL service updated

2017-08-14 Thread Yuri Astrakhan
For relations, I simply add "members" in addition to "tags".  The members
are stored as three different predicates: "osmm:has", "osmm:has:_", and
"osmm:has:..." (where the ... represents any "valid" label - same rules as
for tag keys).  The object (value) of each statement is a link to another
OSM object. Copying from the wiki:
# this relation contains a way with blank label
  osmm:has osmway:2345 ;
# this relation contains a node with a non-ascii label
  osmm:has:_ osmnode:1234 ;
# this relation contains a relation labelled as "inner"
  osmm:has:inner osmrel:4567 ;

See
https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service#How_OSM_data_is_stored

At this point no "osmm:loc" is stored for relations. I wonder if it is
possible to calculate it dynamically using the geo functions built into
Wikidata's engine (Blazegraph + customizations). If not, I might need to do
some relation postprocessing, as well as automatic updating.
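For illustration, the member-to-predicate mapping described above could be
sketched in Python like this (the helper names and the exact "valid label"
character set are my assumptions, not the actual osm2rdf code):

```python
import re

# Prefixes for the three OSM object types, as used by the service.
PREFIX = {"n": "osmnode", "w": "osmway", "r": "osmrel"}

# Assumed "valid label" rule: same character set as tag keys
# (Latin letters, digits, "-", "_", ":").
VALID_LABEL = re.compile(r"^[A-Za-z0-9_:\-]+$")

def member_predicate(role):
    """Pick the predicate for one relation member, per the rules above."""
    if role == "":
        return "osmm:has"            # blank label
    if VALID_LABEL.match(role):
        return "osmm:has:" + role    # simple label, e.g. "inner"
    return "osmm:has:_"              # non-ASCII or otherwise invalid label

def relation_triples(rel_id, members):
    """members: iterable of (type, id, role); returns triple tuples."""
    subj = "osmrel:%d" % rel_id
    return [(subj, member_predicate(role), "%s:%d" % (PREFIX[t], mid))
            for t, mid, role in members]
```

So a relation containing a blank-role way and an "inner" relation would
produce the same shapes as the wiki example above.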

On Mon, Aug 14, 2017 at 1:39 PM, François Lacombe <fl.infosrese...@gmail.com
> wrote:

> Hi
>
> 2017-08-14 11:18 GMT+02:00 mmd <mmd@gmail.com>:
>
>> Hi,
>>
>> Am 13.08.2017 um 19:49 schrieb Yuri Astrakhan:
>>
>> > * all ways now store "osmm:loc" with centroid coordinates, making it
>> > possible to crudely filter ways by location
>>
>> out of curiosity, can you say a few words on how your overall approach
>> to calculate centroids for ways? As we all know it's an endless pain to
>> get that information out of minutely diffs :)
>>
>
> Out of curiosity, can you explain how relations are processed also ?
> Is this relevant to look for a representative point for a relation or a
> more complex search have to be done for specific members (roles) ?
>
> I don't know Shapely enough to say if it can handle this standalone and it
> would be greate to elaborate a bit what is done from line 260 to 280 of
> osm2rdf.py :)
>
>
> All the best
>
> François
>


[OSM-talk] OSM Wikidata SPARQL service updated

2017-08-13 Thread Yuri Astrakhan
The combined SPARQL database of OSM and Wikidata has been updated:
* There is a short video explaining the basics (at the top of [1])
* new Wikidata interface
* now all OSM "wikipedia" tags and sitelinks in Wikidata are stored the
same way, so it is possible to cross-check whether the "wikidata" and
"wikipedia" tags match up (see example [1])
* all ways now store "osmm:loc" with centroid coordinates, making it
possible to crudely filter ways by location
* all tag keys that contain non-Latin chars, spaces, etc. are now also
stored, without values, in "osmm:badkey"

P.S. All OSM data is up to date, refreshed every minute. The data from
Wikidata is about two days behind, still catching up.

[1]
https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service#Places_with_incorrect_Wikipedia.2Fwikidata_tags


Re: [OSM-talk] OSM Wikidata SPARQL service updated

2017-08-14 Thread Yuri Astrakhan
mmd, the centroids are calculated with the code below; let me know if there
is a better way - I wasn't aware of any issues with the minute data updates.
  wkb = wkbfab.create_linestring(obj)
  point = loads(wkb, hex=True).representative_point()
https://github.com/nyurik/osm2rdf/blob/master/osm2rdf.py#L250

Your query is correct, and you are right that (in theory) there shouldn't
be any ways without a center point. But there have been a number of ways
with only one point, causing a parsing error ("need at least two points for
linestring"). I will need to add some special handling for that
(suggestions?).
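One possible guard for such degenerate ways, as a pure-Python sketch (the
real code uses Shapely's representative_point; the vertex average here is
only a crude stand-in to show the special-casing):

```python
def way_centroid(coords):
    """coords: list of (lon, lat) pairs for the way's nodes."""
    if not coords:
        return None                  # no geometry available at all
    if len(coords) == 1:
        return coords[0]             # one-node way: use the node itself
    # crude vertex average instead of a real representative point
    n = len(coords)
    return (sum(lon for lon, _ in coords) / n,
            sum(lat for _, lat in coords) / n)
```

With this, a one-node way no longer raises and simply reuses its single
node's location.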

You can see the error by adding this line:
   OPTIONAL { ?osmId osmm:loc:error ?err . }
The whole query --  http://tinyurl.com/ydf4qd62  (you can create short urls
with a button on the left side)

On Mon, Aug 14, 2017 at 5:18 AM, mmd <mmd@gmail.com> wrote:

> Hi,
>
> Am 13.08.2017 um 19:49 schrieb Yuri Astrakhan:
>
> > * all ways now store "osmm:loc" with centroid coordinates, making it
> > possible to crudely filter ways by location
>
> out of curiosity, can you say a few words on how your overall approach
> to calculate centroids for ways? As we all know it's an endless pain to
> get that information out of minutely diffs :)
>
> I have to say that I'm pretty much unfamiliar with SPARQL and just tried
> the following query. My expectation was that I won't get any results,
> making me wonder if my query has some issue?
>
> SELECT * WHERE {
>   ?osmId osmm:type 'w' .
>   FILTER NOT EXISTS { ?osmId osmm:loc ?osmLoc }.
> } LIMIT 100
>
>
> BTW: A quick search on Github yielded the following:
> https://github.com/nyurik/osm2rdf. Would that be the right place to look
> for more details?
>
> Best,
> mmd


Re: [OSM-talk] OSM Wikidata SPARQL service updated

2017-08-20 Thread Yuri Astrakhan
Sarah, how would I set the node cache file for repserv.apply_diffs()?
The idx param is passed to apply_file() for the initial PBF dump
parsing, but I don't see any place to pass it for the subsequent diff
processing.  I assume there must be a way to run apply_diffs() so that it
downloads the minute diff file, updates the node cache file with the changed
nodes, and afterwards calls my way handler with the updated way geometries.

Also, I assume you meant dense_file_array, not dense_file_cache. So in my
case I would use one of these idx values when calculating way centroids, and
None otherwise:
sparse_mem_array
dense_mmap_array
sparse_file_array,my_cache_file
dense_file_array,my_cache_file
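The choice could be wrapped in a small helper, as a sketch (the helper name
is mine; the idx strings are taken from the list above and should be checked
against the pyosmium docs):

```python
def choose_index(need_geometry, persistent, cache_file="my_cache_file"):
    """Pick a node-location index string for osmium's idx parameter.

    Returns None when no location index is needed; the exact string
    values are assumptions based on the options listed above."""
    if not need_geometry:
        return None
    if persistent:
        # file-backed index survives between the initial import and diff runs
        return "dense_file_array," + cache_file
    # in-memory index, good enough for a one-shot import
    return "sparse_mem_array"
```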

Thanks!


On Mon, Aug 14, 2017 at 4:31 PM, Sarah Hoffmann <lon...@denofr.de> wrote:

> On Mon, Aug 14, 2017 at 11:10:39AM -0400, Yuri Astrakhan wrote:
> > mmd, the centroids are calculated with this code, let me know if there
> is a
> > better way, I wasn't aware of any issues with the minute data updates.
> >   wkb = wkbfab.create_linestring(obj)
> >   point = loads(wkb, hex=True).representative_point()
> > https://github.com/nyurik/osm2rdf/blob/master/osm2rdf.py#L250
>
> It doesn't look like you have any location cache included when
> processing updates, so that's unlikely to work.
>
> Minutely updates don't have the full node location information.
> If a way gets updated, you only get the new list of node ids.
> If the nodes have not changed themselves, they are not available
> with the update.
>
> If you need location information, you need to keep a persistent
> node cache in a file (idx=dense_file_cache,file.nodecache)
> and use that in your updates as well. It needs to be updated
> with the fresh node locations from the minutely change files
> and it is used to fill the coordinates for the ways.
>
> Once you have the node cache, you can get the geometries for
> updates ways. This is still only half the truth. If a node in
> a way is moved around, then this will naturally change the
> geometry of the way, but the minutely change file will have
> no indication that the way changed. Normally, these changes are
> relatively small and for some applications it is good enough
> to ignore them (Nominatim, the search engine, does so, for example).
> If you need to catch that case, then you also need to keep a
> persistent reverse index of which node is part of which way
> and for each changed node, update the ways it belongs to.
> There is currently no support for this in libosmium/pyosmium.
> So you would need to implement this yourself somehow.
>
> Kind regards
>
> Sarah


[OSM-talk] new Wikidata+OSM data in one RDF database

2017-05-12 Thread Yuri Astrakhan
TLDR: A SPARQL (RDF) database with both OSM and Wikidata data is up for
testing.  It allows massive cross-referenced queries between the two
datasets. The service is a test, and needs a permanent home to stay alive.

Overpass Turbo is awesome, but sadly it does not have data from Wikidata,
nor does it support some SQL-like conditions. I have setup a temporary RDF
database that has both OSM & Wikidata. You can use SPARQL queries to find:

* All OSM objects with a wikidata tag that references a Wikipedia
disambiguation page; gets the name of the page in the first available
language of ru, fr, de, en.  http://tinyurl.com/mzlfb26

* OSM relations with wikidata tag pointing to a person (also tries multiple
language fallbacks).  http://tinyurl.com/m6fh3wx

* OSM relations with duplicate Wikidata IDs http://tinyurl.com/mvhhogx


== OSM data structure ==
osmnode, osmway, osmrel - OSM object prefix, e.g.  osmnode:1234
osmt - tag, e.g.  osmt:name:en  (only has tags with latin chars, -, _, :,
digits
osmm - meta data about the object -- type, isClosed, version.

I try to preserve OSM data without many changes. Every tag value is
stored as a string, except for the wikidata and wikipedia tags, which are
converted to URLs in the same format as stored in Wikidata.

osmway:29453885
  osmt:name "Samina" ;
  osmt:waterway "river" ;
  osmt:wikidata wd:Q156065 ;
  osmt:wikipedia ;
  osmm:type "w" ;        # could be "r", "w", or "n"
  osmm:isClosed false ;  # this meta property is only for OSM ways
  osmm:version 24 .
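The key-filtering rule described above could be sketched like this (the
regex is my reading of "Latin letters, digits and symbols - : _", not the
actual import code):

```python
import re

# Assumed rule: only keys made of Latin letters, digits, "-", "_", ":"
# are emitted as osmt: predicates; everything else is skipped.
SIMPLE_KEY = re.compile(r"^[A-Za-z0-9\-_:]+$")

def tags_to_triples(osm_id, tags):
    """Emit (subject, predicate, value) triples, skipping complex keys."""
    return [(osm_id, "osmt:" + key, value)
            for key, value in tags.items()
            if SIMPLE_KEY.match(key)]
```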

Wikidata data structure is identical to https://query.wikidata.org (see
help)


== Current limitations ==
* Only includes OSM objects with either "wikidata" or "wikipedia" tags
* The OSM data only contains tags whose keys consist of Latin letters,
digits, and the symbols - : _
* OSM geometry info is not imported, e.g. no center point or bounding box,
except for osmm:isClosed (true/false) property for ways.
* Does not include OSM object inheritance data - e.g. cannot query for
"find a node that is part of a way which is part of a relation that has
wikidata tag that ..."
* Wikidata is updated every second, but OSM does not yet update at all; it
was imported from a full DB dump as of a few days ago.


Re: [OSM-talk] new Wikidata+OSM data in one RDF database

2017-05-25 Thread Yuri Astrakhan
The service is back up, this time with all the objects that have tags.
Also, I added the "has" properties on relations, indicating all objects
contained within the relation.  So now you can ask for a relation that
contains a way, where both the relation and the way have the same wikidata
ID (something you cannot get from Overpass):

http://tinyurl.com/k4vjkje

"has" can take one of three forms:
?osmObject1  osmm:has  ?osmObject2        # obj1 contains obj2, no label set
?osmObject1  osmm:has:inner  ?osmObject2  # label can also be outer, center_admin, etc.
?osmObject1  osmm:has:_  ?osmObject2      # label is not simple ASCII and should be fixed


On Mon, May 22, 2017 at 2:04 AM Janko Mihelić  wrote:

> Wow, I think this is a great milestone. Thanks!
>
> Now if only we can get a mixture of Wikidata's SPARQL and Overpass QL. A
> kind of a hybrid language between the two? Because Wikidata will probably
> never have the Overpass "in" or "around", which narrows the data down to a
> single country or county, or to a radius around something. I find that very
> useful.
>
> What if you connected to the Overpass API, ran the Overpass query, and
> then filtered the Wikidata data by the results of Overpass? Does that even
> make sense? For example: Overpass gives me all elements with a wikidata tag
> in a county, and then SPARQL can filter down the data to find all humans
> within that data. I think that's possible.
>
> Anyway, thanks for your service (although I think it's down right now).
>
> Janko
>


Re: [OSM-talk] new Wikidata+OSM data in one RDF database

2017-05-25 Thread Yuri Astrakhan
P.S. I am trying to get the OSM updater working, so that OSM data is always
up to date, but pyosmium is giving me some trouble. Please email me if you
know the answer to
https://stackoverflow.com/questions/44170360/callbacks-not-called-in-pyosmiums-diff-downloader



Re: [OSM-talk] new Wikidata+OSM data in one RDF database

2017-05-25 Thread Yuri Astrakhan
Thanks to a quick fix by Sarah, the OSM updater is now working and will
catch up shortly.  I am still looking for a permanent home for this
service, as I am pretty sure it would be highly useful, especially for tag
analysis and data validation.

mmd, thanks!! I was asking earlier and was told that there is no way to do
complex comparisons between tags in different objects. I really hope we
can teach SPARQL to do all the amazing geometry work that Overpass can
do, but I suspect there is a significant number of queries that are easier
to express in SPARQL.  If only we could join the two :)

On Thu, May 25, 2017 at 3:06 AM mmd <mmd@gmail.com> wrote:

> Hi,
>
> Am 25.05.2017 um 08:50 schrieb Yuri Astrakhan:
> > The service is back up, this time with all the objects that have tags.
> > Also, I added the "has" properties on a relation - indicating all
> > objects contained within the relation.  So now you can ask for a
> > relation, that contains a way, and both the relation and the way have
> > the same wikidata ID (something you cannot get from overpass):
> >
> > http://tinyurl.com/k4vjkje
> >
>
> Sure you can do this with overpass: http://overpass-turbo.eu/s/phj
>
> The example returns relation 416351 along with all ways having the same
> wikidata id. I commented out the part to check all relations in the
> current bounding box, but I guess you'll get the idea.
>
> best,


Re: [OSM-talk] Fixing 850+ disambiguation errors

2017-09-15 Thread Yuri Astrakhan
I just wrote a simple query to help find all Wikidata objects near a
given OSM object.  Also, it is far better to delete bad
Wikipedia/Wikidata tags than to keep incorrect ones.  Thanks for all
the help!!!

https://wiki.openstreetmap.org/wiki/SPARQL_examples#Find_all_wikidata_items_near_the_specific_osm_object

On Thu, Sep 14, 2017 at 8:23 AM, alan_gr  wrote:

> Thanks for the map, until I saw it I wasn't really sure how to help with
> this. I think I have solved the small number of disambig issues on points
> located in Spain and Ireland (only about 15 points altogether).
>
>
>


[OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-19 Thread Yuri Astrakhan
There is now a relatively small number of OSM nodes and relations
remaining that have wikipedia tags but no wikidata tags. The iD editor
already adds wikidata automatically to all new edits, so finishing up the
rest automatically seems like a good thing to do, as it will enable many
new quality-control queries. I would like to auto-add the corresponding
wikidata tag, based on the wikipedia tag, for all remaining objects, using
JOSM's "Fetch Wikidata IDs".

This way, we will be able to quickly find all the problematic objects
with the Wikidata+OSM service. For example, thanks to the community, we
have already fixed over 600 incorrect links to wiki disambiguation pages,
and this will find many more of them.  We will be able to fix things
tagged as people (e.g. wikidata -> person instead of subject:wikidata ->
person) and find location errors (e.g. wikidata and OSM point to very
different locations, implying an incorrect link).

Some statistics (I don't plan to add wikipedia tags to objects that have
only wikidata):
Query: http://tinyurl.com/yafxe4co

has both      n  323,745
has both      w  137,097
has both      r  248,588

no wikidata   n   68,330
no wikidata   w  131,796
no wikidata   r   11,602

no wikipedia  n   77,224
no wikipedia  w   47,402
no wikipedia  r   17,408
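A quick tally of the "no wikidata" rows above gives the size of the
proposed auto-edit:

```python
# Objects that currently have only a wikipedia tag (the "no wikidata" rows)
# are the ones that would receive an auto-added wikidata tag.
wikipedia_only = {"n": 68_330, "w": 131_796, "r": 11_602}
total = sum(wikipedia_only.values())
print(total)  # 211728 objects in total
```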


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-19 Thread Yuri Astrakhan
>
>
> > This way, we will be able to quickly find all the objects that are
> > problematic with the Wikidata+OSM service. For example, thanks to the
> > community, we already fixed over 600 incorrect links to wiki
> disambiguations
> > pages, and this will find many more of them.  We will be able to fix when
> > things are tagged as people (e.g. wikidata -> person, instead of
> > subject:wikidata -> person), find location errors (e.g. wikidata and OSM
> > point to very different locations, implying that its an incorrect link).
>
> The commonest error I have found is wikidata=Qnnn instead of
> brand:wikidata=Qnnn for franchises like McDonalds and petrol stations.
>
Andy, I agree - there are many like that, all around the globe.  I
know that in Israel, @SwiftFast uses a template to keep them in sync for
gas stations and ATMs, but we need a more generic solution.

A simple query can find all wikidata tags pointing to enterprises,
producing cases like these (just the first ones I found):
http://tinyurl.com/yby8564c
* nodes 192051528, 243017574 -- villages marked as ski resort, place=hamlet
or village
* node 285833428 - a ski resort with place=locality
* node 436622732 - a mountain AND a ski resort in WD, with natural=peak in
OSM
* relation 128277 - a commune and a ski resort, marked as admin boundary

An alternative would be to maintain a whitelist of all known brands,
either on a wiki or as an additional Wikidata "instance-of Q431289".


Re: [OSM-talk] Putting simple scripts in the Wiki without violating CC BY-SA 2.0

2017-09-18 Thread Yuri Astrakhan
Hi, I would still highly advise putting it into Git, because:
* it's easier for others to discover via code search, etc.
* it is far easier to propose changes, discuss them, and track who
submitted what
* it is easier to fork to try different things, and for others to see your
forks and possibly adapt them too

At the end of the day, a wiki is a front end to a simple version control
system, whereas Git is what most developers are used to nowadays.  I have
done a lot of "on-wiki" coding, and unless there are very good reasons to
keep it on the wiki, it is far better to store it in a repo.  Plus you
wouldn't have the licensing questions :)

As for the license - you could state at the top that "this code is MIT
licensed" to remove ambiguity, but IANAL.  My understanding is that by
default, all content is licensed as whatever it says at the bottom.

On Tue, Sep 19, 2017 at 1:14 AM, SwiftFast  wrote:

> I have a bot[1]. I'd like to publish its scripts. A versioning  system
> like GIT would be overkill, because the scripts are short and rarely
> changing.
>
> I'm not a lawyer, and I have some questions:
>
> 1. Suppose I don't state any license, would that implicitly the same
> license of the wiki itself?[2]
>
> 2. Can I explicitly state a license such as MIT/Apache/GPL? Would any
> of those licenses conflict with the license of the Wiki itself?
>
> Thanks!
>
> [1] https://wiki.openstreetmap.org/wiki/User:SwiftFast#SwiftFast_bot
> [2] https://wiki.openstreetmap.org/wiki/Wiki_content_license
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
>
> What will inevitably happen if you automatically add wikidata tags is
> that existing errors in either OSM (in form of incorrect wikipedia
> tags) or in wikidata (in form of incorrect connections to wikipedia
> articles) will get duplicated.
>

Christoph, a valid point. Yet the duplicates would allow finding many of
these errors, rather than leaving wikipedia-only objects to go stale due
to the changing nature of WP articles. As for the sameness argument - let's
work on those cases case by case. The vast majority of concepts are "good
enough": if a park is tagged with the wikidata ID for that park, and
someone extends it to add a few more trees, it's not a big problem. If an
edit combines two parks into one, eventually it will get fixed, with two
parks being created.  And no, we won't be able to solve every edge case,
but we will solve the vast majority of them. After all, a map is an
approximation of the real world, not a perfect replica.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
>
> Don't assume such cases are just a freak anomaly - they are not.  OSM
> and wikidata are two very different projects which developed in very
> different contexts.  Just another example: For most cities and larger
> towns (at least in Germany) there exists an admin_level 6/8 unit with
> the same name and most of these seem to have a single wikidata item
> while in OSM we have two separate concepts for the populated place
> (place=city/town) and the administrative unit (boundary relation with
> boundary=administrative + admin_level=6/8).
>
Christoph, the Wikidata community has a project:

https://www.wikidata.org/wiki/Wikidata:WikiProject_Municipalities_of_Germany

I think if you suggest it there, they will be happy to add it, allowing OSM
objects to be properly tagged. Or just contribute there :)


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
Also, there is a general country-subdivision project with plenty of
information and current status.  I'm pretty sure the OSM community has a
lot of good info to share:

https://www.wikidata.org/wiki/Wikidata:WikiProject_Country_subdivision



Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
>
> While the WMF does not claim any rights in wikidata contents, it does not
> make any representations (one way or the other) as to third party rights in
> the data. As an illustration: you could dump all of OSM in to wikidata and
> the WMF would not need to change or do anything.
>

But the same works in reverse, doesn't it?  The Wikidata project, just
like WP and OSM, is user-contributed. If a user uploads data that violates
the project's license, it should be deleted. And for that reason, both
Wikidata and OSM state the license under which the data is contributed and
shared. If I made an edit to OSM by copying data from Google, wouldn't that
be the same thing?


> (CC0), but the reverse depends on if the OSM contributor agreed to
> dedicate their edits to public domain.
>
> There is not really a practical and meaningful way in which an OSM
> contributor could do that, outside of facts that they have surveyed
> themselves and kept separate.
>

How I hate to diverge from the main topic, but alas... :)  This does sound
like a severe problem (that should be taken to a separate thread): if I,
as a user, set the public-domain checkbox, my assumption is that my
contributions are PD. If I trace something based on some image data, I
need to specify that source; otherwise I am in violation of the source's
license. If I did not specify a source, and I checked the PD box, it can
be assumed that I am donating under PD. If this is not the case, my
contributor's rights are being violated, because my intention - that other
people be able to use my work unrestricted - is not being honored.  If
anyone wants to comment, please start a new thread :)

>
> Without it, OSM data is licensed under ODbL, and cannot be copied. We
> should make it easier to detect what piece of OSM data is in PD.  I do like
> your USB analogy :) About names - you will be surprised to discover that MB
> and other places are actively pursuing Wikidata integration because WD
> tends to have a huge names list, possibly bigger than OSM itself?
>
>
> That is nice for MB, but problematic in more than one way for OSM.
>
Please elaborate - I know of at least one more company that is actively
doing that.  Sigh, another side topic :D

On Wed, Sep 20, 2017 at 1:58 PM, Simon Poole <si...@poole.ch> wrote:

> [turning on broken record mode :-)]
> On 20.09.2017 17:54, Yuri Astrakhan wrote:
>
>
>
> * Oleksiy, OSM can use any data from Wikidata because of the public domain
> dedication
>
> While the WMF does not claim any rights in wikidata contents, it does not
> make any representations (one way or the other) as to third party rights in
> the data. As an illustration: you could dump all of OSM in to wikidata and
> the WMF would not need to change or do anything.
>
> (CC0), but the reverse depends on if the OSM contributor agreed to
> dedicate their edits to public domain.
>
> There is not really a practical and meaningful way in which an OSM
> contributor could do that, outside of facts that they have surveyed
> themselves and kept separate.
>
> Without it, OSM data is licensed under ODbL, and cannot be copied. We
> should make it easier to detect what piece of OSM data is in PD.  I do like
> your USB analogy :) About names - you will be surprised to discover that MB
> and other places are actively pursuing Wikidata integration because WD
> tends to have a huge names list, possibly bigger than OSM itself?
>
>
> That is nice for MB, but problematic in more than one way for OSM.
>
> Simon
>
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
Such an awesome discussion, thanks!

* https://www.wikidata.org/wiki/Special:GoToLinkedPage can already be used
to open a Wikipedia page when you only have a Wikidata ID.  It even accepts
a list of wiki sites. For example, this link automatically opens the wiki
page for Q3669 in the first available language ("pt" in this case):

https://www.wikidata.org/wiki/Special:GoToLinkedPage?itemid=Q3669&site=enwiki,ptwiki

* Sarah, thanks for the heads-up about Nominatim using Wikipedia tags.  I
recently added page popularity (pageviews) to the OSM+Wikidata service.
Another metric is the number of Wikipedia articles in different languages
per topic (the sitelinks count).  Together, they can be used to calculate
relative weights.
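As an illustration only (the thread does not specify a formula), the two
signals could be combined into a single weight roughly like this:

```python
import math

def popularity_weight(pageviews, sitelinks):
    """Toy scoring sketch: log-damped pageviews plus sitelink count.

    The exact combination is my assumption; it only illustrates mixing
    a high-volume signal (pageviews) with a small-integer one (sitelinks).
    """
    return math.log1p(pageviews) + sitelinks
```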

* I am a bit radical, but not enough to propose we get rid of wikipedia
tags just yet.  They sometimes provide a good indication of the original
intent.  Once Wikidata is used in all the tooling we may revisit, but not
until then.  But yes, wikipedia tags are very unstable, especially when
articles get renamed because multiple places have identical names, thus
turning the link into a disambiguation one. So in general they often go
stale and become less useful without any indication.

* Oleksiy, OSM can use any data from Wikidata because of its public-domain
dedication (CC0), but the reverse depends on whether the OSM contributor
agreed to dedicate their edits to the public domain. Without that, OSM data
is licensed under the ODbL and cannot be copied. We should make it easier
to detect which pieces of OSM data are in the PD.  I do like your USB
analogy :) About names - you will be surprised to discover that MB and
other places are actively pursuing Wikidata integration, because WD tends
to have a huge list of names, possibly bigger than OSM's.

* Christoph, a very valid point in general. Do you have any statistics on
how often multiple meanings per OSM object are a problem? In my
experience this is very rare, but it is hard to say without numbers.  For
the case of an island being both a country and a land feature, I think it
would benefit OSM to actually have two objects with the same geometry -
e.g. two relations containing the same way(s).  One relation would treat
it as an admin boundary, with all the related tags; the other, as a land
feature. Data consumers would treat them separately. Conflating tags
related to both concepts into one object is not very good.  In more
general terms, you usually have three cases:
-- 1:1 (most common imo)
-- one OSM object being part of a larger page (e.g. a list of churches).
I don't think a wikidata/wikipedia tag is appropriate in this case, as
that page is not about this specific object, but about a class of similar
objects. We could use listed-on:wp, or partof:wp, or some other tag.
-- your case - multiple concepts for the same object. Use either a
semicolon-separated list of wikidata IDs, or (better) create multiple
relations to describe the multiple concepts.

* Frederik, that small personal attack is uncalled for. I exposed a lot
of existing bad data; I did not add it. And I created complex tooling to
help everyone resolve it as a community, rather than trying to tackle all
of it by myself.  A system for fixing problems is always better than one
person doing it by hand and later retiring because the challenge is too
great.  Also, a corresponding wikidata tag is not bad data - it is simply
a copy of the existing wikipedia tag, making it easier for tools and
humans to find and fix. As for your last email - fetching *corresponding*
wikidata items is not an error; it is a duplicate of existing
information. That information might be incomplete, but that's a separate
issue.

* Lester, I'm not sure I understood your Douglas Adams example; PM me and
let's try to figure it out. It might have to do with the ranking of each
statement.

See also:
Feature request for any lang fallback:
https://phabricator.wikimedia.org/T176321
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
Tobias, agree 100%, thanks.

On Wed, Sep 20, 2017 at 12:14 PM, Tobias Knerr  wrote:

> On 20.09.2017 17:02, Christoph Hormann wrote:
> > It is best to regard the wikidata and wikipedia tags in OSM as 'related
> > features' rather than identical objects.
>
> We shouldn't dilute the definition of the key because some incorrect
> links exist in the database. If there's no 1:1 relationship between OSM
> element and Wikidata item, and this cannot be fixed by editing Wikidata,
> then no wikidata tag should be added.
>
> > These provide useful sources
> > to research additional information (in particular wikipedia articles
> > often link to additional sources)
>
> Sure, but the wikidata key is for "the Wikidata item about the feature",
> not any related content that may be interesting.
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
One other thing: let's not build walls between different projects. I know
it's in human nature to do that, but let's not.  In Wikipedia, every
language is also a separate project, and there I also saw a lot of "this
is not how we do things around here".

Each project is run by people.  Most people contribute to more than one
project, so let's not say "they do X, but we do Y", because often "they"
and "us" are the same person. Obviously some rules differ, and we should
respect that, but (I hope) most of us dedicate our volunteer time because
we believe in the general principle: making knowledge available to
everyone, freely.  OSM concentrates on geographical knowledge, Wikidata
on classification, Wikipedia on descriptions.  All three, as a sum, can
be much greater than each one separately.  Let's keep that in mind when
we think about how we can coexist better, and how to reduce the overlap.
Keeping duplicates in sync is always harder than letting the tools do
their data-merging work when needed.

On Wed, Sep 20, 2017 at 12:18 PM, Yuri Astrakhan <yuriastrak...@gmail.com>
wrote:

> Tobias, agree 100%, thanks.
>
> On Wed, Sep 20, 2017 at 12:14 PM, Tobias Knerr <o...@tobias-knerr.de>
> wrote:
>
>> On 20.09.2017 17:02, Christoph Hormann wrote:
>> > It is best to regard the wikidata and wikipedia tags in OSM as 'related
>> > features' rather than identical objects.
>>
>> We shouldn't dilute the definition of the key because some incorrect
>> links exist in the database. If there's no 1:1 relationship between OSM
>> element and Wikidata item, and this cannot be fixed by editing Wikidata,
>> then no wikidata tag should be added.
>>
>> > These provide useful sources
>> > to research additional information (in particular wikipedia articles
>> > often link to additional sources)
>>
>> Sure, but the wikidata key is for "the Wikidata item about the feature",
>> not any related content that may be interesting.
>>
>> ___
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>>
>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-20 Thread Yuri Astrakhan
>
> people fixing WD won’t necessarily check if their fixes work well with
> OSM. Maybe we should include versions in our WD tags?
> I’ve seen OSM objects linked from WD, are there people monitoring changes
> to linked objects?
>
Yes, that's what the Wikidata+OSM service is for. It allows the community
to create queries that verify various aspects of OSM objects as related
to Wikidata objects. For example, if a Wikidata object changes its
"instance of" to a disambiguation page, the query would immediately flag
the corresponding OSM object as having a problematic wiki link.
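A minimal sketch of such a check, assuming the "instance of" (P31) values for each linked item have already been fetched from Wikidata (Q4167410 is Wikidata's class for disambiguation pages; the function and data shapes are illustrative, not the service's actual code):

```python
# Classes whose members should never be the target of an OSM wikidata tag.
BAD_CLASSES = {"Q4167410"}  # "Wikimedia disambiguation page"

def flag_bad_links(osm_objects, instance_of):
    """Return the OSM objects whose wikidata tag points at a flagged class.

    osm_objects: iterable of (osm_id, wikidata_qid) pairs.
    instance_of: dict mapping a QID to the set of its P31 class QIDs.
    """
    return [
        (osm_id, qid)
        for osm_id, qid in osm_objects
        if instance_of.get(qid, set()) & BAD_CLASSES
    ]

links = [("node/1", "Q111"), ("way/2", "Q222")]
p31 = {"Q111": {"Q4167410"}, "Q222": {"Q515"}}  # Q515 = city
print(flag_bad_links(links, p31))  # [('node/1', 'Q111')]
```

The real service expresses the same join as a SPARQL query over both datasets rather than Python dictionaries.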

>
> I think it’s better to add the WD links slowly, verifying on a one by one
> basis that the objects describe the same thing. And having this done for
> some time I can tell that quite often WD items are very basic and defined
> besides their name only by the content of their WP article links, which in
> different languages not always describe the same thing/s.
> If you look into the things there’s a lot to fix in both projects, adding
> WD tags automatically in one go might help less than people doing it
> carefully and fixing the problems on the way.
>
iD editor has been doing exactly that for a substantial time.  Whenever a
user adds a wikipedia tag, the corresponding wikidata tag is added
automatically. I seriously doubt there are many (if any) people who check
by hand that the wikidata ID is correct.  Yet the number of errors caught
by cross-linking the data is very significant.
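For illustration, here is roughly how a wikipedia tag could be resolved to its Wikidata ID; the endpoint and parameters reflect the Wikibase `wbgetentities` API as I understand it, and this is a sketch of the idea, not iD's actual implementation:

```python
from urllib.parse import urlencode

def wikidata_lookup_url(wikipedia_tag: str) -> str:
    """Build a wbgetentities URL that resolves a "lang:Title" wikipedia
    tag to its Wikidata item. The caller would fetch this URL and read
    the entity ID out of the JSON response."""
    lang, _, title = wikipedia_tag.partition(":")
    params = {
        "action": "wbgetentities",
        "sites": f"{lang}wiki",   # e.g. "en" -> "enwiki"
        "titles": title,
        "props": "info",
        "format": "json",
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)

url = wikidata_lookup_url("en:Berlin")
```

An editor doing this on tag entry gets the QID for free, which is exactly why so few users ever verify it by hand.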
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


[OSM-talk] Name challenge - what to call the new OSM+Wikidata service?

2017-09-16 Thread Yuri Astrakhan
The new service is getting more and more usage, but it lacks the most
important thing - a good name.  So far my two choices are:

* wikosm
* wikidosm

Suggestions?  Votes?  The service combines Wikidata and OpenStreetMap
databases, and uses SPARQL (query language) to search it, so might be good
to reflect that in the name.

https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service

P.S.  I know this is the hardest problem after off-by-one and caching...
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


[OSM-talk] JOSM now supports OSM+Wikidata service

2017-09-17 Thread Yuri Astrakhan
The "not yet fully named" service is now accessible directly from JOSM -
just like OT.  Simply install or update Wikipedia plugin, and it will show
up in the download data screen (expert mode).

Documentation:
https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service#Using_from_JOSM
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Name challenge - what to call the new OSM+Wikidata service?

2017-09-17 Thread Yuri Astrakhan
RichardF on IRC suggested "semaptic"
>  take the "semantic" word associated with wikidata and add moar maps
which seems to have gained a few support votes there.   Does it have any
negative meanings in any languages?

On Sun, Sep 17, 2017 at 6:15 PM, Jo <winfi...@gmail.com> wrote:

> SparklyMapData or SparklyDataMap
>
> 2017-09-17 23:46 GMT+02:00 Yuri Astrakhan <yuriastrak...@gmail.com>:
>
>> One thing we should consider is the domain name.  I doubt we can afford
>> woq.com :)
>>
>> These names were proposed
>> woq   2
>> wdoqs
>> woqs
>> q936
>>
>> And these proposed names have OSM in them, so likely are not good
>> according to Legal
>> wosm  2
>> wikosm
>> wdosm
>> wikidosm
>>
>> (P.S. Sorry for double - hit send too fast before fixing the first list)
>>
>> On Sun, Sep 17, 2017 at 5:45 PM, Yuri Astrakhan <yuriastrak...@gmail.com>
>> wrote:
>>
>>> One thing we should consider is the domain name.  I doubt we can afford
>>> woq.com :)
>>>
>>> These names were proposed
>>> woq   2
>>> wdoqs
>>> wdosm
>>> woqs
>>> q936
>>>
>>> And these proposed names have OSM in them, so likely are not good
>>> according to Legal
>>> wosm  2
>>> wikosm
>>> wikidosm
>>>
>>>
>>> On Sun, Sep 17, 2017 at 5:10 PM, Simon Poole <si...@poole.ch> wrote:
>>>
>>>> I believe there is a slight misunderstanding, while remixing
>>>> OpenStreetMap/OSM/etc in various ways may result in cutesy copycat domain
>>>> names they simply do not jibe well with reality.
>>>>
>>>> Not only does every single one of them weaken the standing of the marks
>>>> themselves and make it increasingly difficult to take action against
>>>> misuse, they are further uncontrollable liabilities for the whole
>>>> community. I gave the example of OpenWeatherMap, but there are others that
>>>> would be really painful if they ended up in the hands of your fav giant
>>>> tech corp.
>>>>
>>>> That said, I'm not sure why you believe the policy has broken
>>>> something, with the exception of a few local chapters, to my knowledge, the
>>>> OSMF has never granted a licence to anybody to use the marks in a domain
>>>> name. As outlined in the FAQ we will be operating a grandfathering scheme
>>>> to legalize such use after the fact so actually making such use legit for
>>>> the first time.
>>>>
>>>> And yes: WoQ would be a wonderful name for Yuris service and shows that
>>>> it is completely possible to break out of the old schema of simply copying
>>>> OSM.
>>>>
>>>> Simon
>>>>
>>>> Am 17.09.2017 um 15:57 schrieb Yves:
>>>>
>>>> So, no OpenSparqlMap, then? :(
>>>> Sad, this policy definitely broke something.
>>>> Yves
>>>>
>>>> On 17 September 2017 at 12:58:12 GMT+02:00, Blake Girardot
>>>> <bgirar...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> How does this relate to the new draft trademark policy?
>>>>>
>>>>> I can't tell from the draft policy, but I believe that OSM at least is
>>>>> a protected mark, not sure about osm.
>>>>>
>>>>> But I do think Simone Poole asked the community to stop naming things
>>>>> with osm trademarks in them or variations on openstreetmap phrase.
>>>>>
>>>>> Cheers
>>>>> blake
>>>>>
>>>>> On Sat, Sep 16, 2017 at 5:11 PM, Yuri Astrakhan <yuriastrak...@gmail.com> wrote:
>>>>>>
>>>>>>  The new service is getting more and more usage, but it lacks the most
>>>>>>  important thing - a good name.  So far my two choices are:
>>>>>>
>>>>>>  * wikosm
>>>>>>  * wikidosm
>>>>>>
>>>>>>  Suggestions?  Votes?  The service combines Wikidata and OpenStreetMap
>>>>>>  databases, and uses SPARQL (query language) to search it, so might be 
>>>>>> good
>>>>>>  to reflect that in the name.
>>>>>>
>>>>>>  https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service
>>>>>>
>>>>>>  P.S.  I know this is the hardest problem after off-by-one and caching...
>>>>>>
>>>>>> --
>>>>>>
>>>>>>  talk mailing list
>>>>>>  talk@openstreetmap.org
>>>>>>  https://lists.openstreetmap.org/listinfo/talk
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> ___
>>>> talk mailing list
>>>> talk@openstreetmap.org
>>>> https://lists.openstreetmap.org/listinfo/talk
>>>>
>>>>
>>>>
>>>> ___
>>>> talk mailing list
>>>> talk@openstreetmap.org
>>>> https://lists.openstreetmap.org/listinfo/talk
>>>>
>>>>
>>>
>>
>> ___
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>>
>>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Name challenge - what to call the new OSM+Wikidata service?

2017-09-17 Thread Yuri Astrakhan
One thing we should consider is the domain name.  I doubt we can afford
woq.com :)

These names were proposed
woq   2
wdoqs
woqs
q936

And these proposed names have OSM in them, so likely are not good according
to Legal
wosm  2
wikosm
wdosm
wikidosm

(P.S. Sorry for double - hit send too fast before fixing the first list)

On Sun, Sep 17, 2017 at 5:45 PM, Yuri Astrakhan <yuriastrak...@gmail.com>
wrote:

> One thing we should consider is the domain name.  I doubt we can afford
> woq.com :)
>
> These names were proposed
> woq   2
> wdoqs
> wdosm
> woqs
> q936
>
> And these proposed names have OSM in them, so likely are not good
> according to Legal
> wosm  2
> wikosm
> wikidosm
>
>
> On Sun, Sep 17, 2017 at 5:10 PM, Simon Poole <si...@poole.ch> wrote:
>
>> I believe there is a slight misunderstanding, while remixing
>> OpenStreetMap/OSM/etc in various ways may result in cutesy copycat domain
>> names they simply do not jibe well with reality.
>>
>> Not only does every single one of them weaken the standing of the marks
>> themselves and make it increasingly difficult to take action against
>> misuse, they are further uncontrollable liabilities for the whole
>> community. I gave the example of OpenWeatherMap, but there are others that
>> would be really painful if they ended up in the hands of your fav giant
>> tech corp.
>>
>> That said, I'm not sure why you believe the policy has broken something,
>> with the exception of a few local chapters, to my knowledge, the OSMF has
>> never granted a licence to anybody to use the marks in a domain name. As
>> outlined in the FAQ we will be operating a grandfathering scheme to
>> legalize such use after the fact so actually making such use legit for the
>> first time.
>>
>> And yes: WoQ would be a wonderful name for Yuris service and shows that
>> it is completely possible to break out of the old schema of simply copying
>> OSM.
>>
>> Simon
>>
>> Am 17.09.2017 um 15:57 schrieb Yves:
>>
>> So, no OpenSparqlMap, then? :(
>> Sad, this policy definitely broke something.
>> Yves
>>
>> On 17 September 2017 at 12:58:12 GMT+02:00, Blake Girardot
>> <bgirar...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> How does this relate to the new draft trademark policy?
>>>
>>> I can't tell from the draft policy, but I believe that OSM at least is
>>> a protected mark, not sure about osm.
>>>
>>> But I do think Simone Poole asked the community to stop naming things
>>> with osm trademarks in them or variations on openstreetmap phrase.
>>>
>>> Cheers
>>> blake
>>>
>>> On Sat, Sep 16, 2017 at 5:11 PM, Yuri Astrakhan <yuriastrak...@gmail.com> wrote:
>>>>
>>>>  The new service is getting more and more usage, but it lacks the most
>>>>  important thing - a good name.  So far my two choices are:
>>>>
>>>>  * wikosm
>>>>  * wikidosm
>>>>
>>>>  Suggestions?  Votes?  The service combines Wikidata and OpenStreetMap
>>>>  databases, and uses SPARQL (query language) to search it, so might be good
>>>>  to reflect that in the name.
>>>>
>>>>  https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service
>>>>
>>>>  P.S.  I know this is the hardest problem after off-by-one and caching...
>>>>
>>>> --
>>>>
>>>>  talk mailing list
>>>>  talk@openstreetmap.org
>>>>  https://lists.openstreetmap.org/listinfo/talk
>>>
>>>
>>>
>>>
>>
>> ___
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>>
>>
>>
>> ___
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>>
>>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Name challenge - what to call the new OSM+Wikidata service?

2017-09-17 Thread Yuri Astrakhan
One thing we should consider is the domain name.  I doubt we can afford
woq.com :)

These names were proposed
woq   2
wdoqs
wdosm
woqs
q936

And these proposed names have OSM in them, so likely are not good according
to Legal
wosm  2
wikosm
wikidosm


On Sun, Sep 17, 2017 at 5:10 PM, Simon Poole <si...@poole.ch> wrote:

> I believe there is a slight misunderstanding, while remixing
> OpenStreetMap/OSM/etc in various ways may result in cutesy copycat domain
> names they simply do not jibe well with reality.
>
> Not only does every single one of them weaken the standing of the marks
> themselves and make it increasingly difficult to take action against
> misuse, they are further uncontrollable liabilities for the whole
> community. I gave the example of OpenWeatherMap, but there are others that
> would be really painful if they ended up in the hands of your fav giant
> tech corp.
>
> That said, I'm not sure why you believe the policy has broken something,
> with the exception of a few local chapters, to my knowledge, the OSMF has
> never granted a licence to anybody to use the marks in a domain name. As
> outlined in the FAQ we will be operating a grandfathering scheme to
> legalize such use after the fact so actually making such use legit for the
> first time.
>
> And yes: WoQ would be a wonderful name for Yuris service and shows that it
> is completely possible to break out of the old schema of simply copying OSM.
>
> Simon
>
> Am 17.09.2017 um 15:57 schrieb Yves:
>
> So, no OpenSparqlMap, then? :(
> Sad, this policy definitely broke something.
> Yves
>
> On 17 September 2017 at 12:58:12 GMT+02:00, Blake Girardot
> <bgirar...@gmail.com> wrote:
>>
>> Hi,
>>
>> How does this relate to the new draft trademark policy?
>>
>> I can't tell from the draft policy, but I believe that OSM at least is
>> a protected mark, not sure about osm.
>>
>> But I do think Simone Poole asked the community to stop naming things
>> with osm trademarks in them or variations on openstreetmap phrase.
>>
>> Cheers
>> blake
>>
>> On Sat, Sep 16, 2017 at 5:11 PM, Yuri Astrakhan <yuriastrak...@gmail.com> wrote:
>>>
>>>  The new service is getting more and more usage, but it lacks the most
>>>  important thing - a good name.  So far my two choices are:
>>>
>>>  * wikosm
>>>  * wikidosm
>>>
>>>  Suggestions?  Votes?  The service combines Wikidata and OpenStreetMap
>>>  databases, and uses SPARQL (query language) to search it, so might be good
>>>  to reflect that in the name.
>>>
>>>  https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service
>>>
>>>  P.S.  I know this is the hardest problem after off-by-one and caching...
>>>
>>> --
>>>
>>>  talk mailing list
>>>  talk@openstreetmap.org
>>>  https://lists.openstreetmap.org/listinfo/talk
>>
>>
>>
>>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] JOSM now supports OSM+Wikidata service

2017-09-18 Thread Yuri Astrakhan
At the moment, I do not parse semicolon-separated values, but store them
as strings. In theory, it should be possible to split them into URIs and
use them for matching, but it would be painful.  On the other hand, it
should be fairly easy to parse them during upload - RDF databases work
well with multiple values per statement.  I will adjust my parser, and
the next time I do a big import I will update it.  But I am still looking
for a server to host it, so it might take a bit of time.
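A sketch of that upload-time parsing (the `Q`-number pattern and the entity-URI prefix are standard Wikidata conventions, but the function itself is illustrative, not the importer's real code):

```python
import re

# A Wikidata item ID: "Q" followed by digits, no leading zero.
QID = re.compile(r"^Q[1-9]\d*$")

def split_wikidata_tag(value: str):
    """Split a "Q123;Q456" tag value into entity URIs,
    skipping malformed or empty parts."""
    uris = []
    for part in value.split(";"):
        part = part.strip()
        if QID.match(part):
            uris.append(f"http://www.wikidata.org/entity/{part}")
    return uris

print(split_wikidata_tag("Q42; Q64"))
# ['http://www.wikidata.org/entity/Q42', 'http://www.wikidata.org/entity/Q64']
```

Emitting one triple per URI is what makes "multiple values per statement" natural in an RDF store.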

The SPARQL language took me a while to understand, until I went back and
simply read the very few basics - then it all suddenly made sense. In the
end, it's just a single giant table with three columns. :)
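That "one giant table" view can be made concrete with a toy pattern matcher (purely illustrative; a real SPARQL engine indexes and joins these rows, it does not scan a list):

```python
# A SPARQL dataset is conceptually one big (subject, predicate, object)
# table; a basic graph pattern is a row filter with variable bindings.
triples = [
    ("osm:node/1", "osmt:wikidata", "wd:Q64"),
    ("wd:Q64", "wdt:P31", "wd:Q515"),
    ("wd:Q64", "rdfs:label", "Berlin"),
]

def match(s=None, p=None, o=None):
    """Return rows matching the pattern; None acts like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

print(match(p="wdt:P31"))  # [('wd:Q64', 'wdt:P31', 'wd:Q515')]
```

A multi-pattern SPARQL query is then just several of these filters joined on their shared variables.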

On Mon, Sep 18, 2017 at 3:44 AM, Jo <winfi...@gmail.com> wrote:

> That is wonderful news! It will take a while before I get used to that
> query language though.
>
> Does it also work if an object has a semicolon separated list of wikidata
> items in for example subject:wikidata? A statue with more than one person
> in it, for example?
>
> Polyglot
>
> 2017-09-18 7:28 GMT+02:00 Yuri Astrakhan <yuriastrak...@gmail.com>:
>
>> The "not yet fully named" service is now accessible directly from JOSM -
>> just like OT.  Simply install or update Wikipedia plugin, and it will show
>> up in the download data screen (expert mode).
>>
>> Documentation:
>> https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service#Using_from_JOSM
>>
>> ___
>> talk mailing list
>> talk@openstreetmap.org
>> https://lists.openstreetmap.org/listinfo/talk
>>
>>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] A thought on bot edits

2017-10-06 Thread Yuri Astrakhan
Speaking from my Wikipedia bot experience (I wrote bots and created
Wikipedia API over 10 years ago to help bots):

Bots were successful in Wikipedia because all users felt empowered. Users
could very easily see what a bot edited, fix or undo bot edits, and
easily communicate with the bot authors.  OSM does not have equally good
tools to compare and undo. Hence, some users in OSM may feel powerless -
they cannot easily influence the process, e.g. undo a mistake, or tell
how bad a mistake really is: does it affect just a few places, or
thousands? As OSM gets more contributors, and moves more towards
maintenance, we should address these two:
* There is no easy way to view changes side-by-side at osm.org. We need to
be able to view both the object history and the entire changeset history,
and compare any two revisions. The diff view should show geometry changes
together with tag changes. JOSM has a good diff viewer, but it is per
object, and requires the use of the app.
* There is no easy way to undo a specific edit. In Wikipedia, undoing is
a simple two-click process - "undo this change" in the history view, then
"save". In OSM, one has to use a JOSM plugin!

Note that some of these capabilities may exist as separate tools, but
most users may not even be aware of them. They need to be part of
OSM.org.
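As an illustration of what a side-by-side tag comparison could compute (a sketch of the missing feature, not existing OSM.org code; geometry diffing would need more machinery):

```python
def tag_diff(old: dict, new: dict):
    """Summarize tag changes between two versions of an OSM object
    as (added, removed, changed) dictionaries."""
    added   = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {k: (old[k], new[k]) for k in old.keys() & new.keys()
               if old[k] != new[k]}
    return added, removed, changed

v1 = {"name": "Berlin", "wikipedia": "en:Berlin"}
v2 = {"name": "Berlin", "wikipedia": "en:Berlin", "wikidata": "Q64"}
print(tag_diff(v1, v2))  # ({'wikidata': 'Q64'}, {}, {})
```

Rendering this for every object in a changeset, plus an "apply the inverse" button, is essentially the Wikipedia-style undo workflow described above.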

A few more comments:

* Don't confuse maintenance bots with batch imports. Maintenance bots
clean up obvious mistakes and simplify things that are too tedious for
humans.  Batch imports add large amounts of sometimes unverified data. A
maintenance bot cleans up wikipedia page redirects; import bots create
"botopedias" like ceb-wiki.

* Assume good faith - bot authors care about the project as much as
everyone else, and want to make it better as much as everyone else. Let's
find solutions that benefit everyone.

* Bots are tools, just like JOSM. They can be used for good or they can
cause problems. Banning JOSM just because someone could use it badly
doesn't make sense. Instead we should encourage bot operators to
contribute, but make sure they are a benefit rather than a nuisance.

On Fri, Oct 6, 2017 at 4:40 AM Jo  wrote:

> True indeed. What this means, is that there can be a 'mismatch' between
> the Wikipedia tag and the Wikidata tag, if the Wikidata tag is more
> specific than what Wikipedia wants to create pages for.
>
> It's normal that this happens, as both projects have a different notion of
> notability. Aldi Nord and Aldi Süd will definitely not be the only cases of
> this. In fact I would expect this to happen very often.
>
> At least to me it happens quite a lot that I want to create an article on
> Wikpedia, but the powers that be don't consider the subject notable.
>
> Often this is a person with a street named after him or her. Or a bus
> line. But it could be a single statue in a park, or a part of a collection
> in a museum. So there will be many things we map that will have Wikidata
> items, but not Wikipedia articles. And some where our information is more
> specific than what WP has. Wikidata is actually an opendata project that
> stands closer to OSM than WP, or it certainly can be.
>
> Polyglot
>
> 2017-10-06 10:18 GMT+02:00 Martin Koppenhoefer :
>
>> 2017-10-06 10:10 GMT+02:00 Jo :
>>
>>> What I don't understand is the problems people seem to have with
>>> wikidata. If an existing wikidata entry doesn't align with what we mapped,
>>> then create a new wikidata entry that does and link it to the existing
>>> entries.
>>>
>>
>>
>> it's actually not that easy. I tried to do this and gave up (in the
>> infamous ALDI case). Andy Mabbett had created 1 new "sub-entity" for each
>> of the 2 enterprises which together are described in the wikipedia article,
>> but you cannot add the wikipedia article to the new wikidata object without
>> removing it from the other wikidata object (for both). As the wikidata
>> object that covers both enterprises is the best fit for the WP article, I
>> decided to keep the Wikipedia article linked to this, but then it didn't
>> make sense to use the more precise wikidata object as reference in OSM as
>> it hadn't any wikipedia article linked to it.
>>
>> Cheers,
>> Martin
>>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Licence compatibility (was Adding wikidata tags to the remaining objects with only wikipedia tag)

2017-10-02 Thread Yuri Astrakhan
Interesting question, especially considering that all other external data
sources have much more restrictive licenses - e.g. a Mapillary ID, or any
url/website tag (which is technically also an ID into another data
source)...
Also, what about the location where the data is combined?  E.g. if
Wikidata is in the public domain, and US courts agree with that
statement, can anyone in the US combine it with OSM data?  What about the
UK?  In any case, I suspect nothing we decide has any merit until an
actual court case in one of those jurisdictions.

On Mon, Oct 2, 2017 at 3:58 AM, Andy Townsend <ajt1...@gmail.com> wrote:

> On 02/10/2017 02:56, Paul Norman wrote:
>
>> On 10/1/2017 5:39 PM, Yuri Astrakhan wrote:
>>
>>> Lastly, if the coordinates are different, you may not copy it from OSM
>>> to Wikidata because of the difference in the license.
>>>
>>
>> Just for clarity and anyone reading the archives later, copying from
>> Wikidata to OSM is also a problem because Wikidata permits coordinate
>> sources like Wikipedia or Google Earth.
>>
>> _
>>
> Would a data consumer be able to legally combine OSM and
> wikipedia/wikidata data in any meaningful way, given this fundamental
> licence incompatibility?  What requirements would they have to fulfil with
> their combined data?  The OSM side of things is discussed in some detail at
> http://wiki.osmfoundation.org/wiki/Licence/Licence_and_Legal_FAQ , but
> what about requirements due to other data in there?
>
> Best Regards,
>
> Andy
>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-03 Thread Yuri Astrakhan
Martin, while it is fascinating to learn about Aldi, its history, and
possible ways to organize information about it, isn't it a moot point for
our discussion?  We are talking about Wikipedia, and how we link to it.
There is only one Aldi Wikipedia article that can be linked to:

* German
https://www.wikidata.org/wiki/Special:GoToLinkedPage?itemid=Q125054&site=dewiki

* English
https://www.wikidata.org/wiki/Special:GoToLinkedPage?itemid=Q125054&site=enwiki

This is the current behavior of the iD editor: you type in a Wikipedia
page, and it automatically updates the Wikidata ID, storing both values.
If you think this is incorrect, please start a discussion, and we may
want to change it.  But this has been the automatic software behavior for
a long time. Most iD users would not even know that they have updated the
wikidata tag, so let's not treat "wikidata" as some magical unicorn that
links to something bigger and better - it is simply a link to Wikipedia.

It is really up to the software to generate a proper link to Wikipedia.
It could be generated just like I showed above, or by transforming the
wikipedia tag and hoping that the page is still the same. In either case,
you only get a link.
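The two linking strategies can be sketched in a few lines; the Special:GoToLinkedPage parameter names (`itemid`, `site`) are my assumption about that special page's interface, and the tag-transform helper is illustrative:

```python
from urllib.parse import quote

def link_from_wikidata(qid: str, langs=("en", "de")) -> str:
    """Stable link: resolve through Wikidata, first available language wins.
    Survives article renames because the QID does not change."""
    sites = ",".join(f"{l}wiki" for l in langs)
    return ("https://www.wikidata.org/wiki/Special:GoToLinkedPage"
            f"?itemid={qid}&site={sites}")

def link_from_wikipedia_tag(tag: str) -> str:
    """Fragile link: transform the "lang:Title" wikipedia tag directly,
    hoping the title is still current."""
    lang, _, title = tag.partition(":")
    return f"https://{lang}.wikipedia.org/wiki/{quote(title.replace(' ', '_'))}"

print(link_from_wikipedia_tag("de:Aldi"))
# https://de.wikipedia.org/wiki/Aldi
```

The second form breaks silently when an article is renamed; the first keeps working as long as the Wikidata item retains a sitelink.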

On Mon, Oct 2, 2017 at 6:28 PM, Martin Koppenhoefer 
wrote:

>
>
> sent from a phone
>
> On 2. Oct 2017, at 20:36, Frederik Ramm  wrote:
>
> and Andy Mabbett from England editing supermarkets in
> Germany.
>
>
>
> indeed it’s not helping the quality if editors are not familiar with the
> language specifics for the area of the things they edit (this is true for
> all UGC, be it osm, wikidata, etc). Aldi Sud does not make sense, it’s
> either Süd or, if you really have to (e.g. domain names), Sued.
> https://www.wikidata.org/wiki/Q41171672
>
> This kind of fiddling leads to objects like this: https://www.wikidata.
> org/wiki/Q125054
> inception 1913
> founded by Karl and Theo Albrecht, born 1920 and 1922.
> Founded 7/9 years before their birth?
>
> It is also not true that aldi nord and süd result or follow from the
> splitting of Aldi, they result from the split of Albrecht KG. Not even the
> founding year 1960 for the parts is correct, it’s 1961 (according to wp and
> company website)
>
> It also still claims Aldi is a GmbH & Co. KG and even has 1 reference for
> this (german wikipedia), while the German wikipedia actually has a long
> paragraph trying to explain the structure and saying there are 66 different
> regional GmbH & Co. KG, plus other companies like the
> ALDI Einkauf GmbH & Co. oHG or the *ALDI SÜD
> Dienstleistungs-GmbH & Co. oHG*, i.e. it’s a group of companies, a
> concern.
> Here’s a list of parts of Aldi Süd:
> https://unternehmen.aldi-sued.de/de/impressum/
>
> It can all be fixed of course, but I’m curious how all these errors have
> gotten there. There’s still more wrong than correct in this object.
>
> cheers,
> Martin
>
>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Could we just pause any wikidata edits for a month or two?

2017-10-03 Thread Yuri Astrakhan
While I have nothing against pausing bulk wikidata additions for a month,
we should be very clear here:
* Several communities use bots to maintain and inject these tags, e.g.
Israel. Should they pause their bots?
* If a specific community is OK with it, does that override a worldwide
ban for that location?
* Has anyone actually been doing worldwide bulk wikidata additions since
this discussion restarted, after what I thought was a settled matter
about two weeks ago?

On Tue, Oct 3, 2017 at 4:58 PM, michael spreng 
wrote:

> Yes I support a pause. I feel that currently one side tries to outgun
> the other with rather brute force mechanical editing.
>
> Michael
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-15 Thread Yuri Astrakhan
 this instead and/or marking
> it as a false positive. In any case, the marker may not be shown for
> other users anymore. This was a topic in this thread already and it was
> voiced that inventing new tags just to be used by this tool is not
> acceptable and I agree with that. The other tools also do not require that.
>
> b) I strongly suggest to offer different answer options. As I said, if
> only one option is available, it is really nothing else than a manually
> operated automatic edit. If several options are available (i.e.
> "american football", "soccer" etc. ) as a quick fix, only then the tool
> becomes to be useful. (There are some challenges like that on
> MapRoulette also, such as "Phone or fax number is not in international
> format" and these in my opinion also do not belong there because they
> can be solved automatically)
>
> c) Require users to zoom into the map at around zoom 17 or more to make
> any changes. If the users are supposed to check if something is the case
> (via satellite image), then at least don't let them cheat by just
> solving everything from looking at continents.
>
> d) Finally, I think it does not make sense to have any quick fixes in
> that tool that require actually going there (as opposed to looking at
> the satellite imagery) because the effort to go there actually (let's
> say 20min if you happen to live in the vicinity) is dimensions higher
> than clicking on the "Save" button (1 second). The temptation will be
> big to simply click on that button without actually checking it. If you
> actually go there and check, then, the 1 minute as opposed to 1 second
> you need to get the surveyed result into the map through iD/JOSM does
> not really matter in comparison.
>
> All in all, in my opinion, the best way to go forward from here is to
> take this idea of quick fixes and instead of creating an own tool that
> is otherwise very similar to MapRoulette (because it must for being
> useful, see above), propose it as a feature to MapRoulette, discuss and
> implement it together in accord with the MapRoulette team into their
> tool (or Osmose for that matter). It's all open source.
>
> That feature could look like that the creator of a MapRoulette challenge
> may optionally provide a range of possible (typical) answer options
> ("quick fixes") which are then shown as additional buttons right next to
> [Edit], [False Positive] and [Skip] for every place within a challenge.
> I.e. for football, it could be a dropdown of soccer, american_football etc.
>
> Tobias
>
> On 13/10/2017 23:25, Yuri Astrakhan wrote:
> > I would like to introduce a new quick-fix editing service.  It allows
> > users to generate a list of editing suggestions using a query, review
> > each suggestion one by one, and click "Save" on each change if they
> > think it's a good edit.
> >
> > For example, RU community wants to convert  amenity=sanatorium  ->
> > leisure=resort + resort=sanatorium.  Clicking on a dot shows a popup
> > with the suggested edit. If you think the edit is correct, simply click
> > Save.
> > Try it:  http://tinyurl.com/y8mzvk84
> >
> > I have started a Quick fixes wiki page, where we can share and discuss
> > quick fix ideas.
> > * Quick fixes <https://wiki.openstreetmap.org/wiki/Quick_fixes>
> > * Documentation
> > <https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_
> SPARQL_query_service#Quick-fix_Editor>
> >
> > This is a very new project, and bugs are likely. Please go slowly, and
> > check the resulting edits. Let me know if you find any problems. Your
> > technical expertise is always welcome, see the code
> > at https://github.com/nyurik/wikidata-query-gui  The service has adapted
> > some code from the Osmose project (thanks!)
> >
> > TODO:
> > * Allow multiple edits per one change set
> > * Show objects instead of the dots
> > * Allow users to change comment before saving
> >
> >
> > ___
> > talk mailing list
> > talk@openstreetmap.org
> > https://lists.openstreetmap.org/listinfo/talk
> >
>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-15 Thread Yuri Astrakhan
Andy, first off, this whole email thread was about "hi, this is a new,
rough-around-the-edges tool I'm building, that MIGHT benefit SOME people".
Suggestions/ideas welcome.  When you say "stop" - stop what? Stop coding? I
have not done any significant amount of editing since last month, right
around the time when the Wikidata debacle happened. Also, I think you are
ignoring what I said - others who are NOT on this thread have also
expressed support - and they have not spoken here, possibly due to the
toxic environment. When you say we voted - voted on what? Was there a
proposal? Was there a bad action being performed?

Re toxicity - these are not my words to qualify what has been going on, but
I do agree with them.  I cannot say that I have been perfect in all of my
responses, but I do try to keep them civil, and expect the same back.
Also, please reread my previous email, I tried to carefully express most
issues I encountered.

John, no one is saying it's a consensus either way.  I am WORKING on finding
a compromise, while also building a tool that at least some people in some
communities find useful.   I did draw a comparison to at least two tools -
MapRoulette and Osmose. The first - mostly about building "challenges",
the second - in how it presents the editing interface. Osmose has many
more features than mine, allowing a richer editing experience.  Also, I do
draw a comparison with RawEdit (Osmose) and Level0 - they both allow very
quick text editing of the raw XML content - a highly error-prone process
that I try to avoid.

On Sun, Oct 15, 2017 at 8:33 AM, john whelan <jwhelan0...@gmail.com> wrote:

> > I have received praises on OSM-RU channel,
>
> Wow, within the large number of active mappers there is a very broad range
> of opinions.  One or two people saying this is wonderful is not a
> consensus within OpenStreetMap that it is.
>
> Within OpenStreetMap the authority is normally accepted to be the local
> mappers and the DWG.  There are processes to follow for automated edits.
>
> I seem to recall you drawing a comparison between your work and
> MapRoulette.  MapRoulette identifies problems and has mappers resolve
> them one at a time.  My understanding is it does not make changes or add
> anything to the database.  Your approach seems to be quite different and I
> don't think the two can be compared.
>
> My understanding is you have not followed these processes and have ignored
> the wishes of local mappers.  This is not a personal attack these are
> issues and concerns.
>
> Could you be so kind as to address the issues and concerns raised please.
>
> Especially not taking into account the wishes of the local communities
> when expressed.
>
> Many Thanks
>
> Cheerio John
>
>
>
>
>
> On 15 October 2017 at 08:04, Yuri Astrakhan <yuriastrak...@gmail.com>
> wrote:
>
>> Andy, with all due respect, you are misrepresenting things.  I have
>> received praises on OSM-RU channel, and that's where I got my first bug
>> reports and suggestions that were quickly fixed.  The current mailing
>> thread also received a praise from Steve. I have received a private email
>> explicitly praising this tool, some twitter feedback, plus, some general
>> encouragements for my efforts. So, despite some vocal people on one side of
>> the issue, claiming to represent "the community" is not accurate, as others
>> have expressed opposing views.  Thus, it is not as uniform as you try to
>> portray it, but rather, as any other conflict, deserves a thoughtful
>> approach to attempt to balance goals of everyone, and to find a valuable
>> compromise.
>>
>> At the same time, judging from the fact that someone did not feel
>> comfortable emailing to the group, there seems to be significant toxicity
>> and bullying going on.  There were a number of personal attacks, which to me
>> seem to be a violation of our communication policies, and which I
>> deliberately ignore. So no, I see some people in the community may support
>> it, but do not want to participate in such a violent discussion. When
>> someone is foaming at the mouth, people tend to stay away, rather than
>> engage in a constructive discussion.
>>
>> Luckily, there has been some valuable feedback too, and I hope our
>> community will be mature enough to provide more broadly.  For example,
>> Simon was very clear and explicit about the exact deficiencies he objected
>> to - something that I attempted to rectify, and will continue to improve
>> on.  Some other remarks, despite being presented in a bad form, led me to
>> more good fixes such as a mandatory high zoom before editing. I am clearly
>> continuing to participate in the discussion, and try to

Re: [OSM-talk] New OSM Quick-Fix service

2017-10-15 Thread Yuri Astrakhan
Andy, with all due respect, you are misrepresenting things.  I have
received praises on OSM-RU channel, and that's where I got my first bug
reports and suggestions that were quickly fixed.  The current mailing
thread also received a praise from Steve. I have received a private email
explicitly praising this tool, some twitter feedback, plus, some general
encouragements for my efforts. So, despite some vocal people on one side of
the issue, claiming to represent "the community" is not accurate, as others
have expressed opposing views.  Thus, it is not as uniform as you try to
portray it, but rather, as any other conflict, deserves a thoughtful
approach to attempt to balance goals of everyone, and to find a valuable
compromise.

At the same time, judging from the fact that someone did not feel
comfortable emailing to the group, there seems to be significant toxicity
and bullying going on.  There were a number of personal attacks, which to me
seem to be a violation of our communication policies, and which I
deliberately ignore. So no, I see some people in the community may support
it, but do not want to participate in such a violent discussion. When
someone is foaming at the mouth, people tend to stay away, rather than
engage in a constructive discussion.

Luckily, there has been some valuable feedback too, and I hope our
community will be mature enough to provide more broadly.  For example,
Simon was very clear and explicit about the exact deficiencies he objected
to - something that I attempted to rectify, and will continue to improve
on.  Some other remarks, despite being presented in a bad form, led me to
more good fixes such as a mandatory high zoom before editing. I am clearly
continuing to participate in the discussion, and try to abstain from
discussing PEOPLE, but instead concentrate on a specific IDEA being
presented in this thread, and the specific PROBLEMS it tries to solve. As a
volunteer. Without any financial benefit from anything I do. Same as many
other participants on this channel, regardless of their views. Trying to
maneuver between the abstract philosophy, various beliefs of what is the
"right thing to do", and the specific problems and solutions.

P.S. @mmd, sorry for not replying earlier. I suspect you meant it as an "ad
absurdum" argument. Thing is, Wikidata does use wiki pages to store bot
states. Mostly bots generate various talk pages and templates, and users
sometimes modify those talk pages to control the bots. Yet, this tool has
nothing to do with Wikidata, so it is a moot point to discuss storing OSM
metadata there. See my reply about the "nobot" tag. I think it would help
to partially heal the bot-nobot divide, as it gives control over each
object to editors, and allows mini-bots.

And one last thing.  Something that has helped me many times to find
COMPROMISE in a forum discussion. When replying, let's try to sum up the
opponent's position and the reasons for that position, and explain why you
think it is incorrect. Perhaps we should learn from the high school debate
class?  Sorry for the long email.

On Sun, Oct 15, 2017 at 6:38 AM, ajt1...@gmail.com <ajt1...@gmail.com>
wrote:

> On 15/10/2017 11:04, Christoph Hormann wrote:
>
>> On Sunday 15 October 2017, Yuri Astrakhan wrote:
>>
>>> [...] I was following up on the Christoph Hormann's
>>>>> idea of the "bot=no" tag, to "allow mappers to opt out of bot
>>>>> edits on a case-by-case basis."
>>>>>
>>>> No, you were not, likely because you misunderstood my suggestion
>>>> which is likely because you don't get how OpenStreetMap is working
>>>> overall.
>>>>
>>>> I would strongly advise you to reconsider your whole approach to
>>>> OpenStreetMap and to interacting with the OpenStreetMap community.
>>>>
>>> Christoph, kindly explain, instead of making snide remarks. You have
>>> not added to the discussion, but instead raised the level of toxicity
>>> of this channel even further.  Note that several people have already
>>> noted that this channel is toxic and refused to participate in it,
>>> rather than being productive and beneficial to everyone involved.
>>>
>> I rest my case.
>>
>>
> Yuri,
>
> In English there is a common phrase "which part of  *** do you not
> understand" (expletive removed because people offended by such words may be
> reading).
>
> This thread https://lists.openstreetmap.org/pipermail/talk/2017-October/
> thread.html#79145 currently has replies from 9 people.  1 is asking a
> question but all other replies are entirely negative (including comments
> such as "I'm appalled" and "that isn't acceptable behaviour").
>
> Christoph Hormann'

Re: [OSM-talk] Could we just pause any wikidata edits for a month or two?

2017-10-15 Thread Yuri Astrakhan
If a community has had a well established and agreed process running, which
does not create any new data issues, why should someone outside of that
community be requesting a global halt?  It's not like the data is getting
worse all of a sudden, right? And their work does not prevent global
community from reaching a consensus on how to move forward.  I suspect this
discussion may take a very long time to complete, so proposing to ban
various communities from doing what they have already been happily doing
because somewhere else something is being discussed is strange.  There are
over 40,000 users editing OSM, so reaching a consensus on a fundamental
topic of external DB linking might take years.

Has there ever been a global halt like this in OSM, where several people on
@talk demanded that a certain tag not be (mass) edited globally?  I'm totally
OK if there is a process for that, but a global halt does seem a bit extreme
given the relatively low impact.  After all, we are discussing the philosophy of
the project here, not that tag X breaks half of the map renderers all of a
sudden, right?
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-15 Thread Yuri Astrakhan
Christoph, kindly explain, instead of making snide remarks. You have not
added to the discussion, but instead raised the level of toxicity of this
channel even further.  Note that several people have already noted that
this channel is toxic and refused to participate in it, rather than being
productive and beneficial to everyone involved.

On Sun, Oct 15, 2017 at 5:39 AM, Christoph Hormann <o...@imagico.de> wrote:

> On Sunday 15 October 2017, Yuri Astrakhan wrote:
> > [...] I was following up on the Christoph Hormann's
> > idea of the "bot=no" tag, to "allow mappers to opt out of bot edits
> > on a case-by-case basis."
>
> No, you were not, likely because you misunderstood my suggestion which
> is likely because you don't get how OpenStreetMap is working overall.
>
> I would strongly advise you to reconsider your whole approach to
> OpenStreetMap and to interacting with the OpenStreetMap community.
>
> --
> Christoph Hormann
> http://www.imagico.de/
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-15 Thread Yuri Astrakhan
Tobias, as promised, a thorough response.


On Sun, Oct 15, 2017 at 9:14 AM, Tobias Zwick  wrote:

>
> So, the initial question is: What is the conceptual use case for such a
> tool? Where would be its place in the range of available OSM tools?
>

I think my main target is the JOSM validator's "fix" button. The fix button
allows contributors to auto-fix everything that validator has found, even
without looking at it.  In order to actually see what the autofix did, one
has to select all modified objects, select them all in the "selection"
window, hit "history", wait for all individual objects to download, and
then view individual changes one by one.  It requires a great deal of
dedication and diligence, especially considering that these auto-changes
will be combined with all the other changes the user might have made.
While I trust that many OSM contributors are highly skilled, this
complexity may lead to errors, especially as some people might not know the
exact steps required to view it, cut corners, or think that the "fix"
button should know what it's doing.  Lastly, if I spot a bad autofix, I
have to go to the antiquated JOSM issue reporting site, create an account,
and file a bug. Not an easy endeavor for most of the users, so most would
probably not bother. So the "FIX" button is similar to my "SAVE" button -
users either catch it and do nothing, or they don't, and it gets saved, if
not by this person then by the next.

There is the use case where one tagging scheme has been deprecated by
> community consensus and one (combination of) tag(s) should be changed
> into another (combination of) tag(s) globally.
>
> 1. If this does not require humans because both tagging schemes are
> mutually translatable (i.e. lets say for sport=handball <->
> sport=team_handball), then, the edit can be made automatically by a bot.
>

Here are a few of the existing JOSM autofixes done with my tool. See the full
list on the JOSM autofixes wiki page.
* replace operator=ERDF -> operator=Enedis -- 5422 cases

* use  "cs:" instead of "cz:" prefix for Wikipedia links -- 3 cases

* fix duplicate Wikipedia tag prefixes, e.g.  "ru:ru:Something"  126 cases


While they probably should be run by a bot, the barrier to entry is too
high to be realistic, especially for the smaller cases.  The very few
globally-licensed bot operators would probably not want to deal with these
small fixups, and for a very good reason - it's not worth the risk! The
chance of a programming error far outweighs the benefits of full
automation for so few objects. In addition to the programming error risks,
the community must review the proposal far more thoroughly before
"bot-agreeing" to it - because what if there are corner cases the proposal
would break? This fear is what hinders bot adoption.
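
A fix like the duplicate-prefix cleanup above is mechanical enough to sketch in
a few lines. This is an illustrative stand-alone version, not the actual JOSM
validator code; the function name and regex are my own:

```python
import re

# Hypothetical helper illustrating the "duplicate Wikipedia prefix" autofix:
# a wikipedia tag value such as "ru:ru:Something" should collapse to
# "ru:Something".  This is a sketch, not the real JOSM implementation.
PREFIX_RE = re.compile(r"^([a-z]{2,3}):\1:(.+)$")

def fix_wikipedia_tag(value: str) -> str:
    """Collapse a doubled language prefix; return other values unchanged."""
    m = PREFIX_RE.match(value)
    if m:
        return f"{m.group(1)}:{m.group(2)}"
    return value
```

Because the pattern only matches an exactly repeated language code, values like
"en:Berlin" pass through untouched, which is what makes this class of fix safe
to review one object at a time.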

Let's look at another example - a large 215,000+ case autofix: removing
unnecessary "area=yes". These would greatly benefit from a bot edit, BUT
everyone makes coding mistakes, so there are some chances of a bad autofix.
If a bot owner makes a mistake, it can only be spotted AFTER running the
bot. A user would then post a message on the changeset, bot owner would
have to do a complex full/partial revert, fix the bot, and re-run it.
Painful. BTW, while doing these examples, I spotted a few potential bugs
with the existing JOSM autofixes that no one has reported - another reason
to put it through one-by-one accept/reject testing.

My tool would actually address these issues! When community first proposes
a change, it is relatively easy to add it to the tool - you simply write a
query and save it on a wiki page, possibly under the "proposed" section.
Then, many users can go through it one by one, accepting or rejecting them.
If there are rejects, anyone can go and fix the query, and the process
continues. Once this has been going on long enough, and there haven't been
any rejects, some bot owner could simply run the exact same query on the
server, auto-applying it to the rest of the world. By that time, the query
has been well tested by many different members, and will be of much higher
quality than anything a single bot author could produce alone.
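
The lifecycle described above (propose a query, collect manual accept/reject
reviews, and only then let a bot finish the job) could be reduced to a simple
gating rule. The function and thresholds here are invented for illustration,
not taken from the actual tool:

```python
def is_bot_ready(accepts: int, rejects: int, min_accepts: int = 100) -> bool:
    """Gating rule sketch: a quick-fix query graduates to unattended bot
    runs only after enough human-reviewed accepts and zero rejects.
    The threshold value is illustrative, not from the real workflow."""
    return rejects == 0 and accepts >= min_accepts
```

A single reject would send the query back to the "proposed" stage for fixing,
which is exactly the feedback loop the manual review phase is meant to provide.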


> 2. If this does require humans to check the transition to the new tag
> because the deprecated tagging scheme is ambiguous (i.e. , such as
> sport=football -> soccer or american/australian/canadian/... football),
> then, an automatic edit cannot be done. Instead, tools like MapRoulette
> are used.
>

I agree that my tool does not cover this use case yet.  I was thinking of
adding an option picker - a fairly easy task if the options are known in
advance, but this use case is not my primary target at the moment.


>
> 3. Finally, if this also does require humans because a tag combination
> is suspicious (what would show up as warnings in JOSM and what most of
> 

Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
Lester, the naming of this service is still a work in progress, and might
have confused a few people.  My apologies for that.  I do plan to create a
proper name, logo, domain name, and SSL certificate once I have some spare
time.  If anyone wants to take care of that, your help is appreciated.

The new tool is not about the Wikidata database. My database could contain
just the OSM data and this tool would be just as functional. The service
shares some of the technology that was built for Wikidata, and it has a
clone of the Wikidata database which some of the queries may optionally use
if they want to.  But if you look at the quick fixes page, there are 20
queries that do not use Wikidata, and only one query that does.  So again,
let's discuss the merits of this tool, just like Tobias did above, and leave
Wikidata out of this conversation, as it is not relevant.

https://wiki.openstreetmap.org/wiki/Quick_fixes

On Mon, Oct 16, 2017 at 3:59 AM, Lester Caine  wrote:

> On 15/10/17 22:04, Frederik Ramm wrote:
> > You've built a query engine to work with OSM and
> > Wikidata and are pushing that relentlessly, to a point where the
> > decision of just how much Wikiadata linking we want in OSM is totally
> > taken out of our hands.
>
> Seconded ... personally I have still to be convinced that wikidata is an
> independent and reliable source, so will not be using it as a cross
> reference. It may well be that OSM needs it's own secondary database
> with the sort of cross references material SOME of which is slowly being
> added to wikidata but wikidata is an independent and unrelated project
> with it's own agenda ...
>
> --
> Lester Caine - G8HFL
> -
> Contact - http://lsces.co.uk/wiki/?page=contact
> L.S.Caine Electronic Services - http://lsces.co.uk
> EnquirySolve - http://enquirysolve.com/
> Model Engineers Digital Workshop - http://medw.co.uk
> Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
Martin, could you take a look at my last reply to Tobias - I have actually
expressed some concerns with bots in general (very surprisingly,
considering my heavy involvement with it previously).  I think this tool
can actually make the path to bots easier for the community, making new
bots safer and reduce the chance of an accidental programming bug -- Tobias
"case #1".

I do hear your concern about the "distributed auto-fix", and we should
minimize the negative effect.  I think the best way would be to have a good
community process to propose, evaluate, and approve such queries.
Currently, anyone can create a challenge in MapRoulette without
oversight. Only a few devs might see changes in Osmose or JOSM's
autofix (my reply to Tobias actually focuses on the JOSM issue, and I think
it's the most important one).  Here, we could use our wiki to host all
queries, e.g. a "proposed" page where everyone will be instructed to be
very careful with the changes, as queries have not been extensively vetted,
and "approved" - queries that even bots can run automatically.  We could
have both locally oriented and globally oriented queries. Basically, if
anyone wants to cause havoc, they can do it with any tool, but if the
person wants to really help, we can guide that help towards the most
beneficial contribution. Lastly, it won't be too hard to track which query
was used - I can add the query ID to the changeset tag, so if an
experimental query starts getting mass-edited, it can be easily discovered.
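
Tagging each changeset with the query ID, as suggested above, could look
something like the sketch below. The key names and tool name are hypothetical;
only `created_by` and `comment` are conventional OSM changeset tags:

```python
def changeset_tags(query_id: str, comment: str) -> dict:
    """Sketch of changeset tags a quick-fix tool might attach so that edits
    made from a given query stay traceable.  The "quickfix:query_id" key is
    an invented example, not an established OSM convention."""
    return {
        "created_by": "osm-quick-fix (prototype)",
        "comment": comment,
        # Lets reviewers find and, if needed, revert all edits per query:
        "quickfix:query_id": query_id,
    }
```

With such a tag in place, a simple changeset search would reveal whether an
experimental query is being mass-applied.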

Hope this helps.

On Mon, Oct 16, 2017 at 6:13 AM, Martin Koppenhoefer <dieterdre...@gmail.com
> wrote:

> Frederik:
>
>> I am appalled that after your abysmal OSM editing history where you more
>> often than not ignored existing customs rules, while *claiming* to
>> follow them, you're now building a service that entices others to do the
>> same.
>>
>
>
>
>> On Sat, Oct 14, 2017 at 6:09 AM Christoph Hormann <o...@imagico.de> wrote:
>>> This is a tool to perform automated edits as per the automated edits
>>> policy.  A resposible developer of such a tool should inform its users
>>> that making automated edits comes with certain requirements and that
>>> not following these rules can result in changes being reverted and user
>>> accounts being blocked.
>>>
>>
> 2017-10-14 13:06 GMT+02:00 Yuri Astrakhan <yuriastrak...@gmail.com>:
>
>> Christoph, I looked around Osmose and MapRoulette, and I don't see any
>> such warnings . Could you elaborate how you would like these kinds of tools
>> to promote good editing practices? Any UI ideas? I'll be happy to improve
>> our tools on making sure they meet community expectations.
>>
>
>
> I agree with Christoph and Frederik that this is obviously a tool to
> perform (crowdsourced) automated edits, and although it is designed in a
> way to make them look like individual contributions, the automated editing
> guidelines should apply. I agree with Yuri that there is also (to some
> lesser extent, as the editing is not performed by the tool) some
> problematic potential in other QA tools like Osmose or "remote batch
> fixing" tools like MapRoulette.
>
> The thing with remotely "fixing tags" is that people usually don't know
> the situation on the ground and therefore hardly can make an individual
> decision for the specific object. The proposed "one-click-solution"
> encourages to do quick "fixes" without looking individually, and you even
> refuse to notify people that they might be participating in an automated
> edit. In examples like the one you gave, even if you look very hard, you
> won't see something that confirms the proposed change (you will have to
> know the place). I could imagine there are good cases where your tool can
> facilitate fixing problems, e.g. with clear typos (highway=residental), but
> changing from one tag to a combination of two is not one of them (either we
> could make an automated edit, or if it's disputed, we wouldn't do it at
> all, rather than sneaking it in via distributed automated editing).
>
> Cheers,
> Martin
>
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
Rory, thanks, and that's why I think it is a bad idea to do bot edits
without first running it through my tool.  If we do a mass edit, we have to
go through a very lengthy community consensus study, which might still miss
things. Then the bot developer might still make an error that is not likely
to be caught for quite some time, until it is very hard to revert. On the
other hand, if a query is made, reviewed by community, and later many
people try going through it, accepting and rejecting changes, we will know
if we caught all the corner cases like the one you just gave. If no one has
rejected anything for a long time, a bot can simply pick up the query and
finish running it.  Much safer.

As for "community consensus" - TBH, very hard to define.

On Mon, Oct 16, 2017 at 7:49 AM, Rory McCann  wrote:

> On 15/10/17 15:14, Tobias Zwick wrote:
> > 1. If this does not require humans because both tagging schemes are
> > mutually translatable (i.e. lets say for sport=handball <->
> > sport=team_handball), then, the edit can be made automatically by a bot.
>
> Except that's not true. In Ireland "handball" is Gaelic Handball¹
> which is a one-on-one game, not a team sport (which is apparently a
> different thing²). There are some sport=handball's tagged in Ireland.
> Now the tag is clearly wrong, and we need to figure out something about
> that. But if you just change sport=handball to sport=team_handball, then
> you've entered incorrect data, based on incorrect assumptions.
>
> There is the use case where one tagging scheme has been deprecated by
>> community consensus and one (combination of) tag(s) should be changed
>> into another (combination of) tag(s) globally.
>>
>
> Big question: What defines "community consensus"?
>
> Though, note, for all three cases, a prior consensus is required, either
>> by prior discussion or by looking at what was previously agreed on in
>> the wiki. That is the case for *any* organized re-tagging of existing
>> tags.
>>
>
> Not everyone (incl me) thinks that the wiki defines what a tag should
> mean...
>
>
> ¹ https://en.wikipedia.org/wiki/Gaelic_handball
> ² https://en.wikipedia.org/wiki/Handball
>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>
___
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-17 Thread Yuri Astrakhan
Polyglot, I don't think there is a substantial **real** problem in JOSM
with the autofixes.  And yes, I have worked with JOSM devs and was
impressed at the speed of response.

The thing is - we have been discussing hypothetical issues so far, not the
real ones.   And hypothetically, allowing a simple way for users to review
one specific change on one object and click save is dangerous,  because
some hypothetical individual might not pay attention and go clicking
trigger happy -- in other words we don't trust our users to be diligent.
And so is **hypothetically** dangerous the JOSM's autofix feature,
especially because in JOSM, unlike the new tool, it is possible to click
"fix" without actually seeing what was changed.  On the other hand, JOSM's
autofix has been around for a long time, thus validating its own existence
by trial rather than by philosophy - or at least showing that for those
cases, the number of errors is so small that they never rise to major
community attention.

And that's the fundamental problem - we are worried about things that might
happen, at the expense of ignoring a significant number of existing data
issues.  But I think this is fairly normal - after all, we are usually more
comfortable with a well-understood existing problem than with a potential
unknown harm.

On Tue, Oct 17, 2017 at 4:13 AM, Jo <winfi...@gmail.com> wrote:

> If there would be real problems with autofixes in JOSM, it's easy to
> report those as bugs or enhancement requests. JOSM's issue tracker may be
> antiquated, but it does work and JOSM's developers are very responsive.
>
> If JOSM users who apply these auto fixes would worsen the data, then they
> would get remarks through their changeset messages. I'm convinced that if
> there are real problems on that side, we would already know about them and
> they would be fixed very fast. Most likely by disabling the fix button for
> that particular validator warning.
>
> So if you find actual issues, please report them.
>
> Polyglot
>
> 2017-10-17 9:50 GMT+02:00 Yuri Astrakhan <yuriastrak...@gmail.com>:
>
>> Well, you kind of can fix one with the other - by introducing a better
>> tool and disabling some of the autofixes in JOSM (very easy to do).  A more
>> complex approach would clearly require a separate topic(s) and a
>> substantial dev involvement.
>>
>> P.S. No, https://master.apis.dev.openstreetmap.org/ doesn't have any
>> real data (it shows maps from live servers, but editing shows just a few
>> objects).
>>
>> On Tue, Oct 17, 2017 at 3:36 AM, Tobias Zwick <o...@westnordost.de> wrote:
>>
>>> I get your point, especially regarding the application of the JOSM
>>> fix-button as "by-the-way" fixing.
>>>
>>> Though, you can't fix possible issues of one tool by introducing
>>> another tool. People will not stop using (that feature of) JOSM. That is
>>> why I think, if you think you detected a problematic issue there in that
>>> editor, it should be discussed in a separate topic.
>>>
>>> On 17/10/2017 00:57, Yuri Astrakhan wrote:
>>> > Michael, I can only judge by my own experience adding validation
>>> autofix
>>> > rules - I added a number of Wikipedia tag auto cleanups to JOSM, and
>>> > they were reviewed by one or two JOSM developers and merged, probably
>>> > because they were deemed benign.  I don't know about the other rules,
>>> > but I suspect many of them also went this route.  Should they have been
>>> > discussed more widely? I don't know, but that question is complicated,
>>> > just like "what is a local community?" question. What a few devs may
>>> see
>>> > as benign, others may say needs a discussion, right?
>>> >
>>> > Mass editing is a different matter.  We consider mass editing when one
>>> > person goes out to fix something everywhere in the world.  But when we
>>> > provide a tool that automatically fixes something that you are looking
>>> > at, we don't view it as such.  Or at least we don't view it when it
>>> > happens as part of JOSM, but we do when it happens in my new tool. Of
>>> > course there is an important difference - JOSM doesn't guide you
>>> towards
>>> > those cases.
>>> >
>>> > I think massive "by-the-way" fixing is far worse than the targeted fix
>>> > of a single issue.
>>> >
>>> > When you want to fix a single issue in many places, you become a
>>> subject
>>> > matter expert.  You know all about that change, how it interacts with
> >> > other tags, what to watch out for, ...

Re: [OSM-talk] New OSM Quick-Fix service

2017-10-17 Thread Yuri Astrakhan
Well, you kind of can fix one with the other - by introducing a better tool
and disabling some of the autofixes in JOSM (very easy to do).  A more
complex approach would clearly require a separate topic(s) and a
substantial dev involvement.

P.S. No, https://master.apis.dev.openstreetmap.org/ doesn't have any real
data (it shows maps from live servers, but editing shows just a few
objects).

On Tue, Oct 17, 2017 at 3:36 AM, Tobias Zwick <o...@westnordost.de> wrote:

> I get your point, especially regarding the application of the JOSM
> fix-button as "by-the-way" fixing.
>
> Though, you can't fix possible issues of one tool by introducing
> another tool. People will not stop using (that feature of) JOSM. That is
> why I think, if you think you detected a problematic issue there in that
> editor, it should be discussed in a separate topic.
>
> On 17/10/2017 00:57, Yuri Astrakhan wrote:
> > Michael, I can only judge by my own experience adding validation autofix
> > rules - I added a number of Wikipedia tag auto cleanups to JOSM, and
> > they were reviewed by one or two JOSM developers and merged, probably
> > because they were deemed benign.  I don't know about the other rules,
> > but I suspect many of them also went this route.  Should they have been
> > discussed more widely? I don't know, but that question is complicated,
> > just like "what is a local community?" question. What a few devs may see
> > as benign, others may say needs a discussion, right?
> >
> > Mass editing is a different matter.  We consider mass editing when one
> > person goes out to fix something everywhere in the world.  But when we
> > provide a tool that automatically fixes something that you are looking
> > at, we don't view it as such.  Or at least we don't view it when it
> > happens as part of JOSM, but we do when it happens in my new tool. Of
> > course there is an important difference - JOSM doesn't guide you towards
> > those cases.
> >
> > I think massive "by-the-way" fixing is far worse than the targeted fix
> > of a single issue.
> >
> > When you want to fix a single issue in many places, you become a subject
> > matter expert.  You know all about that change, how it interacts with
> > other tags, what to watch out for, how to handle bad values, etc.  For
> > example, when fixing wikipedia tags, you would see the types of mistakes
> > people make, wrong prefixes people use, incorrect url encodings, hash
> > tags in urls, incorrect multiple values, and so on. When you simply click
> > "fix" because JOSM validator tells you it can fix it automatically, you
> > don't have that knowledge, so it effectively becomes a distributed
> > mechanical edit without the "reject" capability.  My tool tries to
> > address this - to build domain experts in a narrow field, and let those
> > experts review changes one by one. I do not discount the value of local
> > knowledge, but it is not a panacea - you need both to make
> > intelligent choices, and in some cases, the domain knowledge is more
> > important than the knowledge of a specific locale.
> >
> > On Mon, Oct 16, 2017 at 4:00 PM, Michael Reichert
> > <osm...@michreichert.de <mailto:osm...@michreichert.de>> wrote:
> >
> > Hi Yuri,
> >
> > On 16.10.2017 at 16:02, Yuri Astrakhan wrote:
> > > Rory, most of those queries were copied from the current JOSM
> validator
> > > autofixes.  I don't think they were discussed, but they might have
> been
> > > mass applied without much thought by all sorts of editors.
> >
> > Could you please give examples for (a) the mass appliance of these
> rules
> > and (b) rules which have not been discussed but should have been
> > discussed?
> > > There are two ways to use the tool - you can write your own query,
> run it,
> > > and fix whatever it is you want to fix. That's the power user mode
> -
> > > anything goes, no different from JOSM or Level0. And there is
> another one -
> > > where you go to osm wiki, read the instructions, find the task you
> may want
> > > to work on, and go at it.   The community reviews wiki content,
> tags
> > > different pages with different explanation or warning boxes, etc.
> The
> > > discussion could still be on the forum, or here, or in IRC, 
> >
> > Just for future readers: IRC and Telegram channels are no replacement
> > for a mailing list or a forum with a public readable archive where
> you
> can look up the discussions years later.

Re: [OSM-talk] New OSM Quick-Fix service

2017-10-13 Thread Yuri Astrakhan
Simon, thanks for the constructive criticism, as it allows improvements
rather than aggravation. I propose that "rejections" be saved as a new
tag, for example "_autoreject".  In a way, this is very similar to the
"nobot" proposal - users reject a specific bot by hand.

_autoreject will store a semicolon-separated multi-value list.  A query
will contain some "ID", e.g. "amenity-sanatorium", and that ID will be
added to _autoreject whenever the user clicks the "reject suggestion"
button.

Benefits:
* use existing tools to analyse, search, and edit this tag, without
creating anything new
* we can use it inside the queries themselves to say "only suggest to fix X
if the users have rejected Y"; and if someone creates a bad query and most
values are rejected, we can easily find them and clean them up
* very easy to implement, few chances for bugs, no chance of losing
rejection data by accident
* other tools can also use this field to store rejections, e.g. MapRoulette
or Osmose
* query authors can easily search for it to see why rejected objects showed
up in the query result, and fix the original query
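To make the bookkeeping concrete, here is a minimal sketch of maintaining
such a semicolon-separated multi-value tag - my own illustration, not the
tool's actual code; the tags dict and the query ID are only examples:

```python
def add_rejection(tags, query_id, reject_tag="_autoreject"):
    """Record that query_id was rejected for this object, without duplicates."""
    rejected = [v for v in tags.get(reject_tag, "").split(";") if v]
    if query_id not in rejected:
        rejected.append(query_id)
    tags[reject_tag] = ";".join(rejected)

def is_rejected(tags, query_id, reject_tag="_autoreject"):
    """True if this query's suggestion was already rejected on the object."""
    return query_id in tags.get(reject_tag, "").split(";")

tags = {"amenity": "sanatorium"}
add_rejection(tags, "amenity-sanatorium")
add_rejection(tags, "amenity-sanatorium")  # a second click changes nothing
print(tags["_autoreject"])                 # -> amenity-sanatorium
```

Because the value is just a tag, any existing tag-analysis tool can query it.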

The biggest problem is the tag name, any suggestions?


[OSM-talk] New OSM Quick-Fix service

2017-10-13 Thread Yuri Astrakhan
I would like to introduce a new quick-fix editing service.  It allows users
to generate a list of editing suggestions using a query, review each
suggestion one by one, and click "Save" on each change if they think it's a
good edit.

For example, the RU community wants to convert  amenity=sanatorium  ->
leisure=resort + resort=sanatorium.  Clicking on a dot shows a popup with
the suggested edit. If you think the edit is correct, simply click Save.
Try it:  http://tinyurl.com/y8mzvk84
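An accepted suggestion in that example amounts to a one-object tag rewrite.
A sketch of the transformation - my illustration, assuming tags as a plain
dict:

```python
def sanatorium_fix(tags):
    """amenity=sanatorium  ->  leisure=resort + resort=sanatorium."""
    if tags.get("amenity") == "sanatorium":
        del tags["amenity"]
        tags["leisure"] = "resort"
        tags["resort"] = "sanatorium"
        return True   # tags changed; the user reviewed and clicked Save
    return False      # not a match; nothing to do

tags = {"amenity": "sanatorium", "name": "Example"}
sanatorium_fix(tags)
print(tags)  # -> {'name': 'Example', 'leisure': 'resort', 'resort': 'sanatorium'}
```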

I have started a Quick fixes wiki page, where we can share and discuss
quick fix ideas.
* Quick fixes <https://wiki.openstreetmap.org/wiki/Quick_fixes>
* Documentation
<https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service#Quick-fix_Editor>


This is a very new project, and bugs are likely. Please go slowly, and
check the resulting edits. Let me know if you find any problems. Your
technical expertise is always welcome - see the code at
https://github.com/nyurik/wikidata-query-gui . The service has adapted
some code from the Osmose project (thanks!)

TODO:
* Allow multiple edits per changeset
* Show objects instead of the dots
* Allow users to change comment before saving


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-14 Thread Yuri Astrakhan
** UPDATE: **

The service now supports a "reject" button.  To use it, your query must
contain a "#queryId:..." comment.  By default, when a user rejects
something, a tag "_autoreject=id" is created. An object can have multiple
rejected IDs. If the current query was previously rejected, the user will
no longer be able to edit the object with the same query.

Optionally, a query may specify a different rejection tag with
"#rejectTag:...", instead of using the default "_autoreject".  I am
still hoping for a better default name.

Both #rejectTag and #queryId values must consist only of Latin
characters, digits, and underscores.
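That character restriction maps directly to a regular-expression check - a
sketch of my own, not necessarily the service's actual implementation; the
example IDs are hypothetical:

```python
import re

def valid_id(value):
    """Check a #queryId / #rejectTag value: Latin letters, digits, underscores."""
    return re.fullmatch(r"[A-Za-z0-9_]+", value) is not None

print(valid_id("amenity_sanatorium_2017"))  # -> True
print(valid_id("amenity-sanatorium"))       # -> False (hyphen not allowed)
print(valid_id(""))                         # -> False (empty is rejected)
```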

Additionally, the tool no longer allows editing above zoom 16.

Thanks!


On Sat, Oct 14, 2017 at 12:34 AM Yuri Astrakhan <yuriastrak...@gmail.com>
wrote:

> Simon, thanks for the constructive criticism, as it allows improvements
> rather than aggravation. I propose that "rejections" are saved as a new
> tag, for example "_autoreject".  In a way, this is very similar to the
> "nobot" proposal - users reject a specific bot by hand.
>
> _autoreject will store a semicolon-separated multivalue tag.  A query will
> contain some "ID", e.g. "amenity-sanatorium", and that ID will be added to
> the _autoreject whenever user clicks "reject suggestion" button.
>
> Benefits:
> * use existing tools to analyse, search, and edit this tag, without
> creating anything new
> * we can use it inside the queries themselves to say "only suggest to fix
> X if the users have rejected Y", or if someone creates a bad query and most
> values are rejected, we can easily find them and clean them up
> * very easy to implement, few chances for bugs, no chance of losing
> rejection data by accident
> * other tools can also use this field to store rejections, e.g.
> MapRoulette or Osmose.
> * Query authors can easily search for it to see why they showed up in the
> query result, and fix the original query
>
> The biggest problem is the tag name, any suggestions?
>


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-14 Thread Yuri Astrakhan
Hi Michael,

Currently, the tool creates two changeset tags:

  created_by=OSM+Wikidata 0.1
  comment=(generated by the user's query)

What other tags should I add?  I know the name is not very good, and I
should finalize it soon to reduce confusion.  Also, I just bumped the
version to 0.2 and added the reject-tag capability (as described in my
previous email).
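Changeset tags like these are supplied when the changeset is opened via the
standard OSM API 0.6 changeset-create call. A sketch of building that
payload - the comment string below is a placeholder, not the tool's real
output:

```python
from xml.sax.saxutils import quoteattr

def changeset_payload(tags):
    """XML body for the OSM API 0.6 'create changeset' (PUT .../changeset/create)."""
    lines = ["<osm>", " <changeset>"]
    for k, v in tags.items():
        # quoteattr() adds the surrounding quotes and escapes special characters
        lines.append(f"  <tag k={quoteattr(k)} v={quoteattr(v)}/>")
    lines += [" </changeset>", "</osm>"]
    return "\n".join(lines)

print(changeset_payload({
    "created_by": "OSM+Wikidata 0.1",
    "comment": "hypothetical: generated from the user's query",
}))
```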

Thanks!

On Sat, Oct 14, 2017 at 3:45 AM Michael Reichert <osm...@michreichert.de>
wrote:

> Hi Yuri,
>
> On 2017-10-13 at 23:25, Yuri Astrakhan wrote:
> > I would like to introduce a new quick-fix editing service.  It allows
> users
> > to generate a list of editing suggestions using a query, review each
> > suggestion one by one, and click "Save" on each change if they think
> it's a
> > good edit.
> >
> > For example, RU community wants to convert  amenity=sanatorium  ->
> > leisure=resort + resort=sanatorium.  Clicking on a dot shows a popup with
> > the suggested edit. If you think the edit is correct, simply click Save.
> > Try it:  http://tinyurl.com/y8mzvk84
> >
> > I have started a Quick fixes wiki page, where we can share and discuss
> > quick fix ideas.
> > * Quick fixes <https://wiki.openstreetmap.org/wiki/Quick_fixes>
> > * Documentation
> > <
> https://wiki.openstreetmap.org/wiki/Wikidata%2BOSM_SPARQL_query_service#Quick-fix_Editor
> >
>
> Which created_by=* tag does your editor set on the changesets?
>
> Best regards
>
> Michael
>
>
>
>
> --
> Per E-Mail kommuniziere ich bevorzugt GPG-verschlüsselt. (Mailinglisten
> ausgenommen)
> I prefer GPG encryption of emails. (does not apply on mailing lists)
>


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-14 Thread Yuri Astrakhan
Andy, the query works fine; you probably hit it during an update.  Also, I
just updated some queries at Quick_fixes
<https://wiki.openstreetmap.org/wiki/Quick_fixes> - they are now able to
record when a change should be rejected.  The IP address has been
unchanged ever since I announced the OSM+Wikidata service.

Christoph, I looked around Osmose and MapRoulette, and I don't see any such
warnings. Could you elaborate on how you would like these kinds of tools to
promote good editing practices? Any UI ideas? I'll be happy to improve our
tools to make sure they meet community expectations.

On Sat, Oct 14, 2017 at 6:09 AM Christoph Hormann <o...@imagico.de> wrote:

> On Friday 13 October 2017, Yuri Astrakhan wrote:
> > I would like to introduce a new quick-fix editing service.  It allows
> > users to generate a list of editing suggestions using a query, review
> > each suggestion one by one, and click "Save" on each change if they
> > think it's a good edit.
>
> This is a tool to perform automated edits as per the automated edits
> policy.  A responsible developer of such a tool should inform its users
> that making automated edits comes with certain requirements and that
> not following these rules can result in changes being reverted and user
> accounts being blocked.
>
> --
> Christoph Hormann
> http://www.imagico.de/
>


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
Michael, I can only judge by my own experience adding validation autofix
rules - I added a number of Wikipedia tag auto cleanups to JOSM, and they
were reviewed by one or two JOSM developers and merged, probably because
they were deemed benign.  I don't know about the other rules, but I suspect
many of them also went this route.  Should they have been discussed more
widely? I don't know, but that question is complicated, just like the
"what is a local community?" question. What a few devs may see as benign,
others may say needs a discussion, right?

Mass editing is a different matter.  We consider it mass editing when one
person goes out to fix something everywhere in the world.  But when we
provide a tool that automatically fixes something you are looking at, we
don't view it as such.  Or at least we don't view it that way when it
happens in JOSM, but we do when it happens in my new tool. Of course there
is an important difference - JOSM doesn't guide you towards those cases.

I think massive "by-the-way" fixing is far worse than the targeted fix of a
single issue.

When you want to fix a single issue in many places, you become a subject
matter expert.  You know all about that change, how it interacts with other
tags, what to watch out for, how to handle bad values, etc.  For example,
when fixing wikipedia tags, you would see the types of mistakes people
make: wrong prefixes, incorrect URL encodings, hash fragments in URLs,
incorrect multiple values, and so on. When you simply click "fix" because
the JOSM validator tells you it can fix it automatically, you don't have
that knowledge, so it effectively becomes a distributed mechanical edit
without the "reject" capability.  My tool tries to address this - to build
domain experts in a narrow field, and let those experts review changes one
by one. I do not discount the value of local knowledge, but it is not a
panacea - you need both to make intelligent choices, and in some cases the
domain knowledge is more important than knowledge of a specific locale.

On Mon, Oct 16, 2017 at 4:00 PM, Michael Reichert <osm...@michreichert.de>
wrote:

> Hi Yuri,
>
> On 16.10.2017 at 16:02, Yuri Astrakhan wrote:
> > Rory, most of those queries were copied from the current JOSM validator
> > autofixes.  I don't think they were discussed, but they might have been
> > mass applied without much thought by all sorts of editors.
>
> Could you please give examples for (a) the mass appliance of these rules
> and (b) rules which have not been discussed but should have been discussed?
> > There are two ways to use the tool - you can write your own query, run
> it,
> > and fix whatever it is you want to fix. That's the power user mode -
> > anything goes, no different from JOSM or Level0. And there is another
> one -
> > where you go to osm wiki, read the instructions, find the task you may
> want
> > to work on, and go at it.   The community reviews wiki content, tags
> > different pages with different explanation or warning boxes, etc. The
> > discussion could still be on the forum, or here, or in IRC, 
>
> Just for future readers: IRC and Telegram channels are no replacement
> for a mailing list or a forum with a public readable archive where you
> can look up the discussions years later.
>
> Best regards
>
> Michael
>
>
>
> --
> Per E-Mail kommuniziere ich bevorzugt GPG-verschlüsselt. (Mailinglisten
> ausgenommen)
> I prefer GPG encryption of emails. (does not apply on mailing lists)
>
>


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
Richard, thanks for the link and your analysis.

Eric Raymond once said that "Every good work of software starts by
scratching a developer's personal itch."  Judging by how many different
individuals have created various challenges and fixers, there is clearly a
big irritation - highly messy, unclean data.  A corollary is that the
existing tools do not address the entire scope of the problem, thus new
tools keep being created.  BTW, thanks for listing a few tools I haven't
heard about.

A number of people here feel that we cannot trust our users to be diligent.
While I don't like stigmatizing it in such a way, some people sometimes do
make bad edits. The fundamental question we should keep in mind is: "does
the benefit outweigh the risk?"  Or more precisely - which approach
produces the better result? If we do nothing, the data becomes less
consistent, and sporadic unorganized efforts may hinder more than help. If
we do bot editing, corner cases and bugs may spoil valid data. If a
challenge requires manual edits, we have a high risk of typos - people are
not very good at performing the same edit over and over without making
mistakes. If we do distributed accept/reject, some people, in theory, may
be tempted to be trigger-happy.  In each case, the balance is fairly hard
to reach.

In distributed editing, one way to solve the "auto-clicking" problem is a
"multi-reviewer" approach - require two people to agree on an edit before
it happens.  I can fairly easily add that capability.  This way, an expert
editor may use the tool for direct editing (power mode), but the published
challenges will require two-person agreement, unless the community decides
that a specific query is acceptable with just one.  I do not want to make
that decision for each community, as different cases require different
approaches.
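The gate itself is trivial to express - a minimal sketch, assuming
approvals are recorded as (user, verdict) pairs per suggestion; all names
here are hypothetical:

```python
def ready_to_apply(approvals, required=2):
    """Apply an edit only once `required` distinct users approved it,
    and nobody rejected it."""
    voters = {user for user, verdict in approvals if verdict == "approve"}
    vetoed = any(verdict == "reject" for _, verdict in approvals)
    return not vetoed and len(voters) >= required

print(ready_to_apply([("alice", "approve")]))                      # -> False
print(ready_to_apply([("alice", "approve"), ("bob", "approve")]))  # -> True
```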

What do you think?  Would that address the most pressing concern?

On Mon, Oct 16, 2017 at 10:13 AM, Richard Fairhurst <rich...@systemed.net>
wrote:

> Yuri Astrakhan wrote:
> > For example, RU community wants to convert  amenity=sanatorium
> > -> leisure=resort + resort=sanatorium.  Clicking on a dot shows a
> > popup with the suggested edit. If you think the edit is correct, simply
> > click Save.
>
> I've been a bit loth to get involved with this one but I do share the
> general worry.
>
> Editor authors have a general responsibility to encourage good editing
> behaviour in their UI design. It isn't quite as simple as "every tool can
> be
> used for good and bad things": the developer should design the tool to
> encourage the good and discourage (or prevent) the bad. The developers of
> JOSM and, particularly, iD have long been exemplary in this regard.
>
> This new tool can certainly be used for good, and there are use cases for
> which it is ideal, but it's also very easy to misuse. My biggest concern is
> that since it's decoupled from an editing environment, the natural tendency
> is just to click 'Change', 'Change', 'Change' rather than reviewing and
> manually making the changes. (We've seen this behaviour in several
> "challenges" in the past, such as the dupe nodes drive.) OSM is a
> collection
> of human knowledge; this workflow goes too far in removing the human from
> the equation.
>
> As an alternative, could I encourage you to look at something tentative I
> did the other year for that relic of an editor, Potlatch 2?
>
>https://www.openstreetmap.org/user/Richard/diary/28267
>
> This allows a user to navigate instantly between instances of a "challenge"
> within the editor, while benefiting from an external data source to define
> that challenge. The P2 implementation is fairly simple (there's no
> "Resolved" button to feed back to that external source, for example) but
> demonstrates the concept.
>
> If you were to build something along these lines into JOSM or iD, following
> the traditional MapRoulette-like approach of asking users to make the
> change
> rather than automating it, I think you'd get the benefits you're seeking to
> achieve without the potential damage.
>
> Richard
>
>
>
> --
> Sent from: http://gis.19327.n8.nabble.com/General-Discussion-f5171242.html
>


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
I agree that the tool requires some additional work.  It seems almost all
of the criticism has been directed at the hypothetical "community clicking
rampage" - where the query is stored on a wiki, and some user runs it
thoughtlessly. At the same time, several skilled users have expressed a
desire to use the tool for their own work.  Hence, as a compromise between
the two, how about I disable the "embed edit" mode?  If the query is
executed from a link, without the query editor mode, users can only view
results. But in power mode, users can still use the tool to write a query
they need, then test and edit things as needed.  So it's OK to use it as
a power editor (like JOSM or Level0), but not for mass contribution.

In the meantime, I will add the "two-person approval required" mode, which
should alleviate the expressed concern.  It should be ready fairly soon.

On Mon, Oct 16, 2017 at 8:24 PM, Frederik Ramm  wrote:

> Hi,
>
> On 10/16/2017 11:10 PM, Tobias Zwick wrote:
> > Anyway, generally, with everyone raising the alarm about this tool, it
> > would be a friendly gesture to either take the tool offline for now or
> > set it to read-only mode
>
> Or have it run on the dev API.
>
> > So then, the solution is simple: Make the quick-fix tool to only record
> > confirms and rejects into a separate database and let the tool not make
> > actual edits to OSM. The confirms and most importantly the rejects are
> > shown on the tool's interface, so the problems in the automatic query
> > can be addressed.
>
> The "Kort game" has followed a similar approach. When they started, they
> first only recorded things internally and also had more than one user
> confirm each edit. After test-driving that for a while and assessing the
> quality of results, they started a discussion about if and how the Kort
> results could automatically be applied to OSM. It was a slow process but
> one that went to great lengths to respect how OSM works and not to
> "disrupt" anything. The makers of "Kort" probably spent as much time on
> making their tool acceptable to the community as they spent on
> developing it - but that's what you have to do when you deal with humans
> and not just an API.
>
> Bye
> Frederik
>
> --
> Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"
>


Re: [OSM-talk] New OSM Quick-Fix service

2017-10-16 Thread Yuri Astrakhan
Rory, most of those queries were copied from the current JOSM validator
autofixes.  I don't think they were discussed, but they might have been
mass-applied without much thought by all sorts of editors.  What's worse,
there is no way to track those autofixes. The wiki page has a huge warning
box at the top, which should stop accidental misuse.  At this point, there
is no officially agreed wiki page with the "allowed" queries.  Once the
tool matures a bit, we can create a place for the community-approved
tasks.  My proposal: place queries for evaluation on a wiki page under a
warning box, let the community review them, and then move them one by one
to the "green" page.

There are two ways to use the tool - you can write your own query, run it,
and fix whatever it is you want to fix. That's the power user mode -
anything goes, no different from JOSM or Level0. And there is another one -
where you go to osm wiki, read the instructions, find the task you may want
to work on, and go at it.   The community reviews wiki content, tags
different pages with different explanation or warning boxes, etc. The
discussion could still be on the forum, or here, or in IRC,  The tool
cannot automate the review process - if someone wants to break the rules,
they can still write whatever query they want and run it.  Or use JOSM or
Level0.

Just like Éric Gillet said - every tool can be used for good and bad
things. Having the right explanations on the wiki will solve 80% of the
problems.

P.S. You can star any wiki page, and it will email you when the page
changes. Just like a forum.


On Mon, Oct 16, 2017 at 8:42 AM, Rory McCann <r...@technomancy.org> wrote:

> On 16/10/17 14:02, Yuri Astrakhan wrote:
>
>> Rory, thanks, and that's why I think it is a bad idea to do bot edits
>> without first running it through my tool.  If we do a mass edit, we have to
>> go through a very lengthy community consensus study, which might still miss
>> things. Then the bot developer might still make an error that is not likely
>> to be caught for quite some time, until it is very hard to revert. On the
>> other hand, if a query is made, reviewed by community, and later many
>> people try going through it, accepting and rejecting changes, we will know
>> if we caught all the corner cases like the one you just gave. If no one has
>> rejected anything for a long time, a bot can simply pick up the query and
>> finish running it.  Much safer.
>>
>
> I don't see how your tool will stop (say) an American from making this
> sort of assumption and editing. How should community review happen in your
> tool? I'm not going to monitor your wiki page. Automated edits should be
> discussed on the talk or imports mailing list. But I don't think you've
> done that for the queries you've done already, and I'm not sure how your
> programme requires that.
>
>
> As for "community consensus" - TBH, very hard to define.
>>
>
> Agreed.
>


Re: [OSM-talk] Fixing 850+ disambiguation errors

2017-09-09 Thread Yuri Astrakhan
Fabrizio, the easiest way to fix both "wikidata" and "wikipedia" tags is to
use the iD editor's "wikipedia" field (at the top), not the raw "tag" field
(the list at the bottom).  This way, selecting a Wikipedia value from the
dropdown automatically corrects wikidata as well. If using JOSM, use the
Wikipedia plugin - there is a "fetch Wikidata" command.  Thanks!

Oleksiy, this query is slightly more optimal: http://tinyurl.com/ydcv76ph
(since you filter by the Wikipedia tag, there is no need to make it
optional).  You can generate those links by clicking the bottom icon on the
left.

Here's a query to show OSM nodes with Wikidata, where Wikidata has no Image
property. Shows on a map:   http://tinyurl.com/yclpzrua .
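For the curious, the query behind that link has roughly this shape. It is
reconstructed from memory, so the prefixes and predicate names (osmt: for
OSM tags, wdt:P18 for Wikidata's "image" property) are my assumptions about
the service's schema, not verified code:

```python
# Rough shape of "OSM objects with a wikidata tag whose item has no image".
# Predicate names are assumed, not checked against the live endpoint.
NO_IMAGE_QUERY = """
SELECT ?osm ?wd WHERE {
  ?osm osmt:wikidata ?wd .
  FILTER NOT EXISTS { ?wd wdt:P18 ?img . }
}
LIMIT 100
"""
```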

On Sat, Sep 9, 2017 at 11:38 AM, Oleksiy Muzalyev <
oleksiy.muzal...@bluewin.ch> wrote:

> Dear Fabrizio,
>
> You corrected the "wikipedia" tag all right for this church
> https://www.openstreetmap.org/way/175759917
> but the correct "wikidata" tag should be this one:
> https://www.wikidata.org/wiki/Q3671080
>
> but not this one: https://www.wikidata.org/wiki/Q1905745
>
> You can find the correct wikidata page link from the Wikipedia article of
> this church: https://it.wikipedia.org/wiki/it:Chiesa%20di%20San%
> 20Martino%20(Siracusa)?uselang=en
> on the left side of the page there is the link: Wikidata item.
>
> I do not know who added wikidata url of a disambiguation page to the OSM
> map. Perhaps, it was done by some script automatically.
>
> Best regards,
> Oleksiy
>
>
> On 09.09.17 16:52, Fabrizio Carrai wrote:
>
> Solved the 4 issues of it.wikipedia.org, now only 3 are shown by the
> query.
> Can someone check ?
>
> Thanks
>
> ---
> FabC
>
>


Re: [OSM-talk] Fixing 850+ disambiguation errors

2017-09-09 Thread Yuri Astrakhan
Per a request on my user page by Mateusz Konieczny, I wrote a query that
shows "featured" geo-tagged Wikipedia articles without an OSM object. This
way the community can prioritize the more popular locations first.
Instead of "featured", the query can also use article popularity (pageview
counts).

Also, I explain how to access this service via the API.  I really hope JOSM
and MapRoulette will be able to support queries natively.  Any JOSM gurus?

Request - https://wiki.openstreetmap.org/wiki/User_talk:Yurik

P.S. Thanks Shu!

On Sat, Sep 9, 2017 at 8:48 PM, Shu Higashi <higa...@gmail.com> wrote:

> Hi Yuri,
>
> Thanks for offering the tool for checking.
> I corrected 5 of "ja" ones.
>
> Shu Higashi
>
> 2017-09-10 2:58 GMT+09:00, Yuri Astrakhan <yuriastrak...@gmail.com>:
> > Fabrizio, the easiest way to fix both "wikidata" and "wikipedia" tags is
> to
> > use  iD editor's  "wikipedia" field (at the top), but not the "tag" field
> > (list at the bottom).  This way, selecting Wikipedia value from the
> > dropdown automatically corrects wikidata as well. If using JOSM, use
> > Wikipedia plugin - there is a "fetch Wikidata" command.  Thanks!
> >
> > Oleksiy, this query is slightly more optimal http://tinyurl.com/ydcv76ph
> > (since you do filtering by Wikipedia tag, no need to make it optional).
> > You can generate those links by clicking the bottom icon on the left.
> >
> > Here's a query to show OSM nodes with Wikidata, where Wikidata has no
> Image
> > property. Shows on a map:   http://tinyurl.com/yclpzrua .
> >
> > On Sat, Sep 9, 2017 at 11:38 AM, Oleksiy Muzalyev <
> > oleksiy.muzal...@bluewin.ch> wrote:
> >
> >> Dear Fabrizio,
> >>
> >> You corrected the "wikipedia" tag all right for this church
> >> https://www.openstreetmap.org/way/175759917
> >> but the correct "wikidata" tag should be this one:
> >> https://www.wikidata.org/wiki/Q3671080
> >>
> >> but not this one: https://www.wikidata.org/wiki/Q1905745
> >>
> >> You can find the correct wikidata page link from the Wikipedia article
> of
> >> this church: https://it.wikipedia.org/wiki/it:Chiesa%20di%20San%
> >> 20Martino%20(Siracusa)?uselang=en
> >> on the left side of the page there is the link: Wikidata item.
> >>
> >> I do not know who added wikidata url of a disambiguation page to the OSM
> >> map. Perhaps, it was done by some script automatically.
> >>
> >> Best regards,
> >> Oleksiy
> >>
> >>
> >> On 09.09.17 16:52, Fabrizio Carrai wrote:
> >>
> >> Solved the 4 issues of it.wikipedia.org, now only 3 are shown by the
> >> query.
> >> Can someone check ?
> >>
> >> Thanks
> >>
> >> ---
> >> FabC
> >>
> >>
> >
>


Re: [OSM-talk] Fixing 850+ disambiguation errors

2017-09-10 Thread Yuri Astrakhan
Thanks!  Worry not, I just added more for fixing, extracted from the
wikipedia tag using the "fetch Wikidata" JOSM tool.  And there are 50K more
to add, judging by the difference between key:wikipedia and key:wikidata in
taginfo - I'm sure there are many more errors hiding behind the
hard-to-process wikipedia tag.

Also, if you have some time, please take a look at the other quality
control queries in the examples.  Again, thanks for helping!!!

On Sun, Sep 10, 2017 at 3:51 AM, Oleksiy Muzalyev <
oleksiy.muzal...@bluewin.ch> wrote:

> Good morning Yuri,
>
> Thank you!
>
> I've corrected all the errors for the "uk", "ru" versions of the
> Wikipedia, for the "it" version I corrected the remaining erroneous
> wikidata tags and there are no more errors for the "it" either. I also
> corrected some errors for the "fr" and "de" versions.
>
> As a reminder: to display the errors, go to the page:
> http://88.99.164.208/wikidata/
> then paste and run the following query in the upper-right box:
>
> SELECT ?osmId ?wdLabel ?wd ?wpTag WHERE {
>
>   ?osmId osmt:wikidata ?wd ;
>  osmt:wikipedia ?wpTag .
>
>   ?wd wdt:P31/wdt:P279* wd:Q4167410 .
>
>   FILTER( STRSTARTS(STR(?wpTag), 'https://fr.wikipedia'))
>   SERVICE wikibase:label { bd:serviceParam wikibase:language "fr" . }
> }
> LIMIT 100
>
> # (replace "fr" in the above query with "de", "ru", or another language
> code of the Wikipedia version).
>
> Best regards,
> Oleksiy
>
>
>
>
> On 09.09.17 19:58, Yuri Astrakhan wrote:
>
> Fabrizio, the easiest way to fix both "wikidata" and "wikipedia" tags is
> to use  iD editor's  "wikipedia" field (at the top), but not the "tag"
> field (list at the bottom).  This way, selecting Wikipedia value from the
> dropdown automatically corrects wikidata as well. If using JOSM, use
> Wikipedia plugin - there is a "fetch Wikidata" command.  Thanks!
>
> Oleksiy, this query is slightly more optimal http://tinyurl.com/ydcv76ph
> (since you do filtering by Wikipedia tag, no need to make it optional).
> You can generate those links by clicking the bottom icon on the left.
>
> Here's a query to show OSM nodes with Wikidata, where Wikidata has no
> Image property. Shows on a map:   http://tinyurl.com/yclpzrua .
>
> On Sat, Sep 9, 2017 at 11:38 AM, Oleksiy Muzalyev <
> oleksiy.muzal...@bluewin.ch> wrote:
>
>> Dear Fabrizio,
>>
>> You corrected the "wikipedia" tag all right for this church
>> https://www.openstreetmap.org/way/175759917
>> but the correct "wikidata" tag should be this one:
>> https://www.wikidata.org/wiki/Q3671080
>>
>> but not this one: https://www.wikidata.org/wiki/Q1905745
>>
>> You can find the correct wikidata page link from the Wikipedia article of
>> this church: https://it.wikipedia.org/wiki/it:Chiesa%20di%20San%20Martino
>> %20(Siracusa)?uselang=en
>> <https://it.wikipedia.org/wiki/it:Chiesa%20di%20San%20Martino%20%28Siracusa%29?uselang=en>
>> on the left side of the page there is the link: Wikidata item.
>>
>> I do not know who added wikidata url of a disambiguation page to the OSM
>> map. Perhaps, it was done by some script automatically.
>>
>> Best regards,
>> Oleksiy
>>
>>
>> On 09.09.17 16:52, Fabrizio Carrai wrote:
>>
>> Solved the 4 issues of it.wikipedia.org, now only 3 are shown by the
>> query.
>> Can someone check ?
>>
>> Thanks
>>
>> ---
>> FabC
>>
>>
>
>


[OSM-talk] Fixing 850+ disambiguation errors

2017-09-08 Thread Yuri Astrakhan
Hello mappers! Could anyone help with over 800 OSM objects that are
pointing to Wikipedia disambiguation pages?  The Polish community especially -
588 of them. You can get the currently broken ones by running the query below
(you may want to modify it a bit to list more relevant objects).  If you feel
brave, there are also 25,000+ objects with incorrect Wikipedia tags.
Thanks!

Main: OSM disambig query

Direct: http://tinyurl.com/ya4lhmt8

Bad Wikipedia tags - caused by a non-existent title, a redirect, or a
Wikidata item that hasn't been created yet: query


The current statistics for Disambig pages, per corresponding Wikipedia site:
// generated with http://tinyurl.com/yarkdy53

https://pl.wikipedia.org/ 588
https://en.wikipedia.org/ 74
unknown 36
https://de.wikipedia.org/ 24
https://sv.wikipedia.org/ 18
https://cs.wikipedia.org/ 17
https://zh.wikipedia.org/ 8
https://uk.wikipedia.org/ 7
https://vi.wikipedia.org/ 7
https://ru.wikipedia.org/ 5
https://ja.wikipedia.org/ 5
...


Re: [OSM-talk] Fixing 850+ disambiguation errors

2017-09-10 Thread Yuri Astrakhan
Now all the disambig-broken points are on a map. Click a point to fix it.

http://tinyurl.com/ya6htp9f


On Sun, Sep 10, 2017 at 4:04 AM, Yuri Astrakhan <yuriastrak...@gmail.com>
wrote:

> Thanks!  Worry not, I just added more for fixing, by extracting them from
> Wikipedia tag using the "fetch wikidata" JOSM tool.  And there is 50K more
> to add, judging by the difference in key:wikipedia vs key:wikidata in
> taginfo - i'm sure there are many more errors hiding behind the hard to
> process wikipedia tag.
>
> Also, if you have some time, please take a look at the other quality
> control queries in the examples.  Again, thanks for helping!!!
>


[OSM-talk] Fixing wiki* -> brand:wiki*

2017-09-26 Thread Yuri Astrakhan
Here is a query that finds all wikidata IDs frequently used in
"brand:wikidata" and shows OSM objects whose "wikidata" tag points to the
same IDs. I would like to replace all such wikidata/wikipedia tags with the
corresponding brand:wikidata/brand:wikipedia.  Most of them are in India,
but there are some in Europe and other places.  This query can be used
directly from JOSM as well.

http://tinyurl.com/y72afjpy

BTW, this type of query might be good for MapRoulette challenges once they
can work more like Osmose.


Re: [OSM-talk] Fixing OSM wikipedia redirects

2017-09-25 Thread Yuri Astrakhan
You do have a valid point about getting the local community exposure to
Wikidata.  But I see no contradiction between that and my proposal, because
I think it would be very easy to come up with countless Wikipedia/Wikidata
cleanup tasks that require human attention. There is always be plenty of
work.  After my program runs, there will be thousands of items that could
not be resolved as easily. For example, there will be tons of cases when
wikipedia and wikidata point to different entities. Some of them are legit
- e.g. island (wikipedia) vs administrative area (wikidata).   Redirect
resolution would not introduce communities to wikidata, but rather teach
community how to mindlessly click "accept", and I would much rather avoid
that - as this might result in bigger problems when a real decision needs
to be made.

On Tue, Sep 26, 2017 at 1:04 AM, Marc Gemis <marc.ge...@gmail.com> wrote:

> By using Osmose, it would be possible to involve the local
> communities. People would learn about Wikidata, and might start adding
> them to other objects as well. They might even start contributing to
> Wikidata as well.
> By just running your program, you would only fix a small number of
> entries and nobody would know, nobody would bother about them.
>
> I have the feeling that a program can fix some errors in a short
> period, but doesn't bring anything else. Allowing people to fix
> trivial problems, allow them to get familiar with the data, they will
> take some form of ownership and maintain the data and that is more
> beneficial in the long term than an automated quick fix now.
>
> m.
>
> On Tue, Sep 26, 2017 at 5:53 AM, Yuri Astrakhan <yuriastrak...@gmail.com>
> wrote:
> > According to Martijn (of MapRoulette fame), there is no way a challenge
> can
> > link to object IDs. MapRoulette can only highlight location. Nor can I
> > provide a proposed fix, which means someone would have to manually find
> the
> > broken object, navigate to Wikipedia, copy/paste the title, and save the
> > object.  I guesstimate 1 minute per object on average... that's nearly
> 700
> > hours of community time - a huge waste of human brain power that could be
> > spent on a much more challenging and less automatable tasks.
> >
> > Osmose might be a good alternative, and might even lower the total
> number of
> > hours required, but still - would that significantly benefit the project?
> > These tags are just a tiny arbitrary subset of one million
> wikipedia-tagged
> > objects.  Verifying just them by hand seems like a waste of human
> > intelligence. Instead, we can run queries to produce knowingly bad
> objects
> > and let community fix those. I hope we can let machines do mindless
> tasks,
> > and let humans do decision making.  This would improve contributors
> morale,
> > instead of making them feel like robots :)
> >
> > Clarifying: the OSM objects already point to those pages via redirect.
> The
> > redirect information is only stored in Wikipedia.
> >
> > On Mon, Sep 25, 2017 at 11:18 PM, Marc Gemis <marc.ge...@gmail.com>
> wrote:
> >>
> >> or via Osmose ?
> >>
> >> On Tue, Sep 26, 2017 at 5:16 AM, Marc Gemis <marc.ge...@gmail.com>
> wrote:
> >> > what about a Maproulette task ?
> >> >
> >> > On Tue, Sep 26, 2017 at 5:11 AM, Yuri Astrakhan
> >> > <yuriastrak...@gmail.com> wrote:
> >> >> At the moment, there are nearly 40,000 OSM objects whose wikipedia
> tag
> >> >> does
> >> >> not match their wikidata tag. Most of them are Wikipedia redirects,
> >> >> whose
> >> >> target is the right wikipedia article. If we are not ready to abandon
> >> >> wikipedia tags just yet (I don't think we should ATM), I think we
> >> >> should fix
> >> >> them.  Fixing them by hand seems like a huge waste of the community
> >> >> time,
> >> >> when it can be semi-automated.
> >> >>
> >> >> I propose that a small program, possibly a plugin to JOSM, would
> change
> >> >> wikipedia tags to point to the target article instead of the
> redirect.
> >> >>
> >> >> Thoughts?
> >> >>
> >> >> ___
> >> >> talk mailing list
> >> >> talk@openstreetmap.org
> >> >> https://lists.openstreetmap.org/listinfo/talk
> >> >>
> >
> >
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-25 Thread Yuri Astrakhan
Since this thread has not received any new discussion in the past 4 days, I
assumed all points were answered and proceeded as planned, per the mechanical
edit policy. Yet, after I added all the nodes and moved on to
relations, I was blocked by Andy Townsend with the following message.
I believe Andy is acting in the best interest of the project, yet he might
have missed or misread this discussion.  Also, the block is such that I am no
longer able even to reply on the changesets to the questions raised, so I am
moving it here.  I believe I acted in good faith according to the
mechanical edit policy - discussed with the community, and proceeded.

A few interesting semi-relevant statistics so far: the number of
discovered links to disambig pages is now back to over 800, even without the
100k+ untagged ways. And there are almost 38,000 OSM objects whose wikipedia
tag does not correspond to their wikidata tag. The number is very high, but
fixing them should be semi-automated, as most of them are redirects. TBD.

Here's Andy's message, with my replies inlined. I think almost all of these
points were already raised and answered in our previous discussion,
but I feel it is my responsibility to present them again.

You're conducting an import of known bad data (your own changeset comments
> say "Further cleanup will be done using...").
>

Per the previous description, the existing data is already bad; I am simply
making it possible to identify it, after discussing it on this thread.


> You are wilfully ignoring the feedback that you're receiving now and have
> received in the past. A lot of issues have been raised about the quality of
> your edits - see
> http://resultmaps.neis-one.org/osm-discussion-comments?uid=339581 . In
> many cases you seem to agree that you're adding rubbish, and yet you
> continue.
>
You seem to be suggesting (in
> https://lists.openstreetmap.org/pipermail/talk/2017-September/078767.html
> ) that "the community" clean up your mess. This is not the way that
> OpenStreetMap works - if an individual is adding data to it (especially
> large quantities of data) then it is their responsibility to ensure that
> the data that they are adding is valid, or at least as valid as the data
> that is already there.
>

Again, no: I am identifying rubbish, not introducing it, and I am very
actively replying to every comment I receive.  This is not "my data" - the
data is already in OSM in the form of incorrect wikipedia tags. This
action is identical to what the iD editor does - it *automatically* adds the
corresponding wikidata ID, without any additional checks, and without many
users even being aware of it.  The way to improve the quality of this data is
to analyze it with the OSM+Wikidata tool I have built, to see the
mismatches.  Since there are tens (hundreds?) of thousands of issues
already in the database, it is clearly impossible for one person to fix them
all.  The available choices are: doing it by hand myself and fixing a
handful, or making it possible to find the problems so that everyone can fix
them (per Andy Mabbett's explanation).

Please go back and reread some of your previous replies on
> http://resultmaps.neis-one.org/osm-discussion-comments?uid=339581 .
> Things like "I will mostly work on high level objects (admin level <= 6)"
> suggests that you are at the very least being disingenuous in your dealings
> with the OSM community.
>

This was written a long time ago, before this effort even started, and
before I built the tools (OSM+Wikidata) to let the community find issues.
Back then I had to do everything myself, and since that was clearly
impossible, I stopped after fixing the vast majority of the uncovered
issues by hand.


> Please stop this mechanical edit now and instead spend your time
> addressing the issues that have been raised.
>

I believe I have answered this numerous times above and in previous
conversations.  I cannot address the tens of thousands of issues I *find*; I
can only help the community see them, and do my part in fixing them.  Without
this effort, all the bad data in the form of incorrect wikipedia tags will
still be there, quickly rotting away with every Wikipedia page rename.

P.S.  An interesting point was brought up by Andy in a later online chat:

>
> in the case of https://www.openstreetmap.org/changeset/43749373 the
> errors were explicitly introduced by you.  The links from OSM to wikipedia
> were correct, the thing (probably a bot) creating the wikidata from
> wikipedia didn't understand the breadth of what the wikipedia article
> represented, and you incorrectly linked from OSM to the wikidata article.
>

Andy, a Wikidata ID is not correct or incorrect -- it is simply a number
assigned to a Wikipedia article.  That number may have other statements,
which themselves may be incorrect. Adding the Wikidata ID locks the Wikipedia
tag in place, keeping it from going stale - in case the page is renamed,
or in case a disambig is created in its place.  In some cases, the concept
presented in

[OSM-talk] Fixing OSM wikipedia redirects

2017-09-25 Thread Yuri Astrakhan
At the moment, there are nearly 40,000 OSM objects whose wikipedia tag does
not match their wikidata tag. Most of these are Wikipedia redirects whose
target is the right article. If we are not ready to abandon
wikipedia tags just yet (I don't think we should ATM), I think we should
fix them.  Fixing them by hand seems like a huge waste of the community's
time when it can be semi-automated.

I propose a small program, possibly a JOSM plugin, that would change
wikipedia tags to point to the target article instead of the redirect.

Thoughts?
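As a rough illustration of what such a program could do - a sketch only, where the helper names are mine and only the query-API parameters (action=query, redirects=1) are standard MediaWiki:

```python
import urllib.parse

def redirect_query_url(lang, titles):
    """Build a MediaWiki API request that reports redirect targets for a
    batch of titles (standard query API with redirects=1)."""
    params = urllib.parse.urlencode({
        "action": "query",
        "format": "json",
        "redirects": 1,
        "titles": "|".join(titles),
    })
    return f"https://{lang}.wikipedia.org/w/api.php?{params}"

def fix_wikipedia_tag(wp_tag, redirect_map):
    """Rewrite 'lang:Title' to 'lang:Target' when Title is a redirect;
    redirect_map is the from->to mapping parsed from the API response."""
    lang, _, title = wp_tag.partition(":")
    return f"{lang}:{redirect_map.get(title, title)}"
```

The program would fetch each batch, parse the "redirects" list out of the JSON response into the mapping, and rewrite only the tags whose title actually redirects - everything else is left alone.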


Re: [OSM-talk] Fixing OSM wikipedia redirects

2017-09-25 Thread Yuri Astrakhan
According to Martijn (of MapRoulette fame), there is no way a challenge can
link to object IDs; MapRoulette can only highlight a location. Nor can I
provide a proposed fix, which means someone would have to manually find the
broken object, navigate to Wikipedia, copy/paste the title, and save the
object.  I guesstimate 1 minute per object on average... that's nearly 700
hours of community time - a huge waste of human brainpower that could be
spent on much more challenging and less automatable tasks.

Osmose might be a good alternative, and might even lower the total number
of hours required, but still - would that significantly benefit the
project?  These tags are just a tiny, arbitrary subset of one million
wikipedia-tagged objects.  Verifying just them by hand seems like a waste
of human intelligence. Instead, we can run queries that produce knowingly bad
objects and let the community fix those. I hope we can let machines do the
mindless tasks, and let humans do the decision making.  This would improve
contributor morale, instead of making people feel like robots :)

Clarifying: the OSM objects already point to those pages via redirect. The
redirect information is only stored in Wikipedia.

On Mon, Sep 25, 2017 at 11:18 PM, Marc Gemis <marc.ge...@gmail.com> wrote:

> or via Osmose ?
>
> On Tue, Sep 26, 2017 at 5:16 AM, Marc Gemis <marc.ge...@gmail.com> wrote:
> > what about a Maproulette task ?
> >
> > On Tue, Sep 26, 2017 at 5:11 AM, Yuri Astrakhan <yuriastrak...@gmail.com>
> wrote:
> >> At the moment, there are nearly 40,000 OSM objects whose wikipedia tag
> does
> >> not match their wikidata tag. Most of them are Wikipedia redirects,
> whose
> >> target is the right wikipedia article. If we are not ready to abandon
> >> wikipedia tags just yet (I don't think we should ATM), I think we
> should fix
> >> them.  Fixing them by hand seems like a huge waste of the community
> time,
> >> when it can be semi-automated.
> >>
> >> I propose that a small program, possibly a plugin to JOSM, would change
> >> wikipedia tags to point to the target article instead of the redirect.
> >>
> >> Thoughts?
> >>
> >> ___
> >> talk mailing list
> >> talk@openstreetmap.org
> >> https://lists.openstreetmap.org/listinfo/talk
> >>
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-26 Thread Yuri Astrakhan
Yves, yes, they are external IDs. But so are Wikipedia titles.  Visually
inspecting a Wikipedia title does not give you any way to verify its
correctness - you have to look in the external data source (WP).  As for
entering by hand - just as you shouldn't enter Wikipedia articles by hand,
you should copy/paste the ID from the article, or use the autocomplete field
in iD.  So in reality, these two things are nearly the same.  On the other
hand, modern editors rely on an internet connection, which means that an ID
can be shown as text in the user's language, together with other metadata
from Wikidata.  The concept of "internal" vs "external" is not as relevant
now as it was in the past...  (there is only one data source - the internet :))

On Tue, Sep 26, 2017 at 1:43 PM, Yves <yve...@gmail.com> wrote:

> I think that the underlying issue in wikidata tags is that they are
> external IDs. Not human readable, they cannot be entered 'by hand' nor
> verified on the ground.
> Once you accept them in OSM, you can't really complain about bots.
>
> Yves (who still think such UIDs are only needed for the lack of good query
> tools).
>
>
>
> Le 26 septembre 2017 19:08:33 GMT+02:00, Yuri Astrakhan <
> yuriastrak...@gmail.com> a écrit :
>>
>> > p.s. OSM is a community project, not a programmers project, it's about
>>> > people, not software :-)
>>>
>>> It's both.  OSM is first and foremost is a community, but the result of
>> our effort is a machine-readable database.  We are not creating an
>> encyclopedia that will be casually flipped through by humans. We produce
>> data that gets interpreted by software, so that it can render maps and be
>> searchable.  For example, if every person uses their own tag names and ways
>> to record things, the data will have nearly zero value.  We must agree on
>> conventions so that software can understand our results - which is exactly
>> what we have been doing on wiki and in email channels. Any tag and value
>> that cannot be recognized and processed by software is effectively ignored.
>>
>>
>>>   Totally agree. If some script can automatically add new tag
>>> (wikidata) without any actual WORK needed, then it is pointless,
>>> anybody can run an auto-update script.
>>
>>   When ordinary (non geek) mappers do ACTUAL WORK - add wikipedia
>>> data, they add wikipedia link, not wikidata "stuff".
>>>
>>
>> While sand castles may look nice, they don't last very long. When
>> ordinary people add just the Wikipedia article, that link quickly gets
>> stale and become irrelevant and often incorrect. The wikipedia article
>> titles are not stable. They get renamed all the time - there are tens of
>> thousands of them in OSM already that I found.  Often, community renames wp
>> articles because there are more than one meaning, so they create a new
>> article with the same name in its place - a disambig page.  There is no
>> easy way to analyse wikipedia links for content - you cannot easily
>> determine if the wikipedia article is about a person, a country, or a
>> house, which makes it impossible to check for correctness.
>>
>> When I spend half an hour of my time researching which WP article is best
>> for an object, I do not want that effort to be wasted just because someone
>> else puts a disambig page in its place, and I have to redo all my work.
>>
>>   When data consumers want to get a link to corresponding wikipedia
>>> article, doing that with wikipedia[:xx] tags is straightforward. Doing
>>> the same with wikidata requires additional pointless and time
>>> consuming abrakadabra.
>>>
>>
>> no, you clearly haven't worked with any data consumers recently. Data
>> consumers want Wikidata, much more than wikipedia tags - please talk to
>> them. Wikidata gives you the list of wikipedia articles in all available
>> languages, it lets you get multi-lingual names when they are not specified
>> in OSM, it allows much more intelligent searches based on types of objects,
>> it allows quality controls.  The abrakadabra is exactly what one has to do
>> when parsing non-standardized data.
>>
>>>
>>>   Validation of wikipedia tag values can and IS already done using osm
>>> data versus wikipedia-geolocated data extracts/dumps.
>>>
>>> Sure, it can be via dump parsing, but it is a much more complicated than
>> querying.  Would you rather use Overpass turbo to do a quick search for
>> some weird thing that you noticed, or download and parse the dump?  Most
>> people would rather do the former. Here is the same thing - 

Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-26 Thread Yuri Astrakhan
>
> > p.s. OSM is a community project, not a programmers project, it's about
> > people, not software :-)
>
> It's both.  OSM is first and foremost a community, but the result of
our effort is a machine-readable database.  We are not creating an
encyclopedia that will be casually flipped through by humans. We produce
data that gets interpreted by software, so that it can render maps and be
searchable.  For example, if every person used their own tag names and ways
to record things, the data would have nearly zero value.  We must agree on
conventions so that software can understand our results - which is exactly
what we have been doing on the wiki and in email channels. Any tag and value
that cannot be recognized and processed by software is effectively ignored.


>   Totally agree. If some script can automatically add new tag
> (wikidata) without any actual WORK needed, then it is pointless,
> anybody can run an auto-update script.

  When ordinary (non geek) mappers do ACTUAL WORK - add wikipedia
> data, they add wikipedia link, not wikidata "stuff".
>

While sand castles may look nice, they don't last very long. When ordinary
people add just the Wikipedia article, that link quickly goes stale and
becomes irrelevant, and often incorrect. Wikipedia article titles are not
stable. They get renamed all the time - there are tens of thousands of such
cases in OSM already that I found.  Often, the community renames WP articles
because there is more than one meaning, and creates a new article with the
same name in its place - a disambig page.  There is no easy way to analyse
wikipedia links for content - you cannot easily determine if a wikipedia
article is about a person, a country, or a house, which makes it impossible
to check for correctness.

When I spend half an hour of my time researching which WP article is best
for an object, I do not want that effort wasted just because someone
else puts a disambig page in its place and I have to redo all my work.
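The "what is this article about" check is exactly what Wikidata's instance-of (P31) claims make possible; a sketch, assuming the entity JSON shape returned by the wbgetentities API (Q4167410, the disambiguation-page class, is the same one used in the queries elsewhere in this thread; the function names are mine):

```python
DISAMBIG = "Q4167410"  # "Wikimedia disambiguation page"

def p31_ids(entity):
    """Pull instance-of (P31) Q-ids out of a Wikidata entity dict,
    using the claim structure returned by action=wbgetentities."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if "id" in value:
            ids.append(value["id"])
    return ids

def links_to_disambig(entity):
    """True when the OSM object's wikidata tag points at a disambig page."""
    return DISAMBIG in p31_ids(entity)
```

The same mechanism extends to any other "bad type" (person, event, brand) by swapping the Q-id set - something a bare wikipedia title cannot offer.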

  When data consumers want to get a link to corresponding wikipedia
> article, doing that with wikipedia[:xx] tags is straightforward. Doing
> the same with wikidata requires additional pointless and time
> consuming abrakadabra.
>

No, you clearly haven't worked with any data consumers recently. Data
consumers want Wikidata much more than wikipedia tags - please talk to
them. Wikidata gives you the list of wikipedia articles in all available
languages, it lets you get multilingual names when they are not specified
in OSM, it allows much more intelligent searches based on types of objects,
and it allows quality control.  The abrakadabra is exactly what one has to do
when parsing non-standardized data.

>
>   Validation of wikipedia tag values can and IS already done using osm
> data versus wikipedia-geolocated data extracts/dumps.
>
Sure, it can be done via dump parsing, but that is much more complicated than
querying.  Would you rather use Overpass turbo to do a quick search for
some weird thing that you noticed, or download and parse a dump?  Most
people would rather do the former. The same applies here - you *could* do
validation via a dump, but the barrier to entry is so high that most people
wouldn't.  With the new OSM+Wikidata tool, which is already getting
hundreds of thousands of requests (!!!), it is possible to get just the data
you need, and fix the problems that have always been present, but hidden.
And all of that is possible because of a single tag.


Re: [OSM-talk] Fixing wiki* -> brand:wiki*

2017-09-27 Thread Yuri Astrakhan
I think we should restart with a definition of the problems we are
(hopefully) trying to solve, or else we might end up too far in the
existential realm - fun to discuss, but best left for another
thread.

* Problem #1:  In my analysis of OSM data, wikipedia tags quickly go stale
because they use Wikipedia page titles, and titles are constantly renamed,
deleted, and - what's worse - old names are reused for new meanings.  This is
a fundamental problem with all Wikipedia tags (wikipedia,
brand:wikipedia, operator:wikipedia, etc.) that needs solving. The solution
does not need to be perfect; it just needs to be better than what we have.

* Problem #2: the *meaning* of the "wikipedia" tag is ambiguous, and
therefore cannot be processed easily. The top three meanings I have seen are:
  a) The WP article is about this OSM feature (a so-called 1:1 match, e.g. a
city, a famous building, ...)
  b) The WP article is about some aspect of this OSM feature, like its
brand, tree species, or the subject of the sculpture
  c) Only a part of the WP article is about this OSM feature, e.g. a WP
list of museums in the area contains a description of this museum.

* Problem #3: data consumers need cleaner, more machine-processable data.
A text label is much more error-prone than an ID: McDonalds vs mcdonalds
vs McDonald's vs ..., so having "brand=mcdonalds" results in many errors.
Note that even if the default OSM map style handles some of these variants
correctly, every other data consumer has to re-implement that logic, so the
more ambiguous something is, the more likely it will result in errors and
data omissions.
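To make the McDonald's example concrete: a label-based lookup has to anticipate every spelling variant, while an ID-based tag does not. The alias table below is purely illustrative (the Q-id is assumed to be McDonald's), but it shows why every consumer ends up re-implementing this logic:

```python
# Purely illustrative alias table; real data has far more variants, and
# every data consumer would have to maintain its own copy of it.
BRAND_ALIASES = {
    "mcdonalds": "Q38076",
    "mcdonald's": "Q38076",
}

def brand_qid(label):
    """Best-effort normalization of a free-text brand label to a Q-id;
    returns None whenever a spelling wasn't anticipated."""
    key = label.strip().lower().replace("\u2019", "'")  # fold curly apostrophe
    return BRAND_ALIASES.get(key)
```

With brand:wikidata=Q38076 in the tag itself, none of this guessing is needed - the lookup table and its inevitable gaps disappear.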

The brand:wikidata discussion is about #1, #2b, and #3.

Are we in agreement that these are problems, or do you think none of them
need solving?


Re: [OSM-talk] Overlapping brands (was "Fixing wiki* -> brand:wiki*")

2017-09-27 Thread Yuri Astrakhan
That's exactly what we are trying to do: add another tag --
brand:wikidata=Q550258.

On Wed, Sep 27, 2017 at 4:10 PM, yvecai  wrote:

> Excuse me, but what does wikidata do in this discussion ?
> If brand=wendy is different than brand=wendy, and if somebody has a
> problem with it, why not change the key, values or add another tag,
> document it and voila ?
> Yves
>
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>


Re: [OSM-talk] Overlapping brands (was "Fixing wiki* -> brand:wiki*")

2017-09-27 Thread Yuri Astrakhan
Martin, you cannot make a general claim based on a single value.  Users can
enter "Aldi", "Aldi Nord", or "Aldi Sud", with different capitalization
and dashes, with or without dots, and god knows what other creative
ways to misspell it. Specifying Q125054 is the same as specifying "Aldi".
If needed/wanted, it could be replaced with a more specific wikidata
entry like Aldi Nord.

On Wed, Sep 27, 2017 at 3:50 PM, Martin Koppenhoefer  wrote:

>
>
> sent from a phone
>
> On 27. Sep 2017, at 17:57, Andy Townsend  wrote:
>
> In Germany both Aldi Nord and Aldi Sud operate, but these tend to be
> tagged in OSM as operator rather than brand, and the only wikidata example
> I can find is https://www.openstreetmap.org/way/25716765
>
>
>
>
> which btw. is another good example of misleading and wrong information via
> wikidata. There’s no indication that there are 2 groups of companies (Aldi
> Nord and Süd), each of which consists of many companies; instead it seems
> Aldi is one single company “GmbH & Co. KG” (property legal form). There’s
> not even the full name, nor a VATIN. It also suggests that the Aldi Nord
> logo is the logo for this Aldi Süd instance.
>
>
> If you tag operator=Aldi Süd there’s no problem, but if you tag
> operator:wikidata=
> Q125054
> you introduce errors and uncertainties.
>
>
> cheers,
> Martin
>


Re: [OSM-talk] Fixing wiki* -> brand:wiki*

2017-09-27 Thread Yuri Astrakhan
Yves, see above - I listed 3 problems that I would like to solve. Do you
agree with them?
-- Dr. Yuri :)

On Wed, Sep 27, 2017 at 2:44 PM, Yves <yve...@gmail.com> wrote:

> I had a look at http://wiki.openstreetmap.org/wiki/Key:brand:wikidata
> Wow.
> So, this tag is about adding an external reference that explains what the
> tag is? Really? This is not a joke?
>
> OSM is sick, please somebody call a doctor.
> Yves
>
>
On Wed, Sep 27, 2017 at 12:14 PM, Yuri Astrakhan <yuriastrak...@gmail.com> wrote:

> I think we should re-start with the definition of the problems we are
> (hopefully) trying to solve, or else we might end up too far in the
> existential realm, which is fun to discuss, but should be left for another
> thread.
>
> * Problem #1:  In my analysis of OSM data, wikipedia tags quickly go stale
> because they use Wikipedia page titles, and titles are constantly renamed,
> deleted, and what's worse - old names are reused for new meanings.  This is
> a fundamental problem with all Wikipedia tags, such as wikipedia,
> brand:wikipedia, operator:wikipedia, etc, that needs solving. The solution
> does not need to be perfect, it just needs to be better than what we have.
>
> * Problem #2: the *meaning* of the "wikipedia" tag is ambiguous, and
> therefore cannot be processed easily. The top three meanings I have seen are:
>   a) This WP article is about this OSM feature (a so called 1:1 match,
> e.g. city, famous building, ...)
>   b) This WP article is about some aspect of this OSM feature, like its
> brand, tree species, or subject of the sculpture
>   c) Only a part of this WP article is about this OSM feature, e.g. a WP
> list of museums in the area contains description of this museum.
>
> * Problem #3: data consumers need cleaner, more machine-processable data.
> The text label is much more error prone than an ID:  McDonalds vs mcdonalds
> vs McDonald's vs ..., so having "brand=mcdonalds" results in many errors.
> Note that just because OSM default map skin may handle some of them
> correctly, each data consumer has to re-implement that logic, so the more
> ambiguous something is, the more likely it will result in errors and data
> omissions.
>
> The brand:wikidata discussion is about #1, #2b, and #3.
>
> Are we in agreement that these are problems, or do you think none of them
> need solving?
>
>


Re: [OSM-talk] Overlapping brands (was "Fixing wiki* -> brand:wiki*")

2017-09-27 Thread Yuri Astrakhan
Marc, I think you are confusing the goal and the means to get there.  I
agree - the goal is to be able to globally find all Wendy's, so that when I
travel, I still can search for familiar brands.  So the same brand should
have the same ID everywhere.  That ID can be either textual or numeric.
Both approaches have pros & cons.  The ID can be defined inside OSM - in
which case we must have a globally-coordinated effort to standardize and
document them - e.g. on a wiki, or we can use external IDs.  We already use
some external IDs, like the ISO-defined "USA" or "TX", but Wikidata is
clearly a less formal, user-contributed system. I am not sure the OSM
community should duplicate the Wikipedia community's effort of defining
concepts. Look at the "denomination" tag - OSM has a long list of values for
it, but most of them are links to Wikipedia.

Lastly, let's not confuse what we *store* (DB) with how we enter/display (UI)
a value. Data consumers value cross-referenceable IDs much more. Users, on
the other hand, should see proper text, preferably with extras such as the
company's logo. Wikidata would let you show that logo and a company
description in a dropdown. Plain text wouldn't.

P.S. Snackbar Wendy's should be in OSM, and judging by the media and legal
attention, it should also be in Wikipedia, or at least in Wikidata (which
has a much lower barrier of entry). Searching for it is tricky
- https://en.wikipedia.org/wiki/Goes#Fast_food

P.P.S. I applaud the local Wendy's. I am not a big fan of having identical
food brands in every corner of the globe, but that's a personal taste
preference.

On Wed, Sep 27, 2017 at 2:47 PM, Marc Gemis  wrote:

> >
> > Can anyone think of an example where two unrelated brands share the same
> > name and category of business in the same geographical area?
>
> Is "the same geographical area" relevant ? Why should a data consumer
> use a separate datebase to identify the brand of an item ?
>
> Suppose I want to find all "Wendy's". Why do I need to know that the
> one in The Netherlands does not  belong to the brand found in the US ?
> [1] Shouldn't this be part of the OSM data in some way ?
>
> regards
>
> m
>
> [1] https://www.businessinsider.nl/een-zeeuw-noemde-zn-
> snackbar-wendys-naar-zn-dochter-en-weerstaat-de-gelijknamige-amerikaanse-
> fastfoodgigant/
>


Re: [OSM-talk] Fixing wiki* -> brand:wiki*

2017-09-27 Thread Yuri Astrakhan
>
> That formed no part of the early discussions on how wikidata should
> work? I bowed out when the discussions were going down a path I did not
> find to be at all useful. The current offering is certainly a lot more
> 'organised' than those original discussions.

Getting the initial points across is always a process. Hard to get it right
from the start :)  I hope we can progress in a more organized and
beneficial way.


> I WOULD still like to see a
> storage model that allows third party lists to be managed and cross
> referenced, but that does not fit the wikidata model. It is why I think
> 'another' cross-reference tool may be more appropriate, with OSM and
> wikipedia/wikidata simply being sources.

I am not exactly sure what you mean here. What goals do you have in mind
that cannot be stored with the current system?


> THAT requires OSM to have a
> 'unique id' one can use to cross reference though :(

If a third-party list references OSM objects, then any time a new object is
added or an existing one is changed, that third-party list needs to be
updated. Generally you don't want that. It would also require a substantial,
fundamental change to OSM's data structure and social dynamics - the "ID"
would have to be placed above all else, as it is in Wikidata. The meaning of
an ID should never change, and merging two IDs should leave a redirect from
one to the other. I doubt that is easily achievable.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-30 Thread Yuri Astrakhan
Verifiability is critical to OSM's success, but it does not mean everything
must be verifiable by visiting the physical location. Tags like "wikipedia",
"wikidata", "url", "website" and some IDs cannot be verified that way: you
must visit some external website to validate them. Stopping by Yellowstone
National Park or a statue in the middle of a city may tell you its national
registration number, but most likely you will have to visit some government
website. Seeing some complex URL tells you nothing about its correctness
unless you visit that website.

Yet we are not talking about the last two examples. Node 153699914 has
wikipedia="Eureka, Wisconsin". It looks fine to a casual examiner, but in
reality it is a garbage link to a disambiguation page - a list of 3 different
places - which you wouldn't know unless you visited the external site,
Wikipedia. I have uncovered many thousands of such cases, and many of them
have already been fixed thanks to a stronger ID system. Yet every day there
are more of them - because Wikipedia keeps renaming things, and several
people refuse to allow Wikidata IDs.

Wikipedia created a stable ID system for these pages. It's called Wikidata.
Please view Wikidata first and foremost as a linking system to Wikipedia
articles. It is NOT perfect. It has many issues. But it is simply much
better than linking to Wikipedia articles by their names, because the links
don't break as often.

Andy, you keep saying Wikidata is not verifiable data - but that's because
you keep insisting on separating it from Wikipedia. We can already make it
so that when you click on Wikidata link, you are taken directly to
Wikipedia. The statements on Wikidata entries are a major bonus for
automated verification and other things, but it should be viewed in
addition to the redirecting capability, not as a replacement to Wikipedia
pages.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-01 Thread Yuri Astrakhan
Christoph, I am not talking about OSM or Wikidata or Wikipedia quality or
approaches. Please don't read more into it than what I am trying to state.

If we say that we want OSM objects to link to Wikipedia (and we clearly do,
judging by the number of wikipedia tags people have created), we need a
good way to do it.

Linking to Wikipedia by page title is bad: it is not stable. Wikidata tags
fix that. No other claim is being made here.

On Sun, Oct 1, 2017 at 5:06 AM, Christoph Hormann <o...@imagico.de> wrote:

> On Sunday 01 October 2017, Yuri Astrakhan wrote:
> >
> > Wikipedia created a stable ID system for these pages. Its called
> > Wikidata. Please view Wikidata as first and foremost a linking system
> > to Wikipedia articles.  [...]
> >
> > Andy, you keep saying Wikidata is not verifiable data - but that's
> > because you keep insisting on separating it from Wikipedia.
>
> Wow, I had to read this twice to really believe what I was reading here.
> Seems you are still in deep denial about the fundamental differences
> between OSM and Wikipedia.
>
> Wikipedia is based on secondary sources; it rejects original research.
> Therefore you can find a lot of nonsense on Wikipedia - all kinds of
> urban legends and things like that, especially about remote areas, as
> long as everyone believes them and no one bothers to prove them wrong
> and rebut them outside of Wikipedia.  So in a way Wikipedia documents
> society's current beliefs about the world, not the world itself.  This
> does not necessarily have to go as far as an article about something
> fictitious claiming to be about a real-world thing; often it's smaller
> stuff, like X being an object of type Y.  The iconic 'citation needed'
> of Wikipedia is not about the information being in need of actual
> verification as a fact; it is about this information being verified to
> be something well integrated into society's belief system.
>
> OSM is fundamentally different in that because it is based on
> verification by original research.  This does not mean everything in
> OSM holds up to this standard but we aim for this and value information
> that is practically verifiable by local mappers and tagging concepts
> that are targeted at verifiable mapping more than other information
> that people always will keep adding to OSM to some extent despite it
> being non-verifiable.
>
> It also means information in OSM is inherently more variable, because
> what people observe on the ground varies - both because what people see
> depends on their experience and background, and because the appearance
> of reality, especially of natural features, varies over time.  OSM, with
> its original-research focus, lacks the unifying and
> consistency-preserving effect of the filter through secondary sources
> you have in Wikipedia.
>
> What you do when you mechanically 'fix errors' and correct discrepancies
> between tags in OSM that contradict the Wikipedia/Wikidata information
> is you impose the value system of Wikipedia onto OSM.
>
> --
> Christoph Hormann
> http://www.imagico.de/
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-01 Thread Yuri Astrakhan
On Sun, Oct 1, 2017 at 11:12 AM, Tomas Straupis wrote:

> I guess the point is that:
> 1. It's ok to play with some pet tag like wikidata
>
100% agree


> 2. It's not WORK to automatically update one OSM tag according to another
> OSM tag (anybody can do it online/locally/etc). It adds NO value.
>

It adds HUGE value, as has been repeatedly shown. Thanks to Wikidata IDs,
the community was able to see and fix tens of thousands of errors in
Wikipedia tags. The Wikipedia link improvement project is based on it:
https://wiki.openstreetmap.org/wiki/Wikipedia_Link_Improvement_Project
Also, there is no point in doing it online/locally, because that wouldn't
help the community find these errors.

> 3. It is totally unacceptable to introduce the idea that the wikipedia tag
> could be removed at some time, because some other new automatically filled
> tag has been introduced.
>

First, it is always acceptable to introduce and discuss new ideas. Any
ideas. Always. We, as a community, don't have to accept them, but
discussing innovations is always a good thing.  That said, the removal of
the wikipedia tag is NOT being discussed here. We are discussing ways to
improve those tags, because they are currently broken. Badly.


> So if you like the wikidata tag - go ahead and enjoy it, but do not touch
> the wikipedia tag with autoscripts, because people are actually using it.
> Especially when you not only avoid discussing with local communities, but
> ignore active requests from local communities to stay away.
>
Tomas: 1) I don't have autoscripts that touch the wikipedia tag; I use JOSM
to generate wikidata tags because of the benefits they provide. 2) I am
providing a way for the community to fix the broken wikipedia tags. 3) I
have been very actively talking to many communities (in, ru, de, fr, ...).
Also, please elaborate: which community has asked me to stay away??? A very
broad statement, considering that every single community had many members
supporting this effort. And 4) could you elaborate on who uses wikipedia
tags, and how they are being used? It would greatly help to understand the
various use cases for such data.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-01 Thread Yuri Astrakhan
On Sun, Oct 1, 2017 at 1:29 PM, Tomas Straupis <tomasstrau...@gmail.com>
wrote:

> 2017-10-01 20:04 GMT+03:00 Yuri Astrakhan:
> >> 2. It's not WORK to automatically update one OSM tag according to
> >> another OSM tag (anybody can do it online/locally/etc). It adds NO value.
> >
> > It adds HUGE value, as was repeatably shown. Thanks to Wikidata IDs, the
> > community was able to see and fix tens of thousands of errors in
> Wikipedia
> > tags. <...>
>
>   It is mostly because you pushed the effort, not because of the
> "advantage of wikidata". The same fixing had already been done for
> YEARS before your efforts, based on wikipedia tags only.


Tomas, you claimed that "It adds NO value."  This is demonstrably wrong.
You are right that the same fixing was done for years. But until the
wikidata tag, there was no easy way to FIND the errors. About a year ago,
when I first started this project, I created lists of thousands of such
errors, which were very rapidly fixed once they were identified. This was
not possible before.


> And fixing
> wikipedia tags is in no way inferior to your method. Maybe even
> better, because it involves less „geekiness“ - they are more
> understandable to larger portion of OSM community.
>
My method is for finding broken wikipedia tags. What method are you
talking about? Can you describe what method you use to identify errors?

> >> 3. It is totally unacceptable to introduce idea that wikipedia tag could
> >> be removed at some time, because some other new automatically filled tag
> >> has been introduced.
> >
> > First, it is always acceptable to introduce and discuss new ideas. Any
> > ideas. Always. <...>
>
>   Yes. But when you're told by numerous people numerous times that the
> current mechanism works, and there is nothing BETTER in your advice
> (other than your theoretical ramblings), you cannot advise destroying
> the existing working mechanism.
>

Here is the DATA behind my "theoretical ramblings". Can you show any data
to back yours?
https://commons.wikimedia.org/w/index.php?title=Data:Sandbox/Yurik/OSM_objects_pointing_to_disambigs.tab=history
Now there is a simpler query - http://tinyurl.com/ybv7q7n6 - it used to
return about 1300 results, now down to ~750. And these are JUST the
disambig errors. There are many other types, as I listed in the Wikipedia
improvement project on the OSM wiki. Lastly, what am I proposing to
destroy?!?  I am ADDING a tag and ADDING a new search mechanism, because
there is currently no reliable mechanism to fix these things.


> > We are discussing the way to improve them,
> > because they are currently broken. Badly.
>
>   And they are perfectly being fixed without involving wikidata tags
> there, where people WANT to do that and do WORK to fix them.
>
Do you have any data to back that up? When I first looked at them,
Wikipedia links were often incorrect (see links above). Now they are fixed
thanks to all the work done by the communities. Yes, all the manual work
that people did. But in order to do the WORK, you need to FIND the issues
first.


> > Also, please elaborate which community has asked me to stay away???
>
>   Lithuania. We are in active action on not only fixing wikipedia
> tags, but also adding missing tags to OSM, adding missing coordinates
> to wikipedia, aligning coordinates between OSM and wikipedia etc. For
> YEARS!
>

It is wonderful that you are fixing all these issues - could you tell me
how you find them? Also, funny enough, I used to live in Vilnius a long
time ago, near Gineitiškės. Should I be allowed to edit there? (I hope this
doesn't lead to another huge but unrelated discussion :) )

>
>   When we create a POI detail page, we want to add a link (url without
> redirects) to a wikipedia page. To do that it is straightforward to
> use a value in wikipedia tag.
>

Great, thanks. As you can see, nothing in what I do breaks that. Instead,
it actually helps your POI links to be more accurate.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-01 Thread Yuri Astrakhan
John, it is always good to talk as a data scientist - with numbers and
facts. Here's why matching by coordinates would not work. This query
calculates the distance between OSM nodes and the coordinates that
Wikidata has for those nodes. I only looked at nodes, because ways and
relations are even more incorrect - Wikidata only has a center point. The
results are bucketed by distance (in km) - the bigger the distance, the
bigger the mismatch between OSM and Wikipedia. As you can see, only a
small number of nodes are accurate to 10 meters. Query:
http://tinyurl.com/ybp4tp7a

diff in km   number of nodes
<0.01        75,027
<0.1         131,644
<0.5         147,637
<1           46,891
<2           28,049
<5           10,792
<10          3,537
10+          7,239

Is this a convincing argument for why we should store a Wikipedia/Wikidata
link, as opposed to calculating it?
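For reference, the bucketing in the table above can be reproduced from raw coordinate pairs with a plain great-circle distance. A sketch (the function and constant names are mine, not from the actual query):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# The same thresholds as in the table above, in km.
BUCKETS = (0.01, 0.1, 0.5, 1, 2, 5, 10)

def bucket(d_km):
    """Label a distance with the first bucket it falls under, else '10+'."""
    for b in BUCKETS:
        if d_km < b:
            return f"<{b}"
    return "10+"
```

Feeding each (OSM node, Wikidata coordinate) pair through `bucket(haversine_km(...))` and counting the labels yields a histogram of the mismatch, which is what the table summarizes.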

The other issue is why we need Wikidata links - I have said it many times,
but let me say it again. Because the current system is badly broken, as is
evident from the tens of thousands of errors my approach has uncovered. I
am not advocating deleting the wikipedia tag. Only that when you use the
wikipedia tag, it creates a maintenance burden on the community, and the
community is clearly unable to keep up with the changes on the Wikipedia
side. So instead of using just the bad link (page title), I am advocating
also using a good link (wikidata). We are already using it in 90% of cases.
Why not fill in the last 10%? It does not change anything about how you do
your mapping. It simply helps those who want to fix errors, or view the
corresponding Wikipedia article even if it gets renamed.

On Sun, Oct 1, 2017 at 8:50 PM, john whelan  wrote:

> >Assuming my above arguments has convinced you
>
> No, I still do not see a requirement here, but there again I'm only part
> of the community, and that's the concern: you appear to be ramming this
> down our throats.  As for what iD does or does not do, I don't see that
> it is relevant.
>
> Why does OSM need it and why are you unable to put forth a convincing
> argument that is accepted by the community?   A ninety percent acceptance
> rate will be fine but I'm not seeing it.
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-01 Thread Yuri Astrakhan
On Sun, Oct 1, 2017 at 8:15 PM, john whelan  wrote:

> Since an OSM object has a lat and long value, and it appears that wiki
> whatever also has one, the entries can be linked.
>

Not so. The data very often differs between Wikipedia, Wikidata, and OSM.
Also, the same location could be a square, a famous sculpture within that
square, and a commemorative plaque on the sculpture - and each could have
its own wikipedia/wikidata entry. Matching them up requires humans; it
cannot reliably be done by an algorithm in a large number of cases. Lastly,
if the coordinates differ, you may not copy them from OSM to Wikidata
because of the difference in license.

>
> "This gives you a very simple table with: lat/lon/page_title.
>   No parsing or anything else involved.
>   You then take data from OSM - lat/lon/wikipedia.tag
>   So you have two tables of same structure. Voila. You can compare
> anything (title, coordinates), in any direction with some
> approximation if needed etc. No OSM wikidata involved at all."
>

See above: this cannot be done with any reasonable reliability by automatic
means. You will end up with an incredible amount of unreliable data. Feel
free to discuss deleting both the Wikipedia and Wikidata tags, but I
seriously doubt the community will go for it.

>
> I really don't understand why wikidata needs to be added.  Note the word
> need; I'm missing the requirement that somehow overrides following normal
> OSM practices.
>

Assuming my arguments above have convinced you that we must manually
determine the match between an OSM feature and a Wikipedia article, let's
discuss how best to link to Wikipedia. There are two options: link by
article title, or link by Wikidata ID. The first causes many errors,
because titles get renamed and old titles are reused for other meanings.
The second approach is less readable when looking at the tag, but it is
much more stable. It's as simple as that: one approach causes errors, the
other is stable. Both point to the Wikipedia article, just using a slightly
different URL internally.
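A toy illustration of the two options (the Q-number below is hypothetical, chosen only to show the tag format, and the parsing helper is my own, not an OSM tool):

```python
# Option 1: link by page title. Breaks when "Eureka, Wisconsin" is renamed,
# or when its old title is reused for a disambiguation page.
by_title = {"wikipedia": "en:Eureka, Wisconsin"}

# Option 2: link by Wikidata ID. The ID survives renames, and the current
# article title can always be resolved from it. Q-number is illustrative.
by_id = {"wikidata": "Q1234567"}

def parse_wikipedia_tag(value):
    """Split a wikipedia tag value into (language code, page title)."""
    lang, _, title = value.partition(":")
    return lang, title

print(parse_wikipedia_tag(by_title["wikipedia"]))
```

Both tags describe the same article; the difference is only which key the link can break on.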

Automatically adding Wikidata tags is already being done by iD. I would
like to finish that process, so that the community can clean up all the
mistakes hiding in the OSM database.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-02 Thread Yuri Astrakhan
>
>
>   I will repeat that this is not something which COULD be done; this
> comparison is something that IS ACTUALLY DONE and has been done for
> years.


Tomas, this is what I understand from what you are saying:
* You download the Wikipedia geo-tags dump and generate a table with
latitude, longitude, and wiki page title.
* You generate the same table from OSM for all nodes, ways (using the
geometric centroid?), and relations (using ??).
* You compare article titles between the two, and when OSM has something
that Wikipedia doesn't, you search automatically by geo proximity, or you
let users fix it, or ??
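Restated as code, the comparison I think you mean might look like this (all sample data is invented for illustration):

```python
# Both sides reduced to {page_title: (lat, lon)} tables -- invented data.
wiki_geo = {"Vilnius": (54.687, 25.280), "Kaunas": (54.897, 23.886)}
osm_geo = {"Vilnius": (54.687, 25.279), "Trakai": (54.646, 24.934)}

# Titles present in OSM's wikipedia tags but absent from the Wikipedia
# geo dump are the candidates for "page renamed/deleted, fix by hand".
stale = sorted(set(osm_geo) - set(wiki_geo))
print(stale)
```

Note what this detects and what it cannot: a vanished title shows up, but a title that still exists while pointing at a disambiguation page, a list, or a person looks perfectly healthy to this comparison.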

If I understood you correctly (and please correct me if I did not), it
wouldn't work for the whole planet, simply because the average distance
between what OSM has and what Wikidata has is far too great to be useful.
Maybe Lithuania, being a relatively small area with a very active
community, has been kept in perfect shape (with each geo point identical
in both Wikidata and OSM, which might be a licensing issue), but in the
current state of the world's OSM data, only 17% of nodes are within 10
meters of their Wikidata counterpart. If we count ways and relations, it
drops to 11% -- http://tinyurl.com/ybp4tp7a

In other words, with your approach you can detect when OSM's wikipedia tag
is no longer correct, because the Wikipedia geo dump no longer has it. But
afterwards you have to go and fix it by hand. And this is pretty much the
only operation you can do with this approach. You cannot analyze the tens
of thousands of existing wikipedia tags that point to lists, disambigs,
people, tree species, or places of business - you can only mark them as
"geo missing in Wikipedia".

I took a quick look at the various quality-control queries I built on the
cleanup page. Lithuania does seem pretty clean: only one disambiguation
link at the moment (it has been there for 4 months) -
https://www.openstreetmap.org/node/1717783246 (though both have the same
location) - plus two airports that point to a list -
https://www.openstreetmap.org/node/1042034645 and
https://www.openstreetmap.org/node/1042034660 . None of these issues can be
found with your approach, nor can it detect renames. For the rest of the
world, the situation is much worse.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-10-01 Thread Yuri Astrakhan
On Sun, Oct 1, 2017 at 3:45 PM, Tomas Straupis wrote:

> > Tomas, you claimed that "It adds NO value."  This is demonstrably wrong.
> > You are right that the same fixing was done for years. But until
> > wikidata tag, there was no easy way to FIND them.
>
>   There always was.
>   You simply take wikipedia provided geo-tags dump like
> https://dumps.wikimedia.org/ltwiki/latest/ltwiki-latest-geo_tags.sql.gz
>
>   This gives you a very simple table with: lat/lon/page_title.
>   No parsing or anything else involved.
>   You then take data from OSM - lat/lon/wikipedia.tag
>   So you have two tables of same structure. Voila. You can compare
> anything (title, coordinates), in any direction with some
> approximation if needed etc. No OSM wikidata involved at all.
>

Tomas, this will not work. Matching Wikidata and OSM by coordinates is
useless, because the coordinates differ too much - see the hard data in the
previous email. The only way to make any useful calculation is to analyze
the entirety of the Wikidata graph, merge it with OSM objects, and expose
it to other users so that they can figure out what is right or broken.
That's exactly what my Wikidata+OSM service lets users do.


>   If wikipedia page moves - title is gone from this dump and the new
> one appears on the same coordinates. You can map them very quickly.
> Theoretically you can update OSM data automatically, but usually if
> wikipedia title has changed, it means that something has changed in
> the object on the ground, so maybe something else has to be changed in
> OSM data as well (for example name).
>

Again - not possible, because coordinate matching is mostly useless. Also,
no: usually Wikipedia titles change not because something changed on the
ground, but because of a conflict with a similarly named place somewhere
else. People usually rename the original page to a more specific name and
create a new page in its place listing all the disambiguations. This is
what breaks titles most often. We now have about 800 left (after thousands
already fixed), plus potentially thousands more among objects that have not
been tagged with a wikidata tag yet.

>
> I'm just saying the same could be done without wikidata tags.
>

As explained by me in one of the first emails, and by Andy and a few
others, it cannot be done **as easily**. You can build a complex system if
you have enough disk space (~1TB), do a local resolve of wikipedia ->
wikidata, and build a complex service on top of it. Or you can simply add a
single tag that has already been added in 90% of cases, use an
off-the-shelf query engine to merge the data, and let everyone use it.

>
>   See above. What are the practical advantages of your method?
>   Because theoretically you are taking a set A, creating a new set B
> from A, and then trying to fix A according to B. This is
> logical nonsense :-) There is no point in putting this B into OSM.
> This is temporary data which could be stored in your local "error
> checking" database.
>

Strawman argument :)  For each object that has a wikipedia tag, I use JOSM
to get the corresponding wikidata tag, and upload that data to OSM. The
moment it is uploaded, other systems, such as my Wikidata+OSM service, get
that data. Then the community, without my involvement, can analyze the data
with many different queries and fix all the errors they find. If I hadn't
uploaded the data to OSM, only I would be able to see it, and only I would
be able to fix it. I don't know all the different ways the community may
query the data (I'm already getting hundreds of thousands of queries). It's
a tool that helps the community.

>
>   550 objects globally... Well... :-) You should see from here, that
> the problem is finding people who want to FIX, not finding problems...
>

750 is the number NOW. It used to be many thousands, and they were all
fixed by volunteers - for just the most obvious of queries. There are many
more fixes that need to happen - see the Wikipedia link cleanup project on
the OSM wiki. So once the problems are identified, they get solved. Finding
them is the problem.


> I'm arguing against the idea that the wikipedia tag is outdated or in any
> way worse.


But this is exactly what I have been showing with my data about broken
tags. Do you have any data to say that it is not worse?


> Yes, OSM would not have been born
> without a geek idea, but it would not have reached what it is now if it
> were not easy to understand for non-geeks. The wikidata tag is totally
> non-understandable to non-geeks.
>

Wikidata does not need to be understood by geeks or non-geeks. It's an ID,
and everyone understands that concept; most people don't touch tags they
don't understand - just like Mapillary IDs, or the tons of other local
government IDs. The tools we have, like the iD editor, can easily work with
these IDs without the non-geeks, as you call them, understanding them. The
query system also doesn't need to be understood to be used - you simply
share the link

Re: [OSM-talk] Fixing wiki* -> brand:wiki*

2017-09-27 Thread Yuri Astrakhan
Lester, first and foremost, Wikidata is a system that connects the same
Wikipedia article across different languages. The "read this article in
another language" links on the left side come from Wikidata.  Wikidata has
developed beyond this initial goal, but it remains the only way to identify
Wikipedia articles in a language-neutral way, even if a specific Wikipedia
article is renamed or deleted.

On Wed, Sep 27, 2017 at 2:13 PM, Lester Caine  wrote:

> On 27/09/17 17:40, Andy Mabbett wrote:
> >> Not on a number of articles I've recently been looking at while checking
> >> out the CURRENT wikidata offering. I've not found wikidata id's on the
> >> wikipedia articles I looked at ... but wikidata does seem something I
> >> should perhaps reassess.
> > You not having found them does not mean that they are not there.
> >
> > The only Wikipedia articles with no Wikidata ID are those that are
> > newly created.
> >
> > Otherwise, you will find the "Wikidata item" link in the left-hand
> > navigation pane (using the default desktop view)
>
> AH - so it's stripped when you grab a printed copy of the article :(
> There may be a link to Wikimedia Commons material, but not to wikidata
> material in external links ... that is where I expected to find it ;)
>
> --
> Lester Caine - G8HFL
> -
> Contact - http://lsces.co.uk/wiki/?page=contact
> L.S.Caine Electronic Services - http://lsces.co.uk
> EnquirySolve - http://enquirysolve.com/
> Model Engineers Digital Workshop - http://medw.co.uk
> Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
>
> ___
> talk mailing list
> talk@openstreetmap.org
> https://lists.openstreetmap.org/listinfo/talk
>


Re: [OSM-talk] Overlapping brands (was "Fixing wiki* -> brand:wiki*")

2017-09-27 Thread Yuri Astrakhan
>
> > Specifying Q125054 is the same as specifying "Aldi". If needed/wanted,
> it could be replaced with the more specific wikidata entry like Aldi Nord.
>
> no, it’s not the same, because this wikidata object suggests that there is
> one company, Aldi GmbH & Co. KG, with 2 seats, and one logo.
> Specifying operator:wikipedia=en:Aldi would be similar to specifying
> operator=Aldi and it would be much more precise, because it tells you
> there’s 2 chains of this name, and they are not “supermarkets” but discount
> supermarkets which is very different.
>

Martin, that specific Wikidata item may have some possibly incomplete data
that can easily be fixed, but that's irrelevant. As I keep saying, the
wikidata and wikipedia tags are no different - both point to the same
article in Wikipedia. If you want, think of Wikidata as the redirect system
with stable IDs. We don't mind storing "website" with an arbitrary URL, so
why not store wikipedia with a stable URL?  operator:wikipedia=en:Aldi is
bad simply because "en:Aldi" is frequently renamed. The wikidata tag can
easily be shown as text by all the tools (some already do). I feel like we
are going around the 2nd or 3rd circle by now.


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-27 Thread Yuri Astrakhan
I have been fixing nodes that have wikipedia but no wikidata tags [1], and
even the first two randomly picked nodes had the identical problem - the
article was renamed (twice!) without leaving redirects - node 1136510320

Try it yourself - run the query and see what it points to.
[1]
https://wiki.openstreetmap.org/wiki/Wikipedia_Link_Improvement_Project#Missing_Wikidata_tags

Imre, I think at this point it might be better to have both, just as a
safety check. But I can already see that they get misaligned - articles
keep getting renamed, so we will be stuck mindlessly updating the wikipedia
tag. It feels a bit like busywork for the sake of work, but might be needed
for a bit.

On Wed, Sep 27, 2017 at 6:00 PM, Imre Samu  wrote:

> > I hope everyone realizes that there are Wikidata items for which there
> > is no Wikipedia article.
> > So you cannot always find it via Wikipedia  tags.
> >  And at least JOSM shows a human readable name of a Wikidata item
> > besides the Q-number. I think iD does this as well.
> > m. (who manually adds Wikidata references for Flemish churches after
> creating the Wikidata items).
>
> imho:
> probably you have a local and domain knowledge on the topic of "Flemish
> churches"
> but for me:  wikidata without wikipedia page - is  extremely suspicious
>
> because:
>
> #1.  Sometimes the " nearby" search for geolocated articles/wikidataids is
> not enough
> for example:
> * at least ~28000 churches exist in the wikidata without coordinates:
> http://tinyurl.com/y8nyk9zw
>
> And probably we will also find wikidata cities without coordinates.
>
> #2. And we should be aware of the current "Parallel geo worlds" problem in
> the wikidata[1]
> for example:
> Arad ( major City in Romania ) has 3 wikidata, and we should prefer id
> with wikipedia pages.
> * https://www.wikidata.org/wiki/Q173591 ( with wikipedia pages, linked to
> OSM )
> * https://www.wikidata.org/wiki/Q31886684 (  created by Cebuano import
> [1] ~1 month ago) * https://www.wikidata.org/wiki/Q16898082
>
> [1] wikidata cebuano import problem:
> * https://www.wikidata.org/wiki/Wikidata:Project_chat/Archive/
> 2017/08#Dealing_with_our_second_planet * https://www.wikidata.org/wiki/
> Wikidata:Project_chat/Archive/2017/08#Nonsense_imported_from_Geonames
>
>
> Imre
>
>
>
>
>
>
> 2017-09-27 5:03 GMT+02:00 Marc Gemis :
>
>> On Wed, Sep 27, 2017 at 1:17 AM, Andy Townsend  wrote:
>> > That's simply rubbish.  Tags on an OSM object describe it in the real
>> world.
>> > They should be verifiable.  Whether an OSM object has a wikidata tag on
>> it
>> > is essentially irrelevant as far as OSM is concerned - it's just a
>> primary
>> > key into an external database.  External data consumers might find the
>> data
>> > in that database useful, but they can also get to it via wikipedia tags
>> > (which, being human-readable, are more likely to be maintained), so it's
>> > really not a big deal.
>>
>>
>> I hope everyone realizes that there are Wikidata items for which there
>> is no Wikipedia article. So you cannot always find it via Wikipedia
>> tags.
>>
>> And at least JOSM shows a human readable name of a Wikidata item
>> besides the Q-number. I think iD does this as well.
>>
>> m. (who manually adds Wikidata references for Flemish churches after
>> creating the Wikidata items).
>>


Re: [OSM-talk] Fixing OSM wikipedia redirects

2017-09-26 Thread Yuri Astrakhan
Sarah, my understanding is that MapRoulette does not support it -- I cannot
upload the following:

For Object ID ,  change one set of tags for another -- accept or
decline?

Would be great if I was wrong.

Yes, the community can do it; the question is - should it?  Given two
challenges, one that requires some thought, and the other that requires
clicking yes without thinking, shouldn't we opt for the one that computers
cannot do?  We seem to assume that donated time is free, unlimited, and
has very little value. I feel we should treat donated time as the most
precious and most scarce resource we could get.
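The check being debated here - does a wikipedia tag point at a redirect,
and if so, where does it lead - can be sketched against the standard
MediaWiki query API (`action=query` with `redirects=1`). A minimal sketch,
assuming the standard API response shape; the function names and the sample
payload are illustrative, not MapRoulette's or JOSM's actual code:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://en.wikipedia.org/w/api.php"

def redirect_target(api_response, title):
    """Return the title a redirect resolves to, or the title itself
    if the API reported no redirect for it."""
    for r in api_response.get("query", {}).get("redirects", []):
        if r.get("from") == title:
            return r["to"]
    return title

def fetch_redirect_info(title):
    # Standard MediaWiki query; redirects=1 makes the API resolve
    # redirects and report them under query.redirects.
    params = urlencode({"action": "query", "titles": title,
                        "redirects": 1, "format": "json"})
    with urlopen(f"{API}?{params}") as resp:
        return json.load(resp)

# Illustrative response shape for a renamed article:
sample = {"query": {"redirects": [{"from": "NYC", "to": "New York City"}],
                    "pages": {"111": {"title": "New York City"}}}}
print(redirect_target(sample, "NYC"))    # a redirect: the tag needs updating
print(redirect_target(sample, "Paris"))  # no redirect entry: keep as-is
```

A batch fixer would feed each OSM wikipedia tag through such a lookup and
propose the resolved title as the replacement value.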

On Tue, Sep 26, 2017 at 3:48 AM, Sarah Hoffmann <lon...@denofr.de> wrote:

> On Mon, Sep 25, 2017 at 11:53:03PM -0400, Yuri Astrakhan wrote:
> > According to Martijn (of MapRoulette fame), there is no way a challenge
> can
> > link to object IDs. MapRoulette can only highlight location. Nor can I
> > provide a proposed fix, which means someone would have to manually find
> the
> > broken object, navigate to Wikipedia, copy/paste the title, and save the
> > object.  I guesstimate 1 minute per object on average... that's nearly
> 700
> > hours of community time - a huge waste of human brain power that could be
> > spent on much more challenging and less automatable tasks.
>
> We'd have 40.000 more recently reviewed objects in the database. Given how
> much the quality of the OSM data decays with time, I would consider that
> a welcome boost to overall quality.
>
> And my experience with the OSM community is that there are a lot of people
> who wouldn't consider such a task a waste of time but as a wonderful
> opportunity to relax in the evening. Maproulette has the advantage that
> you can just click away and do one object after the next. I would recommend
> to break the 40.000 objects into local batches of 1000 or 2000 and just
> load it into Maproulette. Add step-by-step instructions on how to fix the
> links and I'm sure you'll be surprised how quickly everything is done.
>
> Kind regards
>
> Sarah
>
> >
> > Osmose might be a good alternative, and might even lower the total number
> > of hours required, but still - would that significantly benefit the
> > project?  These tags are just a tiny arbitrary subset of one million
> > wikipedia-tagged objects.  Verifying just them by hand seems like a waste
> > of human intelligence. Instead, we can run queries to produce knowingly
> bad
> > objects and let community fix those. I hope we can let machines do
> mindless
> > tasks, and let humans do decision making.  This would improve
> contributors
> > morale, instead of making them feel like robots :)
> >
> > Clarifying: the OSM objects already point to those pages via redirect.
> The
> > redirect information is only stored in Wikipedia.
> >
> > On Mon, Sep 25, 2017 at 11:18 PM, Marc Gemis <marc.ge...@gmail.com>
> wrote:
> >
> > > or via Osmose ?
> > >
> > > On Tue, Sep 26, 2017 at 5:16 AM, Marc Gemis <marc.ge...@gmail.com>
> wrote:
> > > > what about a Maproulette task ?
> > > >
> > > > On Tue, Sep 26, 2017 at 5:11 AM, Yuri Astrakhan <
> yuriastrak...@gmail.com>
> > > wrote:
> > > >> At the moment, there are nearly 40,000 OSM objects whose wikipedia
> tag
> > > does
> > > >> not match their wikidata tag. Most of them are Wikipedia redirects,
> > > whose
> > > >> target is the right wikipedia article. If we are not ready to
> abandon
> > > >> wikipedia tags just yet (I don't think we should ATM), I think we
> > > should fix
> > > >> them.  Fixing them by hand seems like a huge waste of the community
> > > time,
> > > >> when it can be semi-automated.
> > > >>
> > > >> I propose that a small program, possibly a plugin to JOSM, would
> change
> > > >> wikipedia tags to point to the target article instead of the
> redirect.
> > > >>
> > > >> Thoughts?
> > > >>


Re: [OSM-talk] Fixing OSM wikipedia redirects

2017-09-26 Thread Yuri Astrakhan
Mark, these cases do happen, if rarely - but do they add value?  Having data
that points to a non-(machine)-verifiable redirect, a redirect that could be
deleted or changed at any moment, is very fragile.

I think these should be resolved to the target, and treated as I described
in [1] - links to wikipedia pages about multiple objects.  This way it will
be clear that the tag points to a subsection/component of an article, and
will be treated accordingly.

[1]
https://wiki.openstreetmap.org/wiki/Wikipedia_Improvement_Tasks#Links_to_Wikipedia_pages_about_multiple_objects

On Tue, Sep 26, 2017 at 2:58 AM, Mark Wagner <mark+...@carnildo.com> wrote:

> On Mon, 25 Sep 2017 23:11:52 -0400
> Yuri Astrakhan <yuriastrak...@gmail.com> wrote:
>
> > At the moment, there are nearly 40,000 OSM objects whose wikipedia
> > tag does not match their wikidata tag. Most of them are Wikipedia
> > redirects, whose target is the right wikipedia article.
>
> What about the ones where the article is a Wikipedia redirect, whose
> target is *almost, but not quite* the right Wikipedia article?  I'm
> thinking about things like neighborhoods of a city, where Wikipedia
> currently has a redirect to the city article.
>
> --
> Mark
>


Re: [OSM-talk] Adding wikidata tags to the remaining objects with only wikipedia tag

2017-09-25 Thread Yuri Astrakhan
Marc, thanks.  I was under the assumption that talk is the global community
list, as it is the most generic in the list, unlike talk-us and
talk-us-newyork. Does it mean that any global proposal would require
talking to hundreds of communities independently, making it impossible to
coordinate, because comments in one community would not be visible to the
other communities? Is there any kind of ambassadorial program?  Also, does
it mean that talk-us doesn't decide anything because there is a
talk-us-newyork?

In this specific case, adding wikidata seemed like a long overdue task,
something that is already happening automatically through the unmonitored
iD feature.

Btw, I looked at the descriptions at
https://lists.openstreetmap.org/listinfo

On Mon, Sep 25, 2017 at 11:14 PM, Marc Gemis  wrote:

> > moving it here.  I believe I acted in good faith according to the
> mechanical
> > edit policy - discussed with the community, and proceeded.
>
> I believe the mechanical edit polity demands that you discuss with the
> *local* community. That means if your edit modifies items in e.g.
> Mexico, Belgium and Japan, you have to discuss your edit with the
> communities in Mexico, Belgium and Japan. This might also mean that
> you have to discuss it via Telegram, Facebook, email, IRC, etc.
> depending on where that local community is.
>
> The talk mailing list is not sufficient.
>
> regards
>
> m.
>


Re: [OSM-talk] Label language on the Default stylesheet

2017-09-25 Thread Yuri Astrakhan
>
> And vice versa: I always wonder how usable a map in Latin alphabet is for
> Chinese or Russian speakers.


I cannot speak for Chinese, but in Russia the Latin alphabet is taught at a
very early age in school. I think that drawing a map with local names in
Latin script should not cause too many problems.


Re: [OSM-talk] OSM Wikidata SPARQL service updated

2017-08-21 Thread Yuri Astrakhan
Sarah, thanks, I created an issue at
https://github.com/osmcode/pyosmium/issues/47

Does this mean I cannot even use the existing node cache file when
processing ways from the minute diff files from pyosmium?

On Mon, Aug 21, 2017 at 4:44 PM, Sarah Hoffmann <lon...@denofr.de> wrote:

> On Sun, Aug 20, 2017 at 11:08:03PM -0400, Yuri Astrakhan wrote:
> > Sarah, how would I set the node cache file to the repserv.apply_diffs()?
> > The idx param is passed to the apply_file() - for the initial PBF dump
> > parsing, but I don't see any place to pass it for the subsequent diff
> > processing.  I assume there must be a way to run .apply_diff() that will
> > download the minute diff file, update node cache file with the changed
> > nodes, and afterwards call my way handler with the updated way
> geometries.
>
> I don't think that is possible yet. For my own projects I have always
> used an explicit instance of the node cache file and read and written
> that manually (using the osmium.index.LocationTable() class). But that
> is not particularly practical. I'll look into adding an idx parameter
> to the replication mechanism when I find a minute. Feel free to open
> a feature request on github to remind me.
>
> Kind regards
>
> Sarah
>
> >
> > Also, I assume you meant dense_file_array, not dense_file_cache. So in my
> > case I would use one of these idx values when calculating way centroid,
> and
> > None otherwise:
> > sparse_mem_array
> > dense_mmap_array
> > sparse_file_array,my_cache_file
> > dense_file_array,my_cache_file
> >
> > Thanks!
> >
> >
> > On Mon, Aug 14, 2017 at 4:31 PM, Sarah Hoffmann <lon...@denofr.de>
> wrote:
> >
> > > On Mon, Aug 14, 2017 at 11:10:39AM -0400, Yuri Astrakhan wrote:
> > > > mmd, the centroids are calculated with this code, let me know if
> there
> > > is a
> > > > better way, I wasn't aware of any issues with the minute data
> updates.
> > > >   wkb = wkbfab.create_linestring(obj)
> > > >   point = loads(wkb, hex=True).representative_point()
> > > > https://github.com/nyurik/osm2rdf/blob/master/osm2rdf.py#L250
> > >
> > > It doesn't look like you have any location cache included when
> > > processing updates, so that's unlikely to work.
> > >
> > > Minutely updates don't have the full node location information.
> > > If a way gets updated, you only get the new list of node ids.
> > > If the nodes have not changed themselves, they are not available
> > > with the update.
> > >
> > > If you need location information, you need to keep a persistent
> > > node cache in a file (idx=dense_file_cache,file.nodecache)
> > > and use that in your updates as well. It needs to be updated
> > > with the fresh node locations from the minutely change files
> > > and it is used to fill the coordinates for the ways.
> > >
> > > Once you have the node cache, you can get the geometries for
> > > updates ways. This is still only half the truth. If a node in
> > > a way is moved around, then this will naturally change the
> > > geometry of the way, but the minutely change file will have
> > > no indication that the way changed. Normally, these changes are
> > > relatively small and for some applications it is good enough
> > > to ignore them (Nominatim, the search engine, does so, for example).
> > > If you need to catch that case, then you also need to keep a
> > > persistent reverse index of which node is part of which way
> > > and for each changed node, update the ways it belongs to.
> > > There is currently no support for this in libosmium/pyosmium.
> > > So you would need to implement this yourself somehow.
> > >
> > > Kind regards
> > >
> > > Sarah
> > >
> > > >
> > > > Your query is correct, and you are right that (in theory) there
> shouldn't
> > > > be any ways without the center point. But there has been a number of
> ways
> > > > with only 1 point, causing a parsing error "need at least two points
> for
> > > > linestring". I will need to add some special handling for that
> > > > (suggestions?).
> > > >
> > > > You can see the error by adding this line:
> > > >OPTIONAL { ?osmId osmm:loc:error ?err . }
> > > > The whole query --  http://tinyurl.com/ydf4qd62  (you can create
> short
> > > urls
> > > > with a button on the left side)
> > > >
> > > > On 

Re: [OSM-talk] OSM Wikidata SPARQL service updated

2017-09-03 Thread Yuri Astrakhan
OSM+WD service updates:  the new examples interface contains just the
OSM-related examples, and they are user-contributable. The osmm:loc
(centroid) is now stored with all objects, including relations, so it is
now easy to see how far Wikidata's coordinates are from OSM's -
http://tinyurl.com/yd97qtp2 . Also, if a query outputs a geo location, it
can be shown on an interactive map, e.g. a map of educational places near
Jersey City, with different colors by type: http://tinyurl.com/y82w6my8 .
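The distance between a Wikidata coordinate and the osmm:loc centroid can
also be sanity-checked offline. A minimal great-circle sketch in plain
Python, assuming WGS84 lat/lon pairs; the function name is mine, not part
of the service:

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude on the equator is roughly 111 km:
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 1))  # -> 111.2
```

Flagging any OSM/Wikidata pair further apart than, say, a kilometre would
surface exactly the misalignments the query above looks for.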

Relation members are now stored as "osmm:has" predicates, linking to the
member object.  Example:  "osmrel:123  osmm:has  osmway:456" - relation
#123 contains way #456.  The role (inner, outer, ...) of that member is
stored as  osmrel:123  osmway:456  "inner"  - meaning relation #123 has an
"inner" way #456 as a member.  This way you can quickly search for all
relation members of an object - { ?osmid  osmm:has  ?member . } - or
examine the actual role of those members.
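As a toy illustration of that triple layout (plain string formatting, not
the actual osm2rdf serializer; only way members are shown, using the
relation #123 / way #456 example above):

```python
def relation_member_triples(rel_id, members):
    """Render the two predicate forms described above for way members:
    membership via osmm:has, plus the member's role as a direct predicate."""
    triples = []
    for way_id, role in members:
        triples.append(f"osmrel:{rel_id} osmm:has osmway:{way_id} .")
        triples.append(f'osmrel:{rel_id} osmway:{way_id} "{role}" .')
    return triples

for t in relation_member_triples(123, [(456, "inner")]):
    print(t)
# osmrel:123 osmm:has osmway:456 .
# osmrel:123 osmway:456 "inner" .
```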

Developers, please help with integrating this new engine into MapRoulette
and JOSM.  Also, the service is still looking for a new permanent home, but
there is hope!  http://88.99.164.208/wikidata

Sarah, thanks for your help! I'm now able to calculate centroids for almost
all of the objects. There are still a few broken way objects out there -
they are not stored as linestrings, so I cannot access them via Python's
bindings - but there are very few of them.  The data will soon be
regenerated with most of the osmm:loc values populated.


Re: [OSM-talk] A thought on bot edits

2017-10-05 Thread Yuri Astrakhan
I like the "bot=no" flag, or a more specific one for a given field -
"name:en:bot=no" - as long as those flags are not added by a bot :)

Would it make sense, given how wikidata* tags have mostly been auto-added
by iD, as well as by users' bot efforts, including my own, to treat wikidata
explicitly as a bot tag?  In a way, it is already being treated as such by
many - why not make it official?

On Tue, Oct 3, 2017 at 4:55 AM Christoph Hormann  wrote:

> On Tuesday 03 October 2017, Frederik Ramm wrote:
> > Did your proposal also extend to geoemtries? You said something about
> > bot:* tags, but if a bot were to orthogonalize an existing building,
> > would it then have to create a copy of that tagged
> > "bot:building=yes"? And how could that be differentiated from a
> > building that originally had building=YES and the bot only lowercased
> > the tag value?
>
> My original idea was only about tags but it could be extended to
> geometries of course - as i sketched in my reply to Martin, which would
> essentially mean creating a copy for the building a bot orthogonalizes
> if the building already has a manual building=yes tag.  If the bot only
> changes the tag the building would remain a normal hand mapped geometry
> but would get a bot:building=yes in addition to the building=YES.
>
> Of course duplicating geometry data would make it much more difficult
> for data users to make decisions about selectively using data and it
> would make it much more difficult for editors to allow mappers to edit
> the data correctly.  This is why i originally suggested this only for
> tags - after all the vast majority of bot edits are tag modifications
> only, geometry edits by bots are technically much more complicated to
> do right so they happen less frequently.
>
> As already said - if this approach is not considered favorably it is
> always possible to use the other method and forbid bots to touch
> anything with a bot=no tag and thereby allow mappers to opt out of bot
> edits on a case-by-case basis.
>
> --
> Christoph Hormann
> http://www.imagico.de/
>


Re: [OSM-talk] Licence compatibility (was Adding wikidata tags to the remaining objects with only wikipedia tag)

2017-10-03 Thread Yuri Astrakhan
Thanks Richard. Understood about coordinates. I suspect that most of the
Wikidata+OSM value is not related to Wikipedia's geolocations, but rather
to multilingual names, object classifications, links to multiple Wikipedia
languages, and the ability to query connected graph data.  To my knowledge,
these are the main use cases of our data consumers.

On Tue, Oct 3, 2017 at 3:25 AM Richard Fairhurst <rich...@systemed.net>
wrote:

> Yuri Astrakhan wrote:
> OpenStreetMap takes and has always taken a whiter-than-white view of
> copyright. We aim to provide a dataset that anyone can use without fear of
> legal repercussions. It is not OSM's role to explore interesting grey areas
> in copyright, nor to push things to the extent that a court case is
> necessary.
>
> It has been settled for many years that we do not take co-ordinates from
> Wikipedia. They are mostly encumbered with Google copyright:
>
>
> https://en.wikipedia.org/wiki/Wikipedia:Obtaining_geographic_coordinates#Google_tools
>
> This is not a new issue and has been mentioned before in connection with
> Wikidata referencing.
>
> Our data is principally hosted in the UK and the OSM Foundation is a
> company
> registered in England & Wales, so as a broad assumption UK law applies
> (which is fairly maximalist on copyright and follows the sweat-of-the-brow
> doctrine) as well as the EU database right, at least until this benighted
> country takes leave of its senses forever and leaves the EU. :(
>
> Follow-ups probably best to legal-talk@.
>
> Richard
>
>
>
> --
> Sent from: http://gis.19327.n8.nabble.com/General-Discussion-f5171242.html
>

