Re: [Talk-ca] [OSM-talk] Redacting 75,000 street names contributed by user chdr

2017-08-27 Thread Stewart C. Russell
I agree with John that many ways flagged by Frederik look like they are
legitimate CanVec imports. In a random sampling of chdr's flagged ways
in Canada, fewer than 15% were created by that user. Some had existing
names cleaned up (e.g. Libersan → Rue Libersan in way 23456048) by chdr.
Perhaps more dodgy are the ones where chdr added a name to an existing
way where none had been before, as there is no change in source tagging
in chdr's version.

Many of chdr's ways have been deleted and replaced by other imports by
other users (see changeset 2386572 for a good example) that reused the
same way ID. So we can't delete these.

Some on Frederik's list (such as way 27877549) weren't named by chdr,
either. So those should stay, too.

The criteria for clearing up chdr's edits in Canada need to be
tightened up a lot before they are implemented.
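
To make the cases above easier to tell apart, here is a minimal Python
sketch of how a flagged way's history could be classified. It is only an
illustration under assumptions: the input shape (a list of versions,
oldest first, each a dict with 'user' and 'tags') and the function name
classify_way are hypothetical, and this is not the process Frederik used.

```python
def classify_way(history, user="chdr"):
    """Classify how `user` relates to the current name of a way.

    `history` is a list of versions, oldest first. Each version is a dict
    with a 'user' string and a 'tags' dict (an assumed, pre-parsed shape).
    """
    current_name = history[-1]["tags"].get("name")
    if current_name is None:
        return "unnamed"

    # Walk back to the start of the unbroken run of versions that carry
    # the current name; that version is the one that introduced it.
    idx = len(history) - 1
    while idx > 0 and history[idx - 1]["tags"].get("name") == current_name:
        idx -= 1

    if history[idx]["user"] != user:
        return "not_named_by_user"
    if idx == 0:
        return "created_with_name"
    if history[idx - 1]["tags"].get("name") is None:
        return "name_added"
    return "name_changed"
```

Assuming a deleted version shows up with empty tags, a way that was
deleted and later re-imported by someone else under the same ID comes
out as "not_named_by_user", because the deletion breaks the run of
versions carrying the name.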

 Stewart


On 2017-08-27 10:26 AM, john whelan wrote:
> In Canada as James has said CANVEC which has been accepted as Open
> Source acceptable to OSM has most street names in Canada.  There are a
> few exceptions locally where the city has renamed streets and these
> changes have not yet been reflected in CANVEC.
> 
> I would suggest that any street names added by chdr in Canada were more
> than likely derived from CANVEC sources thus it is extremely unlikely
> that anyone would claim copyright on them.  I am aware of the issues
> involved in respecting copyright.
> 
> Perhaps other Canadian mappers may have some thoughts, although with a
> todo list in JOSM we could probably repair the damage fairly quickly.
> 
> Cheerio John
> 
> 
> 
> On 27 August 2017 at 09:58, James wrote:
> 
> If we validate via survey say in Canada, will we be able to remove
> the id from the revert list? Canada has Canvec we can reference to
> as well as OpenStreetCam and Mapillary
> 
> On Aug 27, 2017 9:50 AM, "Frederik Ramm" wrote:
> 
> Hi,
> 
> in 2010 I was privately contacted by another OSM user with the
> suspicion that user "chdr" might be copying names from Google maps
> (there were a few "easter eggs" in Oman that were only on Google and not
> in the real world, and they suddenly popped up on OSM). "chdr" was
> contacted at the time, but continued unfazed. In 2013 another mapper
> lodged a complaint with DWG about edits by chdr, and I emailed chdr
> asking him about his sources. At that point chdr stopped mapping. He
> never replied about his sources though, even when I set an ultimatum (of
> 31st August 2013) threatening to remove all names he contributed if he
> can't tell us his source. We do have to assume that all names
> contributed by chdr are copyright violations.
> 
> (chdr has added names all around the world, making a harmless survey
> unlikely.)
> 
> For various reasons I neglected to act on this, and was only reminded
> now, 5 years later, when DWG received a complaint from a user in Brazil
> where chdr has even used "source=google" occasionally. (But as I said,
> the suspicion is that Google was used throughout.)
> 
> I have now compiled a list of all street names that were contributed by
> chdr and are still visible today; we're talking about almost 75,000
> street names world wide. The most affected countries are:
> 
>   18023 "United States of America"
>   16345 "Mexico"
>   15109 "Brazil"
>    6791 "RSA"
>    2802 "Spain"
>    2614 "Australia"
>    1923 "Argentina"
>    1673 "Nigeria"
>    1569 "India"
>    1441 "Canada"
>     954 "Malaysia"
>     744 "Botswana"
>     717 "Philippines"
>     619 "Indonesia"
>     553 "Italy"
>     414 "Turkey"
>     290 "Hungary"
>     284 "Chile"
>     250 "Kenya"
>     127 "Saudi Arabia"
>     107 "Paraguay"
>     106 "Panama"
>     100 "Morocco"
> 
> I've left out those countries with less than 100 affected ways.
> 
> For the US, I can break it down by state:
> 
>    5696 "Arizona"
>    5116 "Texas"
>    2294 "New York"
>    1164 "District of Columbia"
>     740 "Iowa"
>     494 "Colorado"
>     416 "New Jersey"
>     339 "Illinois"
>     268 "Michigan"
>     239 "Pennsylvania"
>     181 "Missouri"
>     147 "Georgia"
>     129 "New Mexico"
>     123 "North Carolina"
>     115 "California"
>     106 "Virginia"
> 
> The breakdown for Mexico:
> 
>    7749 "Baja California"
>    2084 "Puebla"
>    1964 "Chihuahua"
>

Re: [Talk-ca] [OSM-talk] Redacting 75,000 street names contributed by user chdr

2017-08-27 Thread john whelan
In Canada as James has said CANVEC which has been accepted as Open Source
acceptable to OSM has most street names in Canada.  There are a few
exceptions locally where the city has renamed streets and these changes
have not yet been reflected in CANVEC.

I would suggest that any street names added by chdr in Canada were more
than likely derived from CANVEC sources thus it is extremely unlikely that
anyone would claim copyright on them.  I am aware of the issues involved in
respecting copyright.

Perhaps other Canadian mappers may have some thoughts, although with a todo
list in JOSM we could probably repair the damage fairly quickly.

Cheerio John



On 27 August 2017 at 09:58, James wrote:

> If we validate via survey say in Canada, will we be able to remove the id
> from the revert list? Canada has Canvec we can reference to as well as
> OpenStreetCam and Mapillary
>
> On Aug 27, 2017 9:50 AM, "Frederik Ramm" wrote:
>
>> Hi,
>>
>> in 2010 I was privately contacted by another OSM user with the
>> suspicion that user "chdr" might be copying names from Google maps
>> (there were a few "easter eggs" in Oman that were only on Google and not
>> in the real world, and they suddenly popped up on OSM). "chdr" was
>> contacted at the time, but continued unfazed. In 2013 another mapper
>> lodged a complaint with DWG about edits by chdr, and I emailed chdr
>> asking him about his sources. At that point chdr stopped mapping. He
>> never replied about his sources though, even when I set an ultimatum (of
>> 31st August 2013) threatening to remove all names he contributed if he
>> can't tell us his source. We do have to assume that all names
>> contributed by chdr are copyright violations.
>>
>> (chdr has added names all around the world, making a harmless survey
>> unlikely.)
>>
>> For various reasons I neglected to act on this, and was only reminded
>> now, 5 years later, when DWG received a complaint from a user in Brazil
>> where chdr has even used "source=google" occasionally. (But as I said,
>> the suspicion is that Google was used throughout.)
>>
>> I have now compiled a list of all street names that were contributed by
>> chdr and are still visible today; we're talking about almost 75,000
>> street names world wide. The most affected countries are:
>>
>>   18023 "United States of America"
>>   16345 "Mexico"
>>   15109 "Brazil"
>>    6791 "RSA"
>>    2802 "Spain"
>>    2614 "Australia"
>>    1923 "Argentina"
>>    1673 "Nigeria"
>>    1569 "India"
>>    1441 "Canada"
>>     954 "Malaysia"
>>     744 "Botswana"
>>     717 "Philippines"
>>     619 "Indonesia"
>>     553 "Italy"
>>     414 "Turkey"
>>     290 "Hungary"
>>     284 "Chile"
>>     250 "Kenya"
>>     127 "Saudi Arabia"
>>     107 "Paraguay"
>>     106 "Panama"
>>     100 "Morocco"
>>
>> I've left out those countries with less than 100 affected ways.
>>
>> For the US, I can break it down by state:
>>
>>    5696 "Arizona"
>>    5116 "Texas"
>>    2294 "New York"
>>    1164 "District of Columbia"
>>     740 "Iowa"
>>     494 "Colorado"
>>     416 "New Jersey"
>>     339 "Illinois"
>>     268 "Michigan"
>>     239 "Pennsylvania"
>>     181 "Missouri"
>>     147 "Georgia"
>>     129 "New Mexico"
>>     123 "North Carolina"
>>     115 "California"
>>     106 "Virginia"
>>
>> The breakdown for Mexico:
>>
>>    7749 "Baja California"
>>    2084 "Puebla"
>>    1964 "Chihuahua"
>>    1539 "Coahuila"
>>    1161 "Mexico"
>>    1040 "Chiapas"
>>     342 "Tamaulipas"
>>     241 "Sonora"
>>     185 "San Luis Potosi"
>>     129 "New Mexico"
>>
>> and Brazil:
>>
>>   10904 "São Paulo"
>>    2605 "Paraná"
>>     945 "Rio de Janeiro"
>>     270 "Rio Grande do Sul"
>>     154 "Goiás"
>>
>> and South Africa:
>>
>>    4422 "Gauteng"
>>     750 "KwaZulu-Natal"
>>     600 "Eastern Cape"
>>     439 "Western Cape"
>>     400 "Northern Cape"
>>     179 "Mpumalanga"
>>
>> - each time leaving out a couple others under 100.
>>
>> We believe that only names, not geometries have been taken from other
>> maps so we'll remove and redact the names only. In identifying "names
>> contributed by chdr" I took care to really only pick up the names that
>> were introduced by them, not names that were there before, and also when
>> chdr split a way that had a name I will make sure that the newly created
>> way doesn't count as "named by chdr". Additionally, I have ignored those
>> cases where chdr simply performed a TIGER expansion (St->Street etc) of
>> a name that was there before.
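
The exclusion of pure abbreviation expansions described above could be
sketched roughly as follows. This is an illustration only: the
abbreviation table and the function name is_expansion are assumptions
made for the example, not the list or code actually used, and a real
table would need care (for instance, "St" can also stand for "Saint").

```python
# Hypothetical, deliberately tiny abbreviation table for illustration.
ABBREVIATIONS = {
    "St": "Street",
    "Ave": "Avenue",
    "Rd": "Road",
    "Dr": "Drive",
    "Blvd": "Boulevard",
}


def is_expansion(old_name, new_name, table=ABBREVIATIONS):
    """Return True if new_name only spells out abbreviations in old_name."""
    old_tokens = old_name.split()
    new_tokens = new_name.split()
    # A pure expansion keeps the word count and changes at least one word.
    if len(old_tokens) != len(new_tokens) or old_tokens == new_tokens:
        return False
    for old, new in zip(old_tokens, new_tokens):
        if old == new:
            continue
        # Every differing word must be a known abbreviation (with or
        # without a trailing dot) replaced by its long form.
        if table.get(old.rstrip(".")) != new:
            return False
    return True
```

With this check, "Main St" → "Main Street" counts as an expansion and
would be ignored, while "Libersan" → "Rue Libersan" does not, because a
word was added rather than expanded.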
>>
>> My process has two weak points (that I am aware of):
>>
>> 1. It doesn't properly "follow" a chdr-contributed name through way
>> splits performed by other users; if someone has split a way created by
>> chdr, then the name will remain on the bit that was created by this
>> user. This is somewhat unsatisfying but after having manually checked a
>> random sample I think the problem is small enough to be ignored.
>>
>> 2. It is possible that, like with a recent