[Talk-ca] OSM data quality in Canada

2015-06-17 Thread Martijn van Exel
Hello list — 

My name is Martijn van Exel, I am on the OSM US board and work at Telenav. I’ve 
written to this list a few times before, but this time I am doing so with my 
Telenav hat on. Perhaps you know that we have the Scout apps (iOS, Android) 
which run on OSM data. (If you haven’t yet, please give Scout a try some time 
and let me know what you think!)

We are always looking into ways to make significant contributions to OSM, in 
the US, Canada and elsewhere. We’re starting to look into Canada more, and I 
could really use your help with a few key questions:

* What is the imports history, particularly in relation to road network, POIs 
and addresses? (Beyond what’s in the import catalogue page on the wiki, if 
anything)
* What external (government and otherwise) open geospatial data sources are out 
there that have been or may be considered for improving OSM?
* Are there any Canada-specific mapping and tagging conventions?
* Are there any known big (national) issues in the Canadian OSM data? 
(misguided imports / bots, major tagging disputes, that kind of thing)
* Which (other) companies / organizations / government agencies use OSM data 
for Canada?
* Any suggestions for QA tools that would help the community, either existing 
or new?

I’m happy to discuss on-list or off. Thanks!

Martijn
___
Talk-ca mailing list
Talk-ca@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-ca


Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Martijn van Exel
Unrelated, but I noticed that talk-ca is not archived on Nabble yet - this 
makes it hard to share and follow a conversation as a non-subscriber. I don’t 
know what’s involved in adding this list or if anyone would object?

Martijn



Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Andrew MacKinnon
See http://wiki.openstreetmap.org/wiki/CanVec. CanVec data was
converted to OSM format and is stored at
http://ftp2.cits.rncan.gc.ca/OSM/pub/, split into files based
on the National Topographic System. Data was then imported in some
parts of Canada by manually cutting and pasting data from these files
into JOSM. I did this in a large part of southern Ontario and some
other users have done this as well. Importing CanVec data this way and
correcting all the errors is tedious and hasn't been completed for all
of Canada, and I haven't done very much with this for several years.
Before this was done there were more primitive imports done, perhaps
around 2008-2009 or so, and these imports are extremely low quality. I
can't remember which OSM user did this. When OSM was new there was not
much data in OSM, so a lot of imports were done and many of these
imports were poor quality; now that OSM is more mature, imports are
increasingly viewed unfavourably and there is a general attitude that
data should be collected by surveying whenever possible.

It would probably be best to use the newest version of the Geobase
National Road Network (http://www.geobase.ca/) and compare this to the
data in OSM and make corrections that way. Keep in mind that this data
has errors and municipal datasets (where available) are always better
quality.
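
A rough sketch of the name comparison suggested above, in Python. It only compares road names as text; a real check against the National Road Network would also need to match geometry. The function names and the sample lists are made up for illustration and are not taken from Geobase or from any existing tool.

```python
import re


def normalize_name(name: str) -> str:
    """Collapse whitespace runs and ignore case so cosmetic differences do not count."""
    return re.sub(r"\s+", " ", name).strip().casefold()


def missing_from_osm(reference_names, osm_names):
    """Return reference road names that have no normalized match in OSM, sorted."""
    osm_keys = {normalize_name(n) for n in osm_names}
    return sorted(n for n in set(reference_names) if normalize_name(n) not in osm_keys)


# Hypothetical sample data; real input would come from the reference
# dataset and an OSM extract for the same area.
reference = ["Queen Street West", "Bloor Street", "Dundas Street"]
osm = ["queen  street west", "Dundas Street"]

print(missing_from_osm(reference, osm))
```

Names that differ only in case or spacing are treated as matches here, so the output lists only roads that appear to be absent or named differently in OSM.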



Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Andrew MacKinnon
Also see Ordnance Survey Locator Musical Chairs
http://wiki.openstreetmap.org/wiki/OS_Locator_Musical_Chairs and
http://ris.dev.openstreetmap.org/oslmusicalchairs/map for a
tool comparing UK Ordnance Survey data with OSM data,
similar to the TIGER fixup tool.



Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Martijn van Exel
Hi Andrew, 

Thanks for elaborating on the CanVec / Geobase imports! This also raises new 
questions. See below.

 On Jun 17, 2015, at 3:00 PM, Andrew MacKinnon andrew...@gmail.com wrote:
 
 A lot of the data in Canada was imported from CanVec and Geobase,
 some of it by me several years ago. The imported data is pretty poor
 quality in many places. I haven't done much work on this recently, as
 imports have a bad reputation in OSM and I am mostly concerned with
 surveying. For example:
 
 - Some older road data comes from an import which combined CanVec and
 Statistics Canada road names, attempting to match the road names in
 Statistics Canada with roads without names from CanVec, and this data
 is poor quality.

Is this described in more detail anywhere? Are the data / scripts / process 
still available? Which data was poor quality, CanVec or Statistics Canada?

 - Road data in some areas is missing entirely.

This is probably easy to visualize, but do you happen to know where / why?

 - The CanVec address data is low quality, and is often broken - e.g.
 on a tile boundary address ranges will be split in half, and comes
 from several different versions of CanVec.
 - Other CanVec layers such as woods, lakes and so on were imported in
 some areas but not others. Much of this data is low quality.

Was some sort of progress page kept so we could see where certain features were 
imported or not (yet)? Has a followup ever been considered to augment / fix 
these botched / low quality imports? 

 - Some road names have too many spaces e.g. John Street is
 "John  Street". Some address ranges are like that as well.
 - lanes=-1 and surface=unpaved for roads that are really paved in Quebec.
 - Better quality municipal GIS datasets are now available in some
 cities like Toronto, Peel Region and York Region and if they are
 properly licensed, these should be used whenever possible. There
 generally are some minor errors in these datasets, but they are far
 better quality than CanVec/Geobase.

Ah, interesting. Is there already a list of these candidates or would it make 
sense to start one and look into proper licensing?

 
 I really like the TO-FIX Tiger Delta layer at
 http://osmlab.github.io/to-fix/#/task/tigerdelta which matches TIGER
 data with OSM data and tries to find errors. It would be helpful if a
 similar tool were created for Canada.

Obviously I am partial to MapRoulette, but sure, let me check it out, I am sure 
we can come up with something similar for Canada. What would the reference data 
be instead of TIGER?

Again, thanks for your insights, Andrew.

Martijn


Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Andrew MacKinnon
 A lot of the data in Canada was imported from CanVec and Geobase,
some of it by me several years ago. The imported data is pretty poor
quality in many places. I haven't done much work on this recently, as
imports have a bad reputation in OSM and I am mostly concerned with
surveying. For example:

- Some older road data comes from an import which combined CanVec and
Statistics Canada road names, attempting to match the road names in
Statistics Canada with roads without names from CanVec, and this data
is poor quality.
- Road data in some areas is missing entirely.
- The CanVec address data is low quality, and is often broken - e.g.
on a tile boundary address ranges will be split in half, and comes
from several different versions of CanVec.
- Other CanVec layers such as woods, lakes and so on were imported in
some areas but not others. Much of this data is low quality.
- Some road names have too many spaces e.g. John Street is
"John  Street". Some address ranges are like that as well.
- lanes=-1 and surface=unpaved for roads that are really paved in Quebec.
- Better quality municipal GIS datasets are now available in some
cities like Toronto, Peel Region and York Region and if they are
properly licensed, these should be used whenever possible. There
generally are some minor errors in these datasets, but they are far
better quality than CanVec/Geobase.
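
A minimal sketch of how the lanes=-1 / surface=unpaved problem from the list above could be flagged, in Python. It assumes each way's tags are already available as a plain dict of strings; the function name `suspicious_tags` and the sample tag sets are hypothetical. It can only point at candidates, since whether a road is really paved has to be checked by survey or imagery.

```python
def suspicious_tags(tags: dict) -> list:
    """Return human-readable warnings for lane and surface values worth reviewing."""
    problems = []
    lanes = tags.get("lanes")
    if lanes is not None:
        try:
            lane_count = int(lanes)
        except ValueError:
            problems.append(f"lanes={lanes!r} is not an integer")
        else:
            if lane_count < 1:
                problems.append(f"lanes={lanes} is not a positive lane count")
            # Assumption: unpaved next to lanes=-1 is treated as a likely
            # import default, not as a surveyed value.
            if lane_count == -1 and tags.get("surface") == "unpaved":
                problems.append("surface=unpaved next to lanes=-1; verify the surface")
    return problems


# Hypothetical tag sets for two ways.
imported_way = {"highway": "residential", "lanes": "-1", "surface": "unpaved"}
surveyed_way = {"highway": "residential", "lanes": "2", "surface": "asphalt"}

for way in (imported_way, surveyed_way):
    print(way, suspicious_tags(way))
```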

I really like the TO-FIX Tiger Delta layer at
http://osmlab.github.io/to-fix/#/task/tigerdelta which matches TIGER
data with OSM data and tries to find errors. It would be helpful if a
similar tool were created for Canada.



Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Andrew Lester
If this is the consensus, I've been blissfully unaware and the wiki needs to be 
updated. The Canadian tagging guidelines 
(https://wiki.openstreetmap.org/wiki/Canadian_tagging_guidelines#Regional_roadways_.28below_provincially_controlled.29)
recommend using unclassified when not in residential areas, and that's how 
I've been tagging. The CanVec imports generally use residential as you describe, 
which has led to a lot of mis-tagged highways, but I wouldn't call that a 
consensus that this is how we want it to be. It's just how the data 
was imported. I'm gradually re-tagging such highways in my area, but there are a 
lot that need to be fixed across very large areas and not many people working 
on it.

 

Andrew Lester
Victoria, BC, Canada

From: Harald Kliems [mailto:kli...@gmail.com]
Sent: Wednesday, June 17, 2015 1:27 PM
To: Martijn van Exel; talk-ca@openstreetmap.org
Subject: Re: [Talk-ca] OSM data quality in Canada

A few things I can think of:

On Wed, Jun 17, 2015 at 3:13 PM Martijn van Exel m...@rtijn.org wrote:

* Are there any Canada-specific mapping and tagging conventions?

- There seems to be a strong consensus that what elsewhere would be 
highway=unclassified is highway=residential, no matter if the road is in a 
populated area or not.



Re: [Talk-ca] OSM data quality in Canada

2015-06-17 Thread Harald Kliems
A few things I can think of:

On Wed, Jun 17, 2015 at 3:13 PM Martijn van Exel m...@rtijn.org wrote:

 * Are there any Canada-specific mapping and tagging conventions?

- There seems to be a strong consensus that what elsewhere would be
highway=unclassified is highway=residential, no matter if the road is in a
populated area or not.

* Are there any known big (national) issues in the Canadian OSM data?
 (misguided imports / bots, major tagging disputes, that kind of thing)

I believe these mostly affect Quebec, but there are two import problems
that never got systematically fixed, as far as I know:
- CanVec import of highways where lanes=-1 and surface=unpaved.
- CanVec or Geobase import where there is an extra blank between the street
type designation and the name. E.g. Rue__Sherbrooke instead of
Rue_Sherbrooke.
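
The extra-blank problem could be found and repaired with something like the Python sketch below. It assumes the underscores in the example above stand for spaces and that the street names are already available as plain strings; `collapse_spaces` and `names_to_fix` are illustrative names, not part of any existing tool.

```python
import re


def collapse_spaces(name: str) -> str:
    """Replace runs of two or more spaces with one and trim the ends."""
    return re.sub(r" {2,}", " ", name).strip()


def names_to_fix(names):
    """Map each name that contains extra blanks to its cleaned-up form."""
    return {n: collapse_spaces(n) for n in names if collapse_spaces(n) != n}


# Hypothetical sample: the first name has two spaces after "Rue".
sample = ["Rue  Sherbrooke", "Rue Sherbrooke", "Avenue  du  Parc"]

for old, new in names_to_fix(sample).items():
    print(repr(old), "->", repr(new))
```

Returning a mapping of old to new names keeps the change reviewable before anything is uploaded, which seems prudent given how imports are viewed on this list.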

 Harald (now in the US)