[Talk-ca] OSM data quality in Canada
Hello list,

My name is Martijn van Exel. I am on the OSM US board and work at Telenav. I’ve written to this list a few times before, but this time I am doing so with my Telenav hat on. Perhaps you know that we have the Scout apps (iOS, Android), which run on OSM data. (If you haven’t yet, please give Scout a try some time and let me know what you think!)

We are always looking into ways to make significant contributions to OSM, in the US, Canada and elsewhere. We’re starting to look into Canada more, and I could really use your help with a few key questions:

* What is the import history, particularly in relation to the road network, POIs and addresses? (Beyond what’s in the import catalogue page on the wiki, if anything)
* What external (government and otherwise) open geospatial data sources are out there that have been or may be considered for improving OSM?
* Are there any Canada-specific mapping and tagging conventions?
* Are there any known big (national) issues in the Canadian OSM data? (misguided imports / bots, major tagging disputes, that kind of thing)
* Which (other) companies / organizations / government agencies use OSM data for Canada?
* Any suggestions for QA tools that would help the community, either existing or new?

I’m happy to discuss on-list or off.

Thanks!
Martijn

___
Talk-ca mailing list
Talk-ca@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-ca
Re: [Talk-ca] OSM data quality in Canada
Unrelated, but I noticed that talk-ca is not archived on Nabble yet - this makes it hard to share and follow a conversation as a non-subscriber. I don’t know what’s involved in adding this list or if anyone would object?

Martijn

On Jun 17, 2015, at 4:47 PM, Martijn van Exel m...@rtijn.org wrote:
> [...]
Re: [Talk-ca] OSM data quality in Canada
See http://wiki.openstreetmap.org/wiki/CanVec. CanVec data was converted to OSM format and is stored at http://ftp2.cits.rncan.gc.ca/OSM/pub/, split into files based on the National Topographic System. Data was then imported in some parts of Canada by manually cutting and pasting from these files into JOSM. I did this in a large part of southern Ontario, and some other users have done this as well. Importing CanVec data this way and correcting all the errors is tedious and hasn't been completed for all of Canada, and I haven't done very much with this for several years.

Before this was done there were more primitive imports, perhaps around 2008-2009 or so, and these imports are extremely low quality. I can't remember which OSM user did this. When OSM was new there was not much data in it, so a lot of imports were done and many of them were poor quality; now that OSM is more mature, imports are increasingly viewed unfavourably and there is a general attitude that data should be collected by surveying whenever possible.

It would probably be best to use the newest version of the Geobase National Road Network (http://www.geobase.ca/), compare it to the data in OSM, and make corrections that way. Keep in mind that this data has errors and municipal datasets (where available) are always better quality.
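A comparison of reference road data against OSM, as suggested above, could start with a simple name-level delta. The following is only a minimal sketch: it assumes both datasets have already been reduced to plain lists of street names for the same area, and the function names and sample names are hypothetical. A real tool would also need geometry matching, which is not attempted here.

```python
def normalize_name(name):
    """Ignore case and whitespace differences so trivial variants still match."""
    return " ".join(name.split()).casefold()


def name_delta(reference_names, osm_names):
    """Return reference names that have no normalized match in OSM.

    Input order is preserved and repeated reference names are reported once.
    """
    osm_keys = {normalize_name(n) for n in osm_names}
    missing = []
    seen = set()
    for name in reference_names:
        key = normalize_name(name)
        if key not in osm_keys and key not in seen:
            seen.add(key)
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any real dataset.
    reference = ["Rue Sherbrooke", "John Street", "King Street"]
    osm = ["Rue" + " " * 2 + "Sherbrooke", "john street"]
    print(name_delta(reference, osm))
```

Names reported by `name_delta` are only candidates for review: the reference data has errors of its own, so each one would still need checking before any edit is made.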
Re: [Talk-ca] OSM data quality in Canada
Also see Ordnance Survey Locator Musical Chairs (http://wiki.openstreetmap.org/wiki/OS_Locator_Musical_Chairs and http://ris.dev.openstreetmap.org/oslmusicalchairs/map), a tool that compares UK Ordnance Survey data with OSM data, similar to the TIGER fixup tool.

On Wed, Jun 17, 2015 at 7:10 PM, Andrew MacKinnon andrew...@gmail.com wrote:
> [...]
Re: [Talk-ca] OSM data quality in Canada
Hi Andrew,

Thanks for elaborating on the CanVec / Geobase imports! This also raises new questions. See below.

On Jun 17, 2015, at 3:00 PM, Andrew MacKinnon andrew...@gmail.com wrote:

> A lot of the data in Canada was imported from CanVec and Geobase, some of it by me several years ago. The imported data is pretty poor quality in many places. I haven't done much work on this recently, as imports have a bad reputation in OSM and I am mostly concerned with surveying. For example:
>
> - Some older road data comes from an import which combined CanVec and Statistics Canada road names, attempting to match the road names in Statistics Canada with roads without names from CanVec, and this data is poor quality.

Is this described in more detail anywhere? Are the data / scripts / process still available? Which data was poor quality, CanVec or Statistics Canada?

> - Road data in some areas is missing entirely.

This is probably easy to visualize, but do you happen to know where / why?

> - The CanVec address data is low quality and often broken (e.g. address ranges are split in half at tile boundaries), and it comes from several different versions of CanVec.
> - Other CanVec layers such as woods, lakes and so on were imported in some areas but not others. Much of this data is low quality.

Was some sort of progress page kept so we could see where certain features were imported or not (yet)? Has a followup ever been considered to augment / fix these botched / low quality imports?

> - Some road names have too many spaces, e.g. "John Street" is stored as "John  Street" (two spaces). Some address ranges are like that as well.
> - lanes=-1 and surface=unpaved for roads that are really paved in Quebec.
> - Better quality municipal GIS datasets are now available in some cities like Toronto, Peel Region and York Region and if they are properly licensed, these should be used whenever possible. There generally are some minor errors in these datasets, but they are far better quality than CanVec/Geobase.

Ah, interesting. Is there already a list of these candidates or would it make sense to start one and look into proper licensing?

> I really like the TO-FIX Tiger Delta layer at http://osmlab.github.io/to-fix/#/task/tigerdelta which matches TIGER data with OSM data and tries to find errors. It would be helpful if a similar tool were created for Canada.

Obviously I am partial to MapRoulette, but sure, let me check it out. I am sure we can come up with something similar for Canada. What would the reference data be instead of TIGER?

Again, thanks for your insights, Andrew.

Martijn
Re: [Talk-ca] OSM data quality in Canada
A lot of the data in Canada was imported from CanVec and Geobase, some of it by me several years ago. The imported data is pretty poor quality in many places. I haven't done much work on this recently, as imports have a bad reputation in OSM and I am mostly concerned with surveying. For example:

- Some older road data comes from an import which combined CanVec and Statistics Canada road names, attempting to match the road names in Statistics Canada with roads without names from CanVec, and this data is poor quality.
- Road data in some areas is missing entirely.
- The CanVec address data is low quality and often broken (e.g. address ranges are split in half at tile boundaries), and it comes from several different versions of CanVec.
- Other CanVec layers such as woods, lakes and so on were imported in some areas but not others. Much of this data is low quality.
- Some road names have too many spaces, e.g. "John Street" is stored as "John  Street" (two spaces). Some address ranges are like that as well.
- lanes=-1 and surface=unpaved for roads that are really paved in Quebec.
- Better quality municipal GIS datasets are now available in some cities like Toronto, Peel Region and York Region and if they are properly licensed, these should be used whenever possible. There generally are some minor errors in these datasets, but they are far better quality than CanVec/Geobase.

I really like the TO-FIX Tiger Delta layer at http://osmlab.github.io/to-fix/#/task/tigerdelta which matches TIGER data with OSM data and tries to find errors. It would be helpful if a similar tool were created for Canada.

On Wed, Jun 17, 2015 at 4:27 PM, Harald Kliems kli...@gmail.com wrote:
> [...]
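The lanes=-1 / surface=unpaved pattern mentioned above is easy to flag mechanically once way tags are available. This is a minimal sketch, assuming the ways have already been extracted into (id, tags) pairs by some other means; the function name, the input shape and the sample ids are hypothetical.

```python
def find_suspect_ways(ways):
    """Return ids of highway ways carrying both lanes=-1 and surface=unpaved.

    `ways` is an iterable of (way_id, tags) pairs, where `tags` is a dict of
    OSM key/value strings. This input shape is an assumption of the sketch.
    """
    suspects = []
    for way_id, tags in ways:
        if "highway" not in tags:
            continue  # only roads are of interest here
        if tags.get("lanes") == "-1" and tags.get("surface") == "unpaved":
            suspects.append(way_id)
    return suspects


if __name__ == "__main__":
    # Hypothetical sample ways, not real OSM objects.
    sample = [
        (1, {"highway": "residential", "lanes": "-1", "surface": "unpaved"}),
        (2, {"highway": "residential", "lanes": "2", "surface": "asphalt"}),
        (3, {"natural": "wood"}),
    ]
    print(find_suspect_ways(sample))
```

A flagged way is only a candidate: lanes=-1 is never valid, but the surface still has to be confirmed by survey or imagery, since some of these roads may genuinely be unpaved.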
Re: [Talk-ca] OSM data quality in Canada
If this is the consensus, I've been blissfully unaware and the wiki needs to be updated. The Canadian tagging guidelines (https://wiki.openstreetmap.org/wiki/Canadian_tagging_guidelines#Regional_roadways_.28below_provincially_controlled.29) recommend using unclassified when not in residential areas, and that's how I've been tagging. The CanVec imports generally use residential as you describe, which has led to a lot of mis-tagged highways, but I wouldn't say this is a consensus agreement that this is how we want it to be. It’s just how the data was imported. I'm gradually re-tagging such highways in my area, but there's a lot that need to be fixed across very large areas and not many people working on it.

Andrew Lester
Victoria, BC, Canada

From: Harald Kliems [mailto:kli...@gmail.com]
Sent: Wednesday, June 17, 2015 1:27 PM
To: Martijn van Exel; talk-ca@openstreetmap.org
Subject: Re: [Talk-ca] OSM data quality in Canada

> A few things I can think of:
>
> On Wed, Jun 17, 2015 at 3:13 PM Martijn van Exel m...@rtijn.org wrote:
>> * Are there any Canada-specific mapping and tagging conventions?
>
> - There seems to be a strong consensus that what elsewhere would be highway=unclassified is highway=residential, no matter if the road is in a populated area or not.
Re: [Talk-ca] OSM data quality in Canada
A few things I can think of:

On Wed, Jun 17, 2015 at 3:13 PM Martijn van Exel m...@rtijn.org wrote:

> * Are there any Canada-specific mapping and tagging conventions?

- There seems to be a strong consensus that what elsewhere would be highway=unclassified is highway=residential, no matter if the road is in a populated area or not.

> * Are there any known big (national) issues in the Canadian OSM data? (misguided imports / bots, major tagging disputes, that kind of thing)

I believe these mostly affect Quebec, but there are two import problems that never got systematically fixed, as far as I know:

- CanVec import of highways where lanes=-1 and surface=unpaved.
- CanVec or Geobase import where there is an extra blank between the street type designation and the name. E.g. Rue__Sherbrooke instead of Rue_Sherbrooke.

Harald (now in the US)
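The extra-blank problem in street names lends itself to a mechanical check. Below is a minimal sketch, assuming the name values have already been pulled out of the data as plain strings; the function names and the sample values are hypothetical, and the proposed fixes would still need human review before being applied.

```python
import re


def collapse_spaces(value):
    """Replace any run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", value).strip()


def propose_name_fixes(names):
    """Map each name that contains extra blanks to its cleaned-up form."""
    fixes = {}
    for name in names:
        fixed = collapse_spaces(name)
        if fixed != name:
            fixes[name] = fixed
    return fixes


if __name__ == "__main__":
    # Hypothetical sample values; the doubled blank is built explicitly so
    # that it stays visible in the source.
    doubled = "Rue" + " " * 2 + "Sherbrooke"
    for old, new in propose_name_fixes([doubled, "John Street"]).items():
        print(repr(old), "->", repr(new))
```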