And I totally agree.  Because the Stat Can data has come from many sources, the data quality is variable, to put it politely.  The Microsoft data has been shown in the US to also be of variable quality.  I'm not so sure about the NRCan LiDAR data; hopefully it is at least consistent.

If we look at the history of the project then we can get an idea of how we came to be where we are.

First, I wanted to import all the bus stops in Ottawa, because only by importing could you ensure you had all the stops with their reference numbers, but the City of Ottawa Open Data license did not align with OSM.  I was also talking to Treasury Board (TB) and explained to them that their Open Data license version 1 didn't align with OSM, so we couldn't use their data.  Five years later TB released their Open Data license version 2, which they felt did align.

Stat Can has two types of project: pilot ones and ones that earn money.  The original pilot was based on Ottawa and Gatineau and was for two years.  Their original plan was mapathons using iD.  I was impressed when a Stat Can employee managed to accurately map a building using iD during a presentation.  I hadn't thought it was possible after some of the efforts I'd seen in HOT mapping.  Fine, except that it requires a lot of mappers, and I think Fredrick has commented that this sort of mapping with new mappers needs a lot of clean-up effort, sometimes more than an experienced mapper would need to map it right in the first place.  Montreal has identified that there just aren't enough experienced mappers available.

I worked at Stats Canada for a number of years.  The corporate culture is very different to OSM.  It makes its money by selling data.  Want to open a new coffee bar?  Stats Canada will combine its data to sell you the ideal spot based on residents' income etc.  I had a meeting with Stats Canada, the City of Ottawa planning department, an Open Data specialist from Carleton University, someone from Metrolink who had added data to OpenStreetMap to help people find the nearest bus stop, and a couple of HOT board members.  We convinced Stats Canada to change the direction of the pilot to use Open Data rather than go the mapathon route, partly for data quality reasons and partly because I didn't think we could find the mappers to map the buildings completely.  The Stat Can involvement meant the City of Ottawa was persuaded to change its Open Data license to the same as the TB one.  That took time and had to go to council for approval.  There was a lot of discussion with the local community, and it was they who organised and did the import.  The local group worked nicely together and had a range of skill sets in the group.  I actually played more of a connecting role than anything else.

The import was challenged on the data license, amongst other things, but eventually the OSM Legal Working Group was very kind and ruled the license was acceptable.  Stats is very interested in adding detail to buildings.  I was very interested that we could now import the bus stops.

I think you picked up on the fact that the buildings mapped in a mapathon were less than ideal.  I was involved in one in Ottawa and just taught the new mappers to use JOSM and the building_tool.  That produced more buildings per mapper hour and they were fairly accurate.  I must confess not every attached garage was mapped in detail.

I seem to recall Mapbox being involved in the mapathons in some way.

The Stats Can involvement meant we saw some interest from schools.  What I was interested in was added detail, so I mapped a couple of thousand buildings in Ontario using JOSM and the building_tool so details could be added easily.  We got two addresses added.  Apparently in Ontario the provincial government has purchased ESRI for school children to learn about GIS.

At the end of the pilot the money had run out.  Stats covered some of the costs involved in the HOT summit that was held in Ottawa and during that summit phase two was launched but without any real funding.

What Stats could do, though, was release data from the municipalities under the government Open Data license, and that is what they did.  As Jarek has pointed out, following the import process is stressful, so I volunteered to do the paperwork and submit the plan.  There was some discussion on talk-ca, and the idea surfaced to go with one plan rather than divide the country up.  So that's what I did.

Today we have three sources of data that could be imported, and I suspect the two that are not municipal data are more consistent.  We still have the original plan of mapathons with iD floating around.

My personal view is that the imported data quality is better than the mapathon approach, but to go forward from here I think it needs to be re-planned and a new import plan (or plans) drawn up.

I don't think Stats have any real funding available at the moment.  They may find an odd hour in a quiet time, but it's coming up to March 31st and deadline time, so I don't expect any major resources to be made available from them, certainly not of the data clean-up variety.

Cheerio John

Pierre Béland wrote on 2019-09-28 12:20 PM:
I understand that it's tomato season. But let's try to use them for our canning and not as an argument to convince other contributors! ;)

As others have said, it is up to those who propose imports to document the process properly, not the other way around. And threats to act imperially and disregard local communities obviously don't hold water.

To discuss data quality, we need to be able to examine the data easily. And I don't think the data are comparable from one place to another. Image quality and building density in urban areas are among the factors.

The files available from both StatCan and Microsoft are very large. Just to analyse the data for our respective municipalities, we have to process large files and try to extract the data. That is not necessarily easy and will of course limit participation.

To give examples of the limits of image interpretation by AI techniques, I posted images in the following two tweets showing buildings in downtown Toronto:

https://twitter.com/pierzen/status/1177976517902684160

https://twitter.com/pierzen/status/1177978125377884160

It is clear that checking whether the angles are square is not enough. These examples show how much the outline can differ from the reality on the ground. And, just like humans, AI techniques have difficulty identifying individual buildings.

Regards

_______________________________________________
Talk-ca mailing list
Talk-ca@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-ca