And I totally agree. Because the Stat Can data has come from many
sources, the data quality is variable, to put it politely. The Microsoft
data has been shown in the US to also be of variable quality. I'm not
so sure about the NR Can LiDAR data; hopefully it is at least consistent.
If we look at the history of the project then we can get an idea of how
we came to be where we are.
First I wanted to import all the bus stops in Ottawa, because only by
importing could you ensure you had all the stops with their reference
numbers, but the City of Ottawa Open Data license did not align with
OSM. I was also talking to Treasury Board and explained to them that
their Open Data license version 1 didn't align with OSM so we couldn't
use their data. Five years later TB released their Open Data license
version 2 which they felt did align.
Stat Can has two types of project: pilot ones and ones that earn money.
The original pilot was based on Ottawa and Gatineau and was for two
years. Their original plan was mapathons using iD. I was impressed
when a Stat Can employee managed to accurately map a building using iD
during a presentation. I hadn't thought it was possible after some of
the efforts I'd seen in HOT mapping. Fine, except that it requires a
lot of mappers, and I think Fredrick has commented that this sort of
mapping with new mappers needs a lot of clean-up effort, sometimes more
than an experienced mapper would need to map it right in the first place.
Montreal has identified there just aren't enough experienced mappers
available.
I worked at Stats Canada for a number of years. The corporate culture
is very different to OSM. It makes its money by selling data. Want to
open a new coffee bar? Stats Canada will combine its data to sell you
the ideal spot based on residents' income etc. I had a meeting with
Stats Canada, City of Ottawa planning department, an Open Data
specialist from Carleton University, someone from Metrolink who had added
data to OpenStreetMap to help people find the nearest bus stop, and a
couple of HOT board members. We convinced Stats Canada to change the
direction of the pilot to use Open Data rather than go the mapathon
route, partly for data quality reasons and partly because I didn't think
we could find the mappers to map the buildings completely. The Stat Can
involvement meant the City of Ottawa was persuaded to change its Open
Data license to the same as the TB one. That took time and had to go to
council for approval. There was a lot of discussion with the local
community and it was they who organised and did the import. The local
group worked nicely together and had a range of skill sets in the
group. I actually played more of a connecting role than anything else.
The import was challenged on the data license, amongst other things, but
eventually the OSM Legal Working Group was very kind and ruled the
license was acceptable. Stats is very interested in added detail to
buildings. I was very interested that we could now import the bus stops.
I think you picked up on the fact that the buildings mapped in a
mapathon were less than ideal. I was involved in one in Ottawa and just
taught the new mappers to use JOSM and the building_tool. That produced
more buildings per mapper hour and they were fairly accurate. I must
confess not every attached garage was mapped in detail.
I seem to recall Mapbox being involved in the mapathons in some way.
The Stats Can involvement meant we saw some interest from schools. What
I was interested in was added detail, so I mapped a couple of thousand
buildings in Ontario using JOSM and the building_tool so details could
be added easily. We got two addresses added. Apparently in Ontario the
provincial government has purchased ESRI software for school children to
learn about GIS.
At the end of the pilot the money had run out. Stats covered some of
the costs involved in the HOT summit that was held in Ottawa and during
that summit phase two was launched but without any real funding.
What Stats could do though was release data from the municipalities
under the government Open Data license, and that is what they did. As
Jarek has pointed out, following the import process is stressful, so I
volunteered to do the paperwork and submit the plan. There was some
discussion on talk-ca and the idea surfaced to go with one plan rather
than divide the country up. So that's what I did.
Today we have three sources of data that could be imported, and I
suspect the two that are not municipal data are more consistent. We
still have the original plan of mapathons with iD floating around.
My personal view is that the imported data quality is better than the
mapathon approach, but to go forward from here I think it needs to be
re-planned and a new import plan (or plans) drawn up.
I don't think Stats have any real funding available at the moment. They
may find an odd hour in a quiet time, but it's coming up to March 31st
and deadline time, so I don't expect any major resources to be made
available from them, certainly not of the data clean-up variety.
Cheerio John
Pierre Béland wrote on 2019-09-28 12:20 PM:
I understand that it's tomato season. But let's try to use them for
our preserves and not as an argument to convince other
contributors! ;)
As others have said, it is up to those who propose imports to document
the process properly, not the other way around. And threats to act
imperially and disregard local communities obviously don't hold water.
To discuss data quality, one needs to be able to examine the data
easily. And I don't think the data are comparable from one place to
another. Image quality and the density of buildings in urban areas are
among the factors.
The files available from both StatCan and Microsoft are very large.
Just to analyse the data for our respective municipalities, we have to
process large files and try to extract the data. That is not
necessarily easy and will of course limit participation.
To give examples of the limits of what AI techniques can observe in
imagery, I published images in the following 2 tweets showing
buildings in downtown Toronto:
https://twitter.com/pierzen/status/1177976517902684160
https://twitter.com/pierzen/status/1177978125377884160
It is clear that validating whether the angles are square is not
enough. These examples show how the traced outline can differ
significantly from the reality on the ground. And just like humans, AI
techniques have difficulty identifying individual buildings.
Regards