Daniel,
The section you cite from the import guidelines seems to support my
position. Perhaps that means I haven't explained my thoughts well? It says:
"If you are importing data where there is already some data in OSM, then
you need to combine this data in an appropriate way or suppress the
import of features with overlap with existing data. Only where you have
explicit support, which is highly unusual, should you replace current
data with your imported data."
OSM is not all that dynamic - if it were, we wouldn't be talking about an
import. We should do our best to align the data we're thinking of
bringing in with gaps in the existing data. That's what conflation is
all about. We should not systematically change existing data. That's
called an automated edit and is governed by different policies than an
import.
I'm not saying any plan is perfect - there will be gaps and overlaps, of
course. But that doesn't mean we don't try at all.
Toronto has hundreds of thousands of buildings in OSM. They are not
going to be compared manually, one by one, if there is an import.
--
And again, I'm not saying not to do explicit building conflation and
check for updated geometries - I'm just saying that it needs to be a
separate process if it happens.
Best,
Nate Wessel, PhD
Planner, Cartographer, Transport Nerd
NateWessel.com <https://www.natewessel.com>
On 2020-01-16 2:56 p.m., Daniel @jfd553 wrote:
Hello Nate,
I understand that you don’t like to see an import process that both
brings in new objects and overwrites existing ones. You also suggest
removing overlapping buildings from ODB prior to importing it. Such
pre-processing, which would ensure that no buildings are overwritten
during the import, is not realistic (i.e. you will need to overwrite
some buildings anyway). Here are two reasons why it would be
difficult...
1-OSM is a dynamic project and, unless you can "clean" the data on the
fly, you will end up with overlaps since some contributors will have
added buildings in the meantime.
2-One cannot assume that an OSM building and its ODB counterpart
will be found at the same location (look at the DX and DY columns in
the ODB inventory tables). These are averages, which means individual
offsets between the two datasets can be larger (i.e. you won’t get a
match between buildings, or you will get a match with the wrong ones).
The only realistic option is then to manually delete the ODB buildings
if they overlap OSM ones. Here is what the import guideline suggests [1]…
"If you are importing data where there is already some data in OSM,
then you need to combine this data in an appropriate way or suppress
the import of features with overlap with existing data."
Therefore, importing data and using a conflation process is not
unusual. Again, I understand that in case of an overlap, you go for
the last option (suppress the imported building). I am rather inclined
toward using conflation when necessary, which means...
-Importing an ODB building when there is no corresponding one in OSM;
-Conflating both ODB and OSM buildings when it significantly improves
existing OSM content;
-Not importing an ODB building when the corresponding one in OSM is
adequate.
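To make the three cases above concrete, here is a minimal sketch in Python. The names (`Action`, `decide`) and both inputs are hypothetical, and the sketch deliberately leaves open how one judges that an ODB footprint "significantly improves" the OSM one; that judgement is simply passed in as a boolean.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    IMPORT = "import the ODB building"
    CONFLATE = "conflate the ODB and OSM buildings"
    SKIP = "keep the OSM building, do not import"


def decide(osm_match: Optional[str], odb_improves_osm: bool) -> Action:
    """Pick one of the three cases for a single ODB building.

    osm_match: id of the corresponding OSM building, or None if there is
        no counterpart (hypothetical input, matching is not shown here).
    odb_improves_osm: a judgement that the ODB geometry significantly
        improves the existing OSM content (assumed to be made elsewhere).
    """
    if osm_match is None:
        return Action.IMPORT
    if odb_improves_osm:
        return Action.CONFLATE
    return Action.SKIP
```

Under this sketch, every ODB building ends up in exactly one of the three cases, so nothing is imported on top of an existing building without an explicit decision.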
What do the others on the list think?
Daniel
[1]
https://wiki.openstreetmap.org/wiki/Import/Guidelines#Don.27t_put_data_on_top_of_data
*From:* Nate Wessel [mailto:bike...@gmail.com]
*Sent:* Thursday, January 16, 2020 10:38
*To:* Daniel @jfd553; talk-ca@openstreetmap.org
*Subject:* Re: [Talk-ca] FW: Re: Importing buildings in Canada
Responding to point C below,
I would strongly suggest that we not confuse the process of importing
new data with that of updating/modifying existing data in the OSM
database. One of the things I really disliked about the initial
building import was that it overwrote existing data at the same time
that new data was brought in. These are really two separate import
processes and require very different considerations.
We can certainly consider using this dataset to improve/update
existing building geometries, but I think that is a separate process
from the import we are discussing here. To keep things simple for this
import, I would suggest removing any building from the import dataset
that intersects with an existing building in the OSM database. That
is, let's not worry about conflation for now, and come back and do
that work later if we still feel there is a strong need for it.
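As a rough illustration of that filtering step, here is a minimal Python sketch. It is an assumption-laden stand-in, not a proposed tool: the function names are hypothetical, buildings are plain lists of (x, y) vertices, and a bounding-box test replaces a true polygon intersection test. The bounding-box test is conservative: two footprints that really intersect always have overlapping boxes, so nothing that intersects is kept, but some nearby non-intersecting buildings may be dropped as well.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # building outline as (x, y) vertices
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(ring: Ring) -> BBox:
    """Axis-aligned bounding box of a building outline."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def drop_overlapping(import_rings: List[Ring], osm_rings: List[Ring]) -> List[Ring]:
    """Keep only import buildings whose box touches no existing OSM box."""
    osm_boxes = [bbox(ring) for ring in osm_rings]
    kept = []
    for ring in import_rings:
        box = bbox(ring)
        if not any(bboxes_overlap(box, other) for other in osm_boxes):
            kept.append(ring)
    return kept
```

For a city-sized dataset, the pairwise loop would be too slow and the box test too coarse; a real run would presumably use an exact polygon test and a spatial index.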
I see the main point of this effort as getting more complete coverage
- if we want to use the dataset to do quality assurance on existing
data, that is a whole other discussion.
Best,
Nate Wessel, PhD
Planner, Cartographer, Transport Nerd
NateWessel.com <https://www.natewessel.com>
On 2020-01-15 12:55 p.m., Daniel @jfd553 wrote:
Thanks for the quick replies!
Now, about...
*a) Data hosting:*
Thank you James, I really appreciate your offer (and that of
others). So yes, I think hosting pre-processed data in the task
manager, for approved regions, is an attractive offer. When we
agree on a municipality for pre-processing, I will contact you to
make the data available.
BTW, I thought ODB data in OSM format was hosted with the
OSMCanada task manager. I understand that ODB data are currently
converted on the fly when requested?
*b) Task manager work units for import:*
I agree with Nate, ~ 200 buildings or ~ 1,500 nodes would be
suitable. I was thinking of the same import rate, but for an
hour of work. It seems best to target 20-minute tasks.
*c) Task manager work units for checking already imported data*
According to Nate, it is definitely not faster than actively
importing. We should then keep the above setup (b).
However, what if I add a new tag to pre-processed data indicating
if a building was altered or not by the orthogonalization (and
simplification) process? For instance, /building:altered=no/,
would identify buildings that were not changed by the process and
that could be left unchanged in OSM (i.e. not imported);
/building:altered=yes/ for those that were changed by the process
and that should be imported again. The same pre-processed datasets
could then be made available for all cases. Thoughts?
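A minimal sketch of how such a tag could be set, in Python. Everything here is an assumption for illustration: the function names are hypothetical, outlines are lists of (x, y) vertices, and the raw and pre-processed outlines are assumed to list their vertices in the same order from the same starting vertex. A building counts as altered if the vertex count changed or any vertex moved by more than a tolerance.

```python
from typing import Dict, List, Tuple

Ring = List[Tuple[float, float]]  # building outline as (x, y) vertices


def is_altered(raw: Ring, processed: Ring, tolerance: float = 0.0) -> bool:
    """True if pre-processing changed the outline beyond the tolerance."""
    if len(raw) != len(processed):
        return True  # simplification removed (or added) vertices
    return any(
        abs(rx - px) > tolerance or abs(ry - py) > tolerance
        for (rx, ry), (px, py) in zip(raw, processed)
    )


def with_altered_tag(
    tags: Dict[str, str], raw: Ring, processed: Ring, tolerance: float = 0.0
) -> Dict[str, str]:
    """Return a copy of the tags with building:altered set to yes or no."""
    tagged = dict(tags)  # leave the caller's dict untouched
    tagged["building:altered"] = "yes" if is_altered(raw, processed, tolerance) else "no"
    return tagged
```

The tolerance is there because orthogonalization may nudge vertices by amounts too small to matter; what value is appropriate would need to be agreed on.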
*d) Finding local mappers:*
I agree with Nate’s suggestion to try contacting the top 10
mappers in an area. Using the "main activity center" would work
for most of the contributors but selecting other overlays (e.g.
an activity center over the last 6 months) could also work well. As
long as we identify who might be interested in knowing there is an
import coming.
Comments are welcome, particularly about the proposal on c)
Daniel