Daniel,

The section you cite from the import guidelines seems to support my position. Perhaps that means I haven't explained my thoughts well? It says:

"If you are importing data where there is already some data in OSM, then you need to combine this data in an appropriate way or suppress the import of features with overlap with existing data. Only where you have explicit support, which is highly unusual, should you replace current data with your imported data."

OSM is not all that dynamic - if it were, we wouldn't be talking about an import. We should do our best to align the data we're thinking of bringing in with gaps in the existing data. That's what conflation is all about. We should not systematically change existing data. That's called an automated edit and is governed by different policies than an import.

I'm not saying any plan is perfect - there will be gaps and overlaps, of course. But that doesn't mean we don't try at all.

Toronto has hundreds of thousands of buildings in OSM. They are not going to be compared manually, one by one, if there is an import.

--

And again, I'm not saying we shouldn't do explicit building conflation and check for updated geometries - I'm just saying that it needs to be a separate process if it happens.

Best,

Nate Wessel, PhD
Planner, Cartographer, Transport Nerd
NateWessel.com <https://www.natewessel.com>

On 2020-01-16 2:56 p.m., Daniel @jfd553 wrote:

Hello Nate,

I understand that you don’t like to see an import process that both brings in new objects and overwrites existing ones. You also suggest removing "overlapped" buildings from ODB prior to importing it. Such pre-processing, which would ensure that no buildings are overwritten during the import, is not realistic (i.e. you will need to overwrite some buildings anyway). Here are two reasons why it would be difficult...

1-OSM is a dynamic project and, unless you can "clean" the data on the fly, you will end up with overlaps since some contributors will have added buildings in the meantime.

2-One cannot assume that an OSM building and its ODB counterpart will be found at the same location (look at the DX and DY columns in the ODB inventory tables). These are averages, which means that individual offsets between the two datasets can be larger (i.e. you won’t get a match between buildings, or you will get a match with the wrong ones).

The only realistic option is then to manually delete the ODB buildings if they overlap OSM ones. Here is what the import guideline suggests [1]…

"If you are importing data where there is already some data in OSM, then you need to combine this data in an appropriate way or suppress the import of features with overlap with existing data."

Therefore, importing data and using a conflation process is not unusual. Again, I understand that in case of an overlap, you go for the last option (suppress the imported building). I am rather inclined toward using conflation when necessary, which means...

-Importing an ODB building when there is no corresponding one in OSM;

-Conflating both ODB and OSM buildings when it significantly improves existing OSM content;

-Not importing an ODB building when the corresponding one in OSM is adequate.
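
As a rough sketch of the three rules above: the snippet below only encodes the order of the decisions. How a "corresponding" OSM building is found and what counts as a "significant" improvement are not defined in this thread, so `osm_match` and the `significantly_improves` callback are hypothetical placeholders that the caller would have to supply.

```python
from enum import Enum
from typing import Any, Callable, Optional


class Action(Enum):
    IMPORT = "import"      # no corresponding OSM building
    CONFLATE = "conflate"  # ODB significantly improves the OSM building
    SKIP = "skip"          # existing OSM building is adequate


def decide(
    odb_building: Any,
    osm_match: Optional[Any],
    significantly_improves: Callable[[Any, Any], bool],
) -> Action:
    """Apply the three rules in order for one ODB building.

    osm_match is the corresponding OSM building, or None if there is none.
    significantly_improves is a caller-supplied judgement (hypothetical here).
    """
    if osm_match is None:
        return Action.IMPORT
    if significantly_improves(odb_building, osm_match):
        return Action.CONFLATE
    return Action.SKIP
```

The criterion itself stays outside the function on purpose, since that is exactly the part that would need agreement on the list.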

What do the others on the list think?

Daniel

[1] https://wiki.openstreetmap.org/wiki/Import/Guidelines#Don.27t_put_data_on_top_of_data

*From:*Nate Wessel [mailto:bike...@gmail.com]
*Sent:* Thursday, January 16, 2020 10:38
*To:* Daniel @jfd553; talk-ca@openstreetmap.org
*Subject:* Re: [Talk-ca] FW: Re: Importing buildings in Canada

Responding to point C below,
I would strongly suggest that we not confuse the process of importing new data with that of updating/modifying existing data in the OSM database. One of the things I really disliked about the initial building import was that it overwrote existing data at the same time that new data was brought in. These are really two separate import processes and require very different considerations.

We can certainly consider using this dataset to improve/update existing building geometries, but I think that is a separate process from the import we are discussing here. To keep things simple for this import, I would suggest removing any building from the import dataset that intersects with an existing building in the OSM database. That is, let's not worry about conflation for now, and come back and do that work later if we still feel there is a strong need for it.
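
A minimal sketch of that filter, under stated assumptions: buildings are plain lists of (x, y) points for the outer ring, and "intersects" is approximated by overlapping bounding boxes. That is a conservative stand-in (it can drop a few import buildings that only come close to an existing one); a real run would use a proper geometry library and a spatial index. All function names here are illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]                     # outer ring only, as (x, y) pairs
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(polygon: Polygon) -> BBox:
    """Axis-aligned bounding box of a ring."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_overlap(a: BBox, b: BBox) -> bool:
    """True if the two boxes share any area or touch at an edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def drop_overlapping(import_buildings: List[Polygon],
                     osm_buildings: List[Polygon]) -> List[Polygon]:
    """Keep only import buildings whose box overlaps no existing OSM building.

    Brute force (every import building against every OSM building), which is
    fine for a sketch but too slow for a whole city without a spatial index.
    """
    osm_boxes = [bbox(p) for p in osm_buildings]
    kept = []
    for candidate in import_buildings:
        box = bbox(candidate)
        if not any(boxes_overlap(box, other) for other in osm_boxes):
            kept.append(candidate)
    return kept
```

Existing OSM buildings are never modified by this step; only the import dataset shrinks.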

I see the main point of this effort as getting more complete coverage - if we want to use the dataset to do quality assurance on existing data, that is a whole other discussion.

Best,

Nate Wessel, PhD
Planner, Cartographer, Transport Nerd
NateWessel.com <https://www.natewessel.com>

On 2020-01-15 12:55 p.m., Daniel @jfd553 wrote:

    Thanks for the quick replies!

    Now, about...

    *a) Data hosting:*

    Thank you James, I really appreciate your offer (and that of
    others). So yes, I think hosting pre-processed data in the task
    manager, for approved regions, is an attractive offer. When we
    agree on a municipality for pre-processing, I will contact you to
    make the data available.

    BTW, I thought ODB data in OSM format was hosted with the
    OSMCanada task manager. I understand that ODB data are currently
    converted on the fly when requested?

    *b) Task manager work units for import:*

    I agree with Nate, ~ 200 buildings or ~ 1,500 nodes would be
    suitable. I was thinking of the same import rate, but for an
    hour of work. It seems best to target 20-minute tasks.

    *c) Task manager work units for checking already imported data*

    According to Nate, it is definitely not faster than actively
    importing. We should then keep the above setup (b).

    However, what if I add a new tag to pre-processed data indicating
    if a building was altered or not by the orthogonalization (and
    simplification) process? For instance, /building:altered=no/,
    would identify buildings that were not changed by the process and
    that could be left unchanged in OSM (i.e. not imported);
    /building:altered=yes/ for those that were changed by the process
    and that should be imported again. The same pre-processed datasets
    could then be made available for all cases. Thoughts?
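
One way the proposed tag could be derived, as a sketch only: compare each building's ring before and after pre-processing and emit "yes" if any node moved or the node count changed. The function name, the ring representation, and the `tolerance` parameter are assumptions for illustration; they are not part of the actual ODB pre-processing.

```python
import math
from typing import List, Tuple

Ring = List[Tuple[float, float]]  # (x, y) pairs in the same order before/after


def altered_tag(original: Ring, processed: Ring, tolerance: float = 0.0) -> str:
    """Return the value for building:altered ("yes" or "no").

    Simplification changes the node count; orthogonalization moves nodes.
    Either one makes the building count as altered.
    """
    if len(original) != len(processed):
        return "yes"
    for (x0, y0), (x1, y1) in zip(original, processed):
        if math.hypot(x1 - x0, y1 - y0) > tolerance:
            return "yes"
    return "no"


# Usage: attach the tag to the pre-processed building's tags.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
tags = {"building": "yes", "building:altered": altered_tag(square, square)}
```

A non-zero tolerance (in the same units as the coordinates) would let negligible shifts count as unchanged.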

    *d) Finding local mappers:*

    I agree with Nate’s suggestion to try contacting the top 10
    mappers in an area. Using the "main activity center" would work
    for most of the contributors but selecting other overlays (e.g.
    an activity center over last 6 months) could also work great. As
    long as we identify who might be interested in knowing there is an
    import coming.

    Comments are welcome, particularly about the proposal on c)

    Daniel

_______________________________________________
Talk-ca mailing list
Talk-ca@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk-ca
