Re: [OSM-talk] Import guidelines proposal update

Lester Caine Fri, 21 Sep 2012 23:48:48 -0700

Paul Norman wrote:

From: Lester Caine [mailto:les...@lsces.co.uk]
Subject: Re: [OSM-talk] Import guidelines proposal update


who last edited an object! ). Where the import HAS nice unique object
identifiers things are a lot easier, but raw vector data like the French
import, and I think the Spanish data you are talking about CAN still be
'diffed' against earlier imports, and result in perhaps new data that
can simply be imported, or perhaps an overlay that identifies conflicts
that need a human eye. Isn't it better to spend time working out a GOOD
way of using the data going forward rather than having to manually merge
the whole lot again in a couple of years time ... and every couple of
years.


My thoughts on how to handle this for data with persistent unique
identifiers without adding those as tags is to


******

a. Record the correspondence between source ID and temporary pre-upload
negative OSM ID

b. Record the correspondence between pre-upload negative OSM ID and OSM ID

c. Combine for a correspondence between source ID and OSM ID, and save this

******

EXCEPT - that requires ALL the data from the external import to be loaded
in order to create the OSM ID, which may not be a bad thing? ... BUT part
of the 'preprocessing' before ever uploading the import would be to
identify which objects are going to be uploaded and which not, so you need
to create an 'id' initially related to the data source? That is providing
that the data source is actually identifiable data.

What I had not considered up until now is if the data source is simply a
raw vector file with version of a paper map, then while the individual
lines could be 'imported' the data is almost useless until it has been
'identified'? You may just as well simply trace? But even here all is not
lost, since one can still pre-process the data and provide the link back as
to which lines have been copied and which not. In which case the OSM ID
provides additional data back to the source, but I doubt that there is any
value in simply importing millions of line segments directly into the main
database? This has to be a secondary staging area to handle that data?
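
As an illustration only, steps a to c quoted above could be sketched as the
composition of two lookup tables. The function name, the source IDs and the
OSM IDs here are all made up for the example, and it assumes that objects
filtered out during preprocessing simply never get an OSM ID:

```python
def combine_id_maps(source_to_temp, temp_to_osm):
    """Compose (source ID -> temporary negative ID) with
    (temporary negative ID -> OSM ID) into (source ID -> OSM ID)."""
    combined = {}
    for source_id, temp_id in source_to_temp.items():
        # Objects that were never uploaded have no entry in temp_to_osm.
        if temp_id in temp_to_osm:
            combined[source_id] = temp_to_osm[temp_id]
    return combined


# Step a: recorded while preparing the upload (hypothetical source IDs).
source_to_temp = {"SRC-001": -1, "SRC-002": -2, "SRC-003": -3}
# Step b: recorded after the upload (hypothetical OSM IDs).
temp_to_osm = {-1: 1000001, -2: 1000002}
# Step c: the table that would be saved for later updates.
source_to_osm = combine_id_maps(source_to_temp, temp_to_osm)
```

SRC-003 stays out of the saved table because it was never uploaded, which is
the 'which objects are uploaded and which not' bookkeeping mentioned above.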

d. When updating, identify objects that have changed or been added to the
source

e. For changed or deleted objects if the OSM object was last edited by the
importer's import account, upload a new version reflecting the changes.
Objects that have been edited by a person will require manual intervention,
like now

f. Handle new objects like before
        
g. Identify objects deleted in OSM and check these, then submit corrections
to the source.

The one case this doesn't handle very well is POIs that have been changed
from a node into a way.

I'm going to be working on implementing this in a limited way for updating
addresses locally. Addresses are different because the address should be
unique in the city.
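
A rough sketch of how steps d to g might be sorted out in practice, assuming
the saved source ID to OSM ID table from step c is available. Every name
here (plan_update, the dictionary layouts, the 'import_bot' account) is
hypothetical and only there to show the decision logic, not a real tool:

```python
def plan_update(old_source, new_source, source_to_osm, osm_state, import_user):
    """Sort source objects into the actions described in steps d to g.

    old_source / new_source: source ID -> tags, for the previous and the
        current release of the source data.
    source_to_osm: source ID -> OSM ID, saved at import time.
    osm_state: OSM ID -> {"last_editor": str, "deleted": bool}.
    """
    plan = {
        "auto_update": [],           # step e, last edited by the import account
        "manual_review": [],         # step e, edited by a person since import
        "new_upload": [],            # step f
        "check_deleted_in_osm": [],  # step g
    }
    for source_id in sorted(set(old_source) | set(new_source)):
        if source_id not in source_to_osm:
            # Never uploaded before: handle like a fresh import.
            if source_id in new_source:
                plan["new_upload"].append(source_id)
            continue
        state = osm_state[source_to_osm[source_id]]
        if state["deleted"]:
            # A mapper removed it; check and maybe report back to the source.
            plan["check_deleted_in_osm"].append(source_id)
        elif new_source.get(source_id) != old_source.get(source_id):
            # Changed in, or deleted from, the source (step d).
            if state["last_editor"] == import_user:
                plan["auto_update"].append(source_id)
            else:
                plan["manual_review"].append(source_id)
    return plan


old_source = {"A": {"name": "x"}, "B": {"name": "y"},
              "C": {"name": "z"}, "D": {"name": "w"}}
new_source = {"A": {"name": "x2"}, "B": {"name": "y2"},
              "D": {"name": "w"}, "E": {"name": "new"}}
source_to_osm = {"A": 1, "B": 2, "C": 3, "D": 4}
osm_state = {
    1: {"last_editor": "import_bot", "deleted": False},
    2: {"last_editor": "mapper", "deleted": False},
    3: {"last_editor": "import_bot", "deleted": False},
    4: {"last_editor": "mapper", "deleted": True},
}
plan = plan_update(old_source, new_source, source_to_osm, osm_state, "import_bot")
```

This does nothing for the node-turned-into-a-way case mentioned above, since
the saved OSM ID would then point at the old node.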

While the UK 'address database' can't be uploaded freely yet, I have been
slowly importing data manually, and it just irritates that every building
has duplicates of much of this data. I know a few attempts have been made
at relations and the like to group stuff, but as I have said in the past,
isn't now the time to provide a mechanism that uses 'lookup tables' for
some of this which will automatically simplify what is stored in the tags
against each object? An 'address' in the UK only needs to be the 'property
id' - house/flat number or name - and the 'postcode'. Everything else can
be provided by a 'lookup' on the postcode reference. Now this ACTUALLY does
not work simply because the 'postcode' has too many edge cases where you
need additional information to provide 'street'. That is why the NLPG data
does not use it and simply provides a reference to it deep inside. It
provides a street gazetteer with a clean reference number for each street
( and in theory a 'way' for the physical location, but in most cases this
is just a couple of 'end points' :( ) This is the sort of process that
could simplify a LOT of the micro/macro mapping problems that are now
building up, since a world wide 'street gazetteer' is the base for all of
the routing programs? And a top level map in its own right? All of the
problems of 'turns relations' would be managed in the 'street gazetteer',
while the underlying map can display all of the pretty stuff such as the
grass verges, footpaths and the like?
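
To make the 'lookup table' idea a little more concrete, here is a minimal
sketch, assuming each object stores only a property id, a postcode and a
street reference number, with everything else held once in a street
gazetteer. The table layout, field names, reference number and address
values are invented for the example and are not the real NLPG structure:

```python
# Hypothetical street gazetteer: one record per street reference number.
STREET_GAZETTEER = {
    4801234: {"street": "High Street", "town": "Exampleton"},
}


def expand_address(compact, gazetteer):
    """Rebuild a full address from the compact form stored on the object."""
    street = gazetteer[compact["street_ref"]]
    return {
        "property": compact["property"],
        "street": street["street"],
        "town": street["town"],
        "postcode": compact["postcode"],
    }


# What would be stored against each building: no duplicated street or town.
compact = {"property": "12", "street_ref": 4801234, "postcode": "XX1 1XX"}
full = expand_address(compact, STREET_GAZETTEER)
```

The street reference is carried alongside the postcode because, as noted
above, the postcode alone has too many edge cases to identify the street.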


--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
