Paul Norman wrote:
From: Lester Caine [mailto:les...@lsces.co.uk]
Subject: Re: [OSM-talk] Import guidelines proposal update

who last edited an object! ). Where the import HAS nice unique object
identifiers, things are a lot easier, but raw vector data like the French
import, and I think the Spanish data you are talking about, CAN still be
'diffed' against earlier imports. The result is perhaps new data that
can simply be imported, or perhaps an overlay that identifies conflicts
that need a human eye. Isn't it better to spend time working out a GOOD
way of using the data going forward, rather than having to manually merge
the whole lot again in a couple of years' time ... and every couple of
years after that?
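
As a rough illustration of 'diffing' raw vector data that carries no identifiers, here is a minimal Python sketch. It assumes each line is just a list of (x, y) coordinate pairs and that two lines count as the same object when their rounded coordinates match in either direction. The names `geometry_key` and `diff_snapshots` and the rounding precision are hypothetical, not taken from any existing import tool.

```python
def geometry_key(coords, precision=6):
    """Build a comparable key for a line from its rounded coordinates.

    The key is the same whichever direction the line was digitised in.
    """
    pts = tuple((round(x, precision), round(y, precision)) for x, y in coords)
    return min(pts, pts[::-1])


def diff_snapshots(old_lines, new_lines):
    """Compare two snapshots of raw line data by geometry alone."""
    old_keys = {geometry_key(c) for c in old_lines}
    new_keys = {geometry_key(c) for c in new_lines}
    return {
        "added": new_keys - old_keys,      # candidates for a plain import
        "removed": old_keys - new_keys,    # candidates for the human-eye overlay
        "unchanged": old_keys & new_keys,  # nothing to do
    }
```

Exact matching like this treats any moved vertex as a removal plus an addition, so a real comparison would probably need some tolerance-based matching on top.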

My thoughts on how to handle this for data with persistent unique
identifiers, without adding those as tags, are to:

******
a. Record the correspondence between source ID and temporary pre-upload
negative OSM ID

b. Record the correspondence between pre-upload negative OSM ID and OSM ID

c. Combine for a correspondence between source ID and OSM ID, and save this
******
EXCEPT - that requires ALL the data from the external import to be loaded in order to create the OSM ID, which may not be a bad thing? ... BUT part of the 'preprocessing' before ever uploading the import would be to identify which objects are going to be uploaded and which are not, so you need to create an 'id' initially related to the data source? That is, provided that the data source is actually identifiable data.
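
Steps a to c amount to composing two lookup tables. A minimal Python sketch, assuming both correspondences are held as plain dictionaries (the function names and the JSON storage format are hypothetical). The second table would typically be built from the old/new ID pairs the OSM API reports after a changeset upload. Objects that were held back from upload simply get no entry, so the source-side 'id' exists before any OSM ID does.

```python
import json


def combine_id_maps(source_to_placeholder, placeholder_to_osm):
    """Compose source ID -> negative placeholder ID -> assigned OSM ID."""
    combined = {}
    for source_id, placeholder_id in source_to_placeholder.items():
        osm_id = placeholder_to_osm.get(placeholder_id)
        if osm_id is not None:  # objects never uploaded have no OSM ID yet
            combined[source_id] = osm_id
    return combined


def mapping_to_json(combined):
    """Serialise the combined mapping so it can be saved for the next update."""
    return json.dumps(combined, sort_keys=True, indent=2)
```

For example, `combine_id_maps({"src-1": -1, "src-2": -2}, {-1: 1001})` keeps only `"src-1"`, because `"src-2"` was never uploaded.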

What I had not considered up until now is the case where the data source is simply a raw vector file version of a paper map. While the individual lines could be 'imported', the data is almost useless until it has been 'identified'? You might just as well simply trace? But even here all is not lost, since one can still pre-process the data and provide the link back as to which lines have been copied and which have not. In that case the OSM ID provides additional data back to the source, but I doubt that there is any value in simply importing millions of line segments directly into the main database? There has to be a secondary staging area to handle that data?

d. When updating, identify objects that have changed or been added to the
source

e. For changed or deleted objects, if the OSM object was last edited by the
importer's import account, upload a new version reflecting the changes.
Objects that have been edited by a person will require manual intervention,
as they do now

f. Handle new objects like before
        
g. Identify objects deleted in OSM and check these, then submit corrections
to the source.
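
Steps d to g can be sketched as a classification pass over two source snapshots. This is only an illustration under stated assumptions: snapshots are dictionaries keyed by source ID, `id_map` is the saved source ID to OSM ID correspondence, and the `OsmState` class, the `plan_update` function and the `example_import` account name are all hypothetical.

```python
from dataclasses import dataclass

IMPORT_ACCOUNT = "example_import"  # hypothetical import account name


@dataclass
class OsmState:
    osm_id: int
    last_editor: str
    deleted: bool = False


def plan_update(old_source, new_source, id_map, osm_objects,
                import_account=IMPORT_ACCOUNT):
    """Sort source objects into the actions described in steps d to g."""
    plan = {"upload_new": [], "auto_update": [],
            "manual_review": [], "check_deleted_in_osm": []}
    for source_id in sorted(set(old_source) | set(new_source)):
        if source_id not in old_source:
            plan["upload_new"].append(source_id)            # step f
            continue
        osm = osm_objects.get(id_map.get(source_id))
        if osm is None:
            plan["manual_review"].append(source_id)         # no known OSM object
        elif osm.deleted:
            plan["check_deleted_in_osm"].append(source_id)  # step g
        elif old_source[source_id] == new_source.get(source_id):
            continue                                        # unchanged in source
        elif osm.last_editor == import_account:
            plan["auto_update"].append(source_id)           # step e, untouched
        else:
            plan["manual_review"].append(source_id)         # step e, human edits
    return plan
```

The node-to-way case mentioned below would still fall through to manual review here, since the saved OSM ID no longer points at the object a mapper actually maintains.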

The one case this doesn't handle very well is POIs that have been changed
from a node into a way.

I'm going to be working on implementing this in a limited way for updating
addresses locally. Addresses are different because the address should be
unique in the city.

While the UK 'address database' can't be uploaded freely yet, I have been slowly importing data manually, and it is irritating that every building carries duplicates of much of this data. I know a few attempts have been made at relations and the like to group things, but as I have said in the past, isn't now the time to provide a mechanism that uses 'lookup tables' for some of this, which would automatically simplify what is stored in the tags against each object? An 'address' in the UK only needs to be the 'property id' - house/flat number or name - and the 'postcode'. Everything else can be provided by a 'lookup' on the postcode reference.

Now this does not ACTUALLY work that simply, because the 'postcode' has too many edge cases where you need additional information to provide the 'street'. That is why the NLPG data does not use it and simply provides a reference to it deep inside. It provides a street gazetteer with a clean reference number for each street ( and in theory a 'way' for the physical location, but in most cases this is just a couple of 'end points' :( ).

This is the sort of process that could simplify a LOT of the micro/macro mapping problems that are now building up, since a worldwide 'street gazetteer' is the base for all of the routing programs? And a top-level map in its own right? All of the problems of 'turns relations' would be managed in the 'street gazetteer', while the underlying map can display all of the pretty stuff such as the grass verges, footpaths and the like?
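
To make the 'lookup table' idea concrete, here is a small Python sketch. Everything in it is hypothetical: the gazetteer contents, the street reference format, the placeholder postcode and the `expand_address` function are invented for illustration and do not reflect the real NLPG schema or any existing OSM tagging. The point is only that each object stores a property id, a postcode and a street reference, and the shared details are looked up instead of being repeated on every building.

```python
# Hypothetical street gazetteer: one entry per street, keyed by a clean reference.
STREET_GAZETTEER = {
    "S0001": {"street": "High Street", "town": "Exampleton"},
    "S0002": {"street": "Mill Lane", "town": "Exampleton"},
}


def expand_address(property_id, postcode, street_ref, gazetteer=STREET_GAZETTEER):
    """Rebuild a full address from the minimal per-object data plus a lookup."""
    street = gazetteer[street_ref]  # raises KeyError for an unknown reference
    return {
        "property": property_id,
        "street": street["street"],
        "town": street["town"],
        "postcode": postcode,
    }
```

With this layout, correcting a street name means changing one gazetteer entry, not every building on the street.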

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
