Re: [OSM-talk] Semi-automated edits - postal code database
Hi.

This is an update to an e-mail I sent to the talk@osm list at the beginning of October about updating postal codes in Iceland semi-automatically.

I have now written the script, which targets Python 3.2. I have not yet submitted any data produced by it, but I haven't detected any problems so far: I ran some random manual checks on the output and see nothing wrong with the XML, and JOSM didn't complain when I opened the .osc file.

The input is any valid .osm file and the output is an .osc file (https://wiki.openstreetmap.org/wiki/Osc) which lists the changes made. The output can be loaded into an editor and submitted to the OSM server from there.

You're free to adapt the script to suit your purpose, but I recommend that you always check the proposed changes before uploading. The code is commented well enough that anybody who knows Python should be able to follow what's going on.

Minimum requirements:
- Enough computer memory. The larger the .osm file, the more memory the script needs.
- Python 3.
- A working installation of the Osmosis program (https://wiki.openstreetmap.org/wiki/Osmosis).

- Svavar Kjarrval

On 04/10/12 23:48, Martin Guttesen wrote:
> I have imported all the addresses for the Faroe Islands and update them from time to time when new data is available; see http://wiki.openstreetmap.org/wiki/Import/Catalogue/usfo
> I keep an ID tag (us.fo:Adressutal) so I can create, update or delete address nodes.
>
> -----Original Message-----
> From: Jochen Topf
> Sent: Thursday, October 04, 2012 7:39 AM
> To: Svavar Kjarrval
> Cc: talk@openstreetmap.org
> Subject: Re: [OSM-talk] Semi-automated edits - postal code database
>
> Hi!
>
> On Wed, Oct 03, 2012 at 11:10:05AM +, Svavar Kjarrval wrote:
>> I'm trying to find a good method to maintain data from outside sources. The data in question is the Icelandic postal code database (which they say we may use freely). My searches on the OSM wiki have been fruitless so far.
>>
>> The idea is to maintain the data in associatedStreet relations. Each relation has a tag called 'götuskrá:id' whose value is a direct reference to the row ID in the files we retrieve from the postal company's website. The file formats available are CSV and XML 1.0. The script would presumably go over each associatedStreet relation and make any changes (if appropriate) when a götuskrá:id tag is found. The output could be an OSM change file loaded into an editor like JOSM to be uploaded manually. An automated process could come later, once we're confident that everything is done correctly, and of course after submitting the script(s) for review by the local community.
>
> It is not a good idea to add some random ID of your favourite database to OSM, because nobody except you can understand this ID and do useful things with it. It just confuses mappers and makes it more difficult to edit the data. For every change somebody does to the data they have to know what this tag means so that they can properly do their edit. And if they don't, people will just mess up your data and you will not be able to use this ID for syncing the data anyway.
>
> And in this case I don't even see why you need it. You have street names and postal codes in both OSM and the Icelandic postal code database. If something changes you can find out which combinations changed and apply those changes to OSM easily just based on the postal code and street name. There is no need for those IDs.
>
> And, btw, you should not use the associatedStreet relation. It solves the same problem as the addr:street tags on nodes and buildings but in a much more complicated way. The overwhelming majority of all addresses are tagged with addr:street (there are nearly 15 million addr:street tags vs. only 18,000 associatedStreet relations).
>
> Jochen

#!/usr/bin/env python3.2
# -*- coding: utf-8 -*-
# Copyright 2012, Svavar Kjarrval Lúthersson
# Released under the CC0 license.
# I can be contacted at sva...@kjarrval.is.

# This program performs changes according to predetermined formulas to .osm files
# and outputs a single .osc file which in turn can either be submitted automatically
# by another program (which is not implemented here) or manually with an editor.

# To use it, you must have:
# 1 - An .osm file of the area in question.
# 2 - An Osmosis binary set up and ready to use.

# The reason the script filters instead of working directly on the original file
# is to reduce memory consumption of programs which need to load the complete .osm file into memory.
# If, despite having done proper filtering, the .osm file is still
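For readers who want a feel for the workflow described in this message (read an .osm file, write the proposed changes to an .osc file, review them in an editor), here is a minimal Python sketch. It is not the attached script, which is cut off in this archive. The function name `build_osc`, the `street_to_postcode` mapping and the sample data are assumptions made purely for illustration. It uses only the standard library and assumes the osmChange layout of an `<osmChange>` root with a `<modify>` block.

```python
import xml.etree.ElementTree as ET


def build_osc(osm_xml, street_to_postcode):
    """Return osmChange XML that proposes addr:postcode corrections.

    Hypothetical helper: street_to_postcode maps a street name to the
    postal code it should carry.
    """
    root = ET.fromstring(osm_xml)
    osc = ET.Element("osmChange", version="0.6", generator="postcode-sketch")
    modify = ET.SubElement(osc, "modify")
    for elem in root:
        if elem.tag not in ("node", "way", "relation"):
            continue
        tags = {t.get("k"): t for t in elem.findall("tag")}
        street = tags.get("addr:street")
        if street is None:
            continue  # not an address we know how to check
        wanted = street_to_postcode.get(street.get("v"))
        if wanted is None:
            continue  # street not present in the external data
        current = tags.get("addr:postcode")
        if current is not None and current.get("v") == wanted:
            continue  # already correct, nothing to propose
        if current is None:
            ET.SubElement(elem, "tag", k="addr:postcode", v=wanted)
        else:
            current.set("v", wanted)
        # The whole element, including its version attribute, goes into <modify>.
        modify.append(elem)
    return ET.tostring(osc, encoding="unicode")


# Made-up sample data, not taken from the real database.
SAMPLE = """<osm version="0.6">
  <node id="1" version="2" lat="64.1" lon="-21.9">
    <tag k="addr:street" v="Laugavegur"/>
    <tag k="addr:postcode" v="100"/>
  </node>
  <node id="2" version="1" lat="64.2" lon="-21.8">
    <tag k="addr:street" v="Laugavegur"/>
    <tag k="addr:postcode" v="101"/>
  </node>
</osm>"""

print(build_osc(SAMPLE, {"Laugavegur": "101"}))
```

Each changed element is copied whole because the OSM API uses the `version` attribute to detect conflicting edits. As the message recommends, the resulting file should still be opened and checked in an editor before anything is uploaded.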
Re: [OSM-talk] Semi-automated edits - postal code database
I have imported all the addresses for the Faroe Islands and update them from time to time when new data is available; see http://wiki.openstreetmap.org/wiki/Import/Catalogue/usfo
I keep an ID tag (us.fo:Adressutal) so I can create, update or delete address nodes.

-----Original Message-----
From: Jochen Topf
Sent: Thursday, October 04, 2012 7:39 AM
To: Svavar Kjarrval
Cc: talk@openstreetmap.org
Subject: Re: [OSM-talk] Semi-automated edits - postal code database

Hi!

On Wed, Oct 03, 2012 at 11:10:05AM +, Svavar Kjarrval wrote:
> I'm trying to find a good method to maintain data from outside sources. The data in question is the Icelandic postal code database (which they say we may use freely). My searches on the OSM wiki have been fruitless so far.
>
> The idea is to maintain the data in associatedStreet relations. Each relation has a tag called 'götuskrá:id' whose value is a direct reference to the row ID in the files we retrieve from the postal company's website. The file formats available are CSV and XML 1.0. The script would presumably go over each associatedStreet relation and make any changes (if appropriate) when a götuskrá:id tag is found. The output could be an OSM change file loaded into an editor like JOSM to be uploaded manually. An automated process could come later, once we're confident that everything is done correctly, and of course after submitting the script(s) for review by the local community.

It is not a good idea to add some random ID of your favourite database to OSM, because nobody except you can understand this ID and do useful things with it. It just confuses mappers and makes it more difficult to edit the data. For every change somebody does to the data they have to know what this tag means so that they can properly do their edit. And if they don't, people will just mess up your data and you will not be able to use this ID for syncing the data anyway.

And in this case I don't even see why you need it. You have street names and postal codes in both OSM and the Icelandic postal code database. If something changes you can find out which combinations changed and apply those changes to OSM easily just based on the postal code and street name. There is no need for those IDs.

And, btw, you should not use the associatedStreet relation. It solves the same problem as the addr:street tags on nodes and buildings but in a much more complicated way. The overwhelming majority of all addresses are tagged with addr:street (there are nearly 15 million addr:street tags vs. only 18,000 associatedStreet relations).

Jochen
--
Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk
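The ID-based upkeep mentioned at the top of this message (a stable source ID on every address node, used to decide what to create, update or delete) could be planned roughly like this. The sketch is only an illustration and not the actual Faroe Islands import code; the function name `plan_sync`, the dictionary layout and the sample records are all assumptions.

```python
def plan_sync(osm_by_id, source_by_id):
    """Split records into create/update/delete sets, keyed by source ID.

    Both arguments map a source ID (the value of a tag such as
    us.fo:Adressutal) to a dict of address tags. Hypothetical helper.
    """
    # In the source but not in OSM: needs a new node.
    create = {k: v for k, v in source_by_id.items() if k not in osm_by_id}
    # In both, but the tags differ: needs an update.
    update = {
        k: v
        for k, v in source_by_id.items()
        if k in osm_by_id and osm_by_id[k] != v
    }
    # In OSM but gone from the source: candidate for deletion.
    delete = sorted(k for k in osm_by_id if k not in source_by_id)
    return create, update, delete


# Made-up sample records.
osm = {
    "1001": {"addr:street": "Gamla gøta", "addr:housenumber": "1"},
    "1002": {"addr:street": "Gamla gøta", "addr:housenumber": "3"},
}
source = {
    "1002": {"addr:street": "Gamla gøta", "addr:housenumber": "3A"},
    "1003": {"addr:street": "Nýggja gøta", "addr:housenumber": "2"},
}

create, update, delete = plan_sync(osm, source)
print("create:", sorted(create))
print("update:", sorted(update))
print("delete:", delete)
```

The scheme only works while the ID tag survives manual edits, which is the risk raised in the quoted reply.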
Re: [OSM-talk] Semi-automated edits - postal code database
> From: Christian Quest [mailto:cqu...@openstreetmap.fr]
> Sent: Thursday, October 04, 2012 12:58 AM
> To: talk@openstreetmap.org
> Subject: Re: [OSM-talk] Semi-automated edits - postal code database
>
> 2012/10/4 Jochen Topf:
> > And, btw, you should not use the associatedStreet relation. It solves the same problem as the addr:street tags on nodes and buildings but in a much more complicated way. The overwhelming majority of all addresses are tagged with addr:street (there are nearly 15 million addr:street tags vs. only 18,000 associatedStreet relations).
>
> Direct comparison of number of addr:street tags and associatedStreet relations is not that simple.
> How many addresses are behind the associatedStreet relations?

And how many associatedStreet relations don't have addresses at all? http://www.openstreetmap.org/browse/relation/2523 doesn't have any members except for streets.

A more accurate count would be how many relation members have the role "house" and belong to an associatedStreet relation. The answer is 1128546 objects. Broken down by object type, this is 656010 nodes, 658 relations and 471878 ways.

So there is about a 13:1 preference in the database for addr:street over relations.
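A count like the one in this message can be reproduced on a small extract with a few lines of Python. The sketch below is an assumption about how such a count might be done, not the query that was actually used: it counts each distinct (type, ref) pair with the "house" role once across all associatedStreet relations. The function name `count_house_members` and the sample XML are made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_house_members(osm_xml):
    """Count distinct 'house' members of associatedStreet relations by type."""
    counts = Counter()
    seen = set()
    for rel in ET.fromstring(osm_xml).iter("relation"):
        is_assoc = any(
            t.get("k") == "type" and t.get("v") == "associatedStreet"
            for t in rel.findall("tag")
        )
        if not is_assoc:
            continue
        for member in rel.findall("member"):
            if member.get("role") != "house":
                continue
            key = (member.get("type"), member.get("ref"))
            if key not in seen:  # count an object once even if listed twice
                seen.add(key)
                counts[member.get("type")] += 1
    return counts


# Made-up sample extract.
SAMPLE = """<osm version="0.6">
  <relation id="1">
    <member type="node" ref="10" role="house"/>
    <member type="way" ref="20" role="house"/>
    <member type="way" ref="30" role="street"/>
    <tag k="type" v="associatedStreet"/>
  </relation>
  <relation id="2">
    <member type="node" ref="10" role="house"/>
    <tag k="type" v="associatedStreet"/>
  </relation>
  <relation id="3">
    <member type="way" ref="40" role="house"/>
    <tag k="type" v="multipolygon"/>
  </relation>
</osm>"""

print(dict(count_house_members(SAMPLE)))
```

With the figures quoted in the message, 656010 + 658 + 471878 gives the stated 1128546 objects, and 15 million divided by 1128546 is about 13.3, which is where the 13:1 ratio comes from.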
Re: [OSM-talk] Semi-automated edits - postal code database
On 04.10.2012 14:53, Ed Loach wrote:
> But how many of the 15 million are the results of imports taking the easy way of using addr:street? Taginfo lists combinations and we have 2.3 million that also have osak: tags, 0.8 million that also have kms: tags, then lesser combinations such as uir_adr:ADRESA_KOD, usar_addr:edit_date, mvdgis:cod_nombre, chicago:building_id and surrey:addrid and that's only got me to page 5 of 519 of the combinations.

It is true that probably a lot of these are imports. But this might be true for both tagging styles, and you also have to account for the JOSM plugins where the authors decided to automatically create relations. They don't necessarily set tags like that, so they are harder to filter out.

> Then you have all the people who have used addr:street instead of the relation because it seems the more popular option, perhaps only because of those imports.

Then you have all the people who believe that "relations are easier to use for computers" - after all, why would anyone use that confusing concept otherwise? -, and therefore suffer through them because they mistakenly believe that it makes their data "better". Or they think that addr:street is outdated because relations as a whole are newer than other elements. (I've encountered both of these beliefs.)

Imo, addr:street is more straightforward to understand, makes the common beginner task of entering or fixing an address much more accessible and is therefore preferable over relations. The number of uses is hard to measure, but doesn't really affect these basic arguments anyway.

To me, it's associatedStreet which seems out of place in OSM tagging, and that's not just because it uses camelCase for some reason. ;)

Tobias
Re: [OSM-talk] Semi-automated edits - postal code database
> Okay sorry. Worldwide we have about 16 million addr:housenumber tags and about 15 million addr:street tags. So there is no addr:street for about 1 million housenumbers. Presumably that's because they are members in an associatedStreet relation. (It could also be because it is easy to find the right street, because it is the one next to the house, but let's ignore those cases.) So it's still less than 10%.

But how many of the 15 million are the results of imports taking the easy way of using addr:street? Taginfo lists combinations and we have 2.3 million that also have osak: tags, 0.8 million that also have kms: tags, then lesser combinations such as uir_adr:ADRESA_KOD, usar_addr:edit_date, mvdgis:cod_nombre, chicago:building_id and surrey:addrid and that's only got me to page 5 of 519 of the combinations.

Then you have all the people who have used addr:street instead of the relation because it seems the more popular option, perhaps only because of those imports.

Ed
Re: [OSM-talk] Semi-automated edits - postal code database
On Thu, Oct 04, 2012 at 09:58:02AM +0200, Christian Quest wrote:
> 2012/10/4 Jochen Topf:
> > And, btw, you should not use the associatedStreet relation. It solves the same problem as the addr:street tags on nodes and buildings but in a much more complicated way. The overwhelming majority of all addresses are tagged with addr:street (there are nearly 15 million addr:street tags vs. only 18,000 associatedStreet relations).
>
> Direct comparison of number of addr:street tags and associatedStreet relations is not that simple.

Okay sorry. Worldwide we have about 16 million addr:housenumber tags and about 15 million addr:street tags. So there is no addr:street for about 1 million housenumbers. Presumably that's because they are members in an associatedStreet relation. (It could also be because it is easy to find the right street, because it is the one next to the house, but let's ignore those cases.) So it's still less than 10%.

Jochen
--
Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298
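The "less than 10%" estimate above can be checked with a couple of lines of Python. The figures are the rounded ones quoted in the message; the function name `share_without_street` is made up. As the message itself notes, attributing every housenumber without addr:street to a relation is a simplifying assumption, so the result is an upper bound.

```python
def share_without_street(housenumber_tags, street_tags):
    """Return (count, fraction) of housenumbers lacking an addr:street tag."""
    missing = housenumber_tags - street_tags
    return missing, missing / housenumber_tags


# Rounded worldwide figures quoted in the message.
missing, share = share_without_street(16000000, 15000000)
print(missing, "housenumbers without addr:street,", format(share, ".2%"))
```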
Re: [OSM-talk] Semi-automated edits - postal code database
2012/10/4 Jochen Topf:
> And, btw, you should not use the associatedStreet relation. It solves the same problem as the addr:street tags on nodes and buildings but in a much more complicated way. The overwhelming majority of all addresses are tagged with addr:street (there are nearly 15 million addr:street tags vs. only 18,000 associatedStreet relations).

Direct comparison of number of addr:street tags and associatedStreet relations is not that simple.
How many addresses are behind the associatedStreet relations?

For example in France, we currently have:
- 27730 associatedStreet relations
- 472941 members with the "house" role
- 395895 of 761051 nodes (52%) and 78541 of 187193 ways (42%) with "addr:housenumber" are in these relations

So in total about 50% of addresses are in associatedStreet relations.

This is also due to the JOSM plugin we use to simplify creating addresses, which automatically takes care of all the associatedStreet relation stuff.

We also developed quality assurance analysis in our "osmose" tool to make sure the addresses are coherent (unique addr:housenumber in one relation, unique relation for one addr:street in a town, limited distance between addr:housenumber nodes/ways and the street highway, etc).

--
Christian Quest - OpenStreetMap France - http://openstreetmap.fr/u/cquest
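To illustrate the first of the coherence checks listed above (a house number should appear only once within one relation), here is a small Python sketch. It is not osmose's implementation; the function name `duplicate_housenumbers`, the input layout (relation ID mapped to the house numbers of its "house" members) and the sample data are assumptions.

```python
from collections import Counter


def duplicate_housenumbers(relations):
    """Map each relation ID to the house numbers that occur more than once in it."""
    problems = {}
    for rel_id, numbers in relations.items():
        repeated = sorted(n for n, count in Counter(numbers).items() if count > 1)
        if repeated:
            problems[rel_id] = repeated
    return problems


# Made-up sample: relation ID -> addr:housenumber values of its house members.
SAMPLE = {
    "r1": ["1", "3", "3", "5"],
    "r2": ["2", "4"],
}

print(duplicate_housenumbers(SAMPLE))
```

A real check would probably need exceptions, for instance when a building and its entrance node both carry the same number.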
Re: [OSM-talk] Semi-automated edits - postal code database
Hi!

On Wed, Oct 03, 2012 at 11:10:05AM +, Svavar Kjarrval wrote:
> I'm trying to find a good method to maintain data from outside sources. The data in question is the Icelandic postal code database (which they say we may use freely). My searches on the OSM wiki have been fruitless so far.
>
> The idea is to maintain the data in associatedStreet relations. Each relation has a tag called 'götuskrá:id' whose value is a direct reference to the row ID in the files we retrieve from the postal company's website. The file formats available are CSV and XML 1.0. The script would presumably go over each associatedStreet relation and make any changes (if appropriate) when a götuskrá:id tag is found. The output could be an OSM change file loaded into an editor like JOSM to be uploaded manually. An automated process could come later, once we're confident that everything is done correctly, and of course after submitting the script(s) for review by the local community.

It is not a good idea to add some random ID of your favourite database to OSM, because nobody except you can understand this ID and do useful things with it. It just confuses mappers and makes it more difficult to edit the data. For every change somebody does to the data they have to know what this tag means so that they can properly do their edit. And if they don't, people will just mess up your data and you will not be able to use this ID for syncing the data anyway.

And in this case I don't even see why you need it. You have street names and postal codes in both OSM and the Icelandic postal code database. If something changes you can find out which combinations changed and apply those changes to OSM easily just based on the postal code and street name. There is no need for those IDs.

And, btw, you should not use the associatedStreet relation. It solves the same problem as the addr:street tags on nodes and buildings but in a much more complicated way. The overwhelming majority of all addresses are tagged with addr:street (there are nearly 15 million addr:street tags vs. only 18,000 associatedStreet relations).

Jochen
--
Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298
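The ID-free approach suggested in this message (compare the street name and postal code combinations on both sides and act on the differences) amounts to a set difference. The Python sketch below is one possible reading of it, not code from the thread; the function name `diff_combinations` and the sample pairs are made up.

```python
def diff_combinations(osm_pairs, db_pairs):
    """Compare (street, postcode) combinations from OSM and the external database.

    Returns two sorted lists: pairs missing from OSM, and pairs in OSM
    that the database no longer contains.
    """
    osm_pairs, db_pairs = set(osm_pairs), set(db_pairs)
    missing_in_osm = sorted(db_pairs - osm_pairs)
    stale_in_osm = sorted(osm_pairs - db_pairs)
    return missing_in_osm, stale_in_osm


# Made-up sample combinations.
osm_pairs = [("Laugavegur", "100"), ("Hverfisgata", "101")]
db_pairs = [("Laugavegur", "101"), ("Hverfisgata", "101")]

missing, stale = diff_combinations(osm_pairs, db_pairs)
print("missing in OSM:", missing)
print("stale in OSM:", stale)
```

One consequence of dropping the IDs is that a changed postal code or a renamed street shows up as one stale pair plus one missing pair, so the two lists have to be matched up by a person or by an extra heuristic.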
[OSM-talk] Semi-automated edits - postal code database
Hi.

I'm trying to find a good method to maintain data from outside sources. The data in question is the Icelandic postal code database (which they say we may use freely). My searches on the OSM wiki have been fruitless so far.

The idea is to maintain the data in associatedStreet relations. Each relation has a tag called 'götuskrá:id' whose value is a direct reference to the row ID in the files we retrieve from the postal company's website. The file formats available are CSV and XML 1.0. The script would presumably go over each associatedStreet relation and make any changes (if appropriate) when a götuskrá:id tag is found. The output could be an OSM change file loaded into an editor like JOSM to be uploaded manually. An automated process could come later, once we're confident that everything is done correctly, and of course after submitting the script(s) for review by the local community.

I can write the script myself in Python if necessary, but decided to find out first whether somebody has already done the work.

With regards,
Svavar Kjarrval
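The proposal above (walk every associatedStreet relation, look up its götuskrá:id in the postal company's file, and note what differs) might look roughly like this in Python. This is a sketch under stated assumptions: the layout of the real CSV file is not given in the thread, so the column names `id`, `street` and `postcode` are hypothetical, as are the function name `propose_changes`, the choice of compared tags and the sample data.

```python
import csv
import io
import xml.etree.ElementTree as ET


def propose_changes(osm_xml, csv_text):
    """List (relation id, tag key, old value, new value) for mismatching relations."""
    # Index the external rows by their row ID (hypothetical column name "id").
    rows = {row["id"]: row for row in csv.DictReader(io.StringIO(csv_text))}
    proposals = []
    for rel in ET.fromstring(osm_xml).iter("relation"):
        tags = {t.get("k"): t.get("v") for t in rel.findall("tag")}
        if tags.get("type") != "associatedStreet":
            continue
        row = rows.get(tags.get("götuskrá:id"))
        if row is None:
            continue  # no ID tag, or the ID is unknown to the database
        # Hypothetical mapping from OSM tag keys to CSV columns.
        for key, column in (("name", "street"), ("addr:postcode", "postcode")):
            if tags.get(key) != row[column]:
                proposals.append((rel.get("id"), key, tags.get(key), row[column]))
    return proposals


# Made-up sample data.
SAMPLE_OSM = """<osm version="0.6">
  <relation id="10" version="1">
    <tag k="type" v="associatedStreet"/>
    <tag k="götuskrá:id" v="42"/>
    <tag k="name" v="Laugavegur"/>
    <tag k="addr:postcode" v="100"/>
  </relation>
</osm>"""
SAMPLE_CSV = "id,street,postcode\n42,Laugavegur,101\n"

for proposal in propose_changes(SAMPLE_OSM, SAMPLE_CSV):
    print(proposal)
```

The function only lists proposed changes, leaving the writing of a change file and the manual review in an editor as separate steps, in line with the cautious rollout described in the message.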