Re: [OSM-talk] Semi-automated edits - postal code database

2012-11-06 Thread Svavar Kjarrval
Hi.

This is an update to an e-mail I sent at the beginning of October to the
talk@osm list regarding updating postal codes in Iceland semi-automatically.

I wanted to let you know I have written the script, which is for Python
3.2. I have not yet submitted data made by the script but I haven't
detected any problems thus far. I have performed some random manual
checks on the output and see nothing wrong with the XML. JOSM didn't
complain when I opened the .osc file.

The input is any valid .osm file and the output is an .osc file
(https://wiki.openstreetmap.org/wiki/Osc) which lists the changes made.
The output can be loaded into an editor and submitted to the OSM server
from there.
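
As an illustration of the .osm-in, .osc-out flow described above, here is a
minimal sketch. It is not the attached script: the helper names set_tag and
build_osc, the sample relation and the postcode values are all assumptions
made up for the example.

```python
import xml.etree.ElementTree as ET

def set_tag(element, key, value):
    """Set a tag on a node/way/relation; return True if anything changed."""
    for tag in element.findall("tag"):
        if tag.get("k") == key:
            if tag.get("v") == value:
                return False
            tag.set("v", value)
            return True
    ET.SubElement(element, "tag", k=key, v=value)
    return True

def build_osc(modified_elements):
    """Wrap already-edited elements in an osmChange <modify> block."""
    root = ET.Element("osmChange", version="0.6", generator="example-sketch")
    modify = ET.SubElement(root, "modify")
    for element in modified_elements:
        # The element keeps its id and version attributes from the input.
        modify.append(element)
    return root

# Invented sample input; a real run would parse an .osm file instead.
SAMPLE_OSM = """<osm version="0.6">
  <relation id="1" version="3">
    <tag k="type" v="associatedStreet"/>
    <tag k="addr:postcode" v="101"/>
  </relation>
</osm>"""

osm_root = ET.fromstring(SAMPLE_OSM)
changed = [rel for rel in osm_root.iter("relation")
           if set_tag(rel, "addr:postcode", "105")]
osc_text = ET.tostring(build_osc(changed), encoding="unicode")
```

Only elements that actually changed end up in the output, which keeps the
.osc small enough to review by hand in an editor before uploading.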

You're free to adapt the script to suit your purpose, but I recommend
that you always check the proposed changes before uploading. The code is
commented well enough that anybody who knows Python should be able to
follow what's going on.

Minimum requirements:
- Enough computer memory. The larger the .osm file, the more memory the
script needs.
- Python 3.
- A working installation of the Osmosis program (
https://wiki.openstreetmap.org/wiki/Osmosis).

- Svavar Kjarrval

On 04/10/12 23:48, Martin Guttesen wrote:
> I have imported all the addresses for Faroe Islands
> and updating them from time to time when there is new data available
> see http://wiki.openstreetmap.org/wiki/Import/Catalogue/usfo
> i keep an Id tag (us.fo:Adressutal) so i can Create/Update or Delete
> address nodes
>
>
> -Original Message- From: Jochen Topf
> Sent: Thursday, October 04, 2012 7:39 AM
> To: Svavar Kjarrval
> Cc: talk@openstreetmap.org
> Subject: Re: [OSM-talk] Semi-automated edits - postal code database
>
> Hi!
>
> On Wed, Oct 03, 2012 at 11:10:05AM +, Svavar Kjarrval wrote:
>> I'm trying to find a good method to maintain data from outside sources.
>> The data in question is the Icelandic postal code database (which they
>> say we may use freely). My searches on the OSM wiki have been fruitless
>> so far.
>>
>> The idea is to maintain the data in associatedStreet relations. Each
>> relation has a tag called 'götuskrá:id' which value is a direct
>> reference to the row ID in the files we retrieve from the postal
>> company's website. The file formats available are CVS and XML 1.0. The
>> script would presumably go ever each associatedStreet relation and make
>> any changes (if appropriate) when a götuskrá:id tag is found. The output
>> could be an OSM change file loaded into an editor like JOSM to be
>> uploaded manually. Maybe an automated process later when we're confident
>> that everything is done correctly, and of course after submitting the
>> script(s) for review by the local community.
>
> It is not a good idea to add some random ID of your favourite database to
> OSM, because nobody except you can understand this ID and do useful things
> with it. It just confuses mappers and make it more difficult to edit the
> data. For every change somebody does to the data they have to know what this
> tag means so that they can properly do their edit. And if they don't, people
> will just mess up your data and you will not be able to use this ID for
> syncing the data anyways.
>
> And in this case I don't even see why you need it. You have street names and
> postal codes in both OSM and the Icelandic postal code database. If something
> changes you can find out which combinations changed and apply those changes
> to OSM easily just based on the postal code and street name. There is no
> need for those IDs.
>
> And, btw, you should not use the associatedStreet relation. It solves the same
> problem as the addr:street tags on nodes and buildings but in a much more
> complicated way. The overwhelming majority of all addresses are tagged with
> addr:street (there are nearly 15 million addr:street tags vs. only 18.000
> associatedStreet relations).
>
> Jochen

#!/usr/bin/env python3.2
# -*- coding: utf-8 -*-

# Copyright 2012, Svavar Kjarrval Lúthersson
# Released under the CC0 license.
# I can be contacted at sva...@kjarrval.is.

# This program performs changes according to predetermined formulas to .osm files
# and outputs a single .osc file which in turn can either be submitted automatically
# by another program (which is not implemented here) or manually with an editor.

# To use it, you must have:
# 1 - An .osm file of the area in question.
# 2 - An Osmosis binary set up and ready to use.

# The reason the script filters instead of working directly on the original file
# is to reduce memory consumption of programs which need to load the complete .osm file into memory.
# If, despite having done proper filtering, the .osm file is still

Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-04 Thread Martin Guttesen

I have imported all the addresses for the Faroe Islands
and update them from time to time when new data is available;
see http://wiki.openstreetmap.org/wiki/Import/Catalogue/usfo
I keep an ID tag (us.fo:Adressutal) so I can create, update or delete
address nodes.



-Original Message-
From: Jochen Topf
Sent: Thursday, October 04, 2012 7:39 AM
To: Svavar Kjarrval
Cc: talk@openstreetmap.org
Subject: Re: [OSM-talk] Semi-automated edits - postal code database

Hi!

On Wed, Oct 03, 2012 at 11:10:05AM +, Svavar Kjarrval wrote:

I'm trying to find a good method to maintain data from outside sources.
The data in question is the Icelandic postal code database (which they
say we may use freely). My searches on the OSM wiki have been fruitless
so far.

The idea is to maintain the data in associatedStreet relations. Each
relation has a tag called 'götuskrá:id' which value is a direct
reference to the row ID in the files we retrieve from the postal
company's website. The file formats available are CVS and XML 1.0. The
script would presumably go ever each associatedStreet relation and make
any changes (if appropriate) when a götuskrá:id tag is found. The output
could be an OSM change file loaded into an editor like JOSM to be
uploaded manually. Maybe an automated process later when we're confident
that everything is done correctly, and of course after submitting the
script(s) for review by the local community.


It is not a good idea to add some random ID of your favourite database to
OSM, because nobody except you can understand this ID and do useful things
with it. It just confuses mappers and make it more difficult to edit the
data. For every change somebody does to the data they have to know what this
tag means so that they can properly do their edit. And if they don't, people
will just mess up your data and you will not be able to use this ID for
syncing the data anyways.

And in this case I don't even see why you need it. You have street names and
postal codes in both OSM and the Icelandic postal code database. If something
changes you can find out which combinations changed and apply those changes
to OSM easily just based on the postal code and street name. There is no
need for those IDs.

And, btw, you should not use the associatedStreet relation. It solves the same
problem as the addr:street tags on nodes and buildings but in a much more
complicated way. The overwhelming majority of all addresses are tagged with
addr:street (there are nearly 15 million addr:street tags vs. only 18.000
associatedStreet relations).

Jochen
--
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/ 
+49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk 





Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-04 Thread Paul Norman
> From: Christian Quest [mailto:cqu...@openstreetmap.fr]
> Sent: Thursday, October 04, 2012 12:58 AM
> To: talk@openstreetmap.org
> Subject: Re: [OSM-talk] Semi-automated edits - postal code database
> 
> 2012/10/4 Jochen Topf :
> > And, btw, you should not use the associatedStreet relation. It solves
> > the same problem as the addr:street tags on nodes and buildings but in
> > a much more complicated way. The overwhelming majority of all
> > addresses are tagged with addr:street (there are nearly 15 million
> > addr:street tags vs. only 18.000 associatedStreet relations).
> 
> Direct comparison of number of addr:street tags and associatedStreet
> relations is not that simple.
> How many addresses are behind the associatedStreet relations ?

And how many associatedStreets don't have addresses at all?
http://www.openstreetmap.org/browse/relation/2523 doesn't have any members
except for streets.

A more accurate count would be how many relation members have the role house
and are also a member of an associatedStreet relation.

The answer is 1128546 objects. Broken down by object type, this is 656010
nodes, 658 relations and 471878 ways.

So there is about a 13:1 preference in the database for addr:street over
relations.
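
The arithmetic behind that ratio can be reproduced in a few lines. This is
only a check of the figures quoted in this thread; the 15 million figure is
Jochen's "nearly 15 million" estimate, so the result is approximate.

```python
# Member counts quoted above, broken down by object type.
nodes, relations, ways = 656010, 658, 471878
house_members = nodes + relations + ways

# "Nearly 15 million" addr:street tags, as quoted earlier in the thread.
addr_street_tags = 15000000

ratio = addr_street_tags / house_members
print("house members: %d, ratio: %.1f to 1" % (house_members, ratio))
```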




Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-04 Thread Tobias Knerr
On 04.10.2012 14:53, Ed Loach wrote:
> But how many of the 15 million are the results of imports taking the
> easy way of using addr:street? Taginfo lists combinations and we
> have 2.3 million that also have osak: tags, 0.8 million that also
> have kms: tags, then lesser combinations such as uir_adr:ADRESA_KOD,
> usar_addr:edit_date, mvdgis:cod_nombre, chicago:building_id and
> surrey:addrid and that's only got me to page 5 of 519 of the
> combinations.

It is true that probably a lot of these are imports. But this might be
true for both tagging styles, and you also have to account for the JOSM
plugins where the authors decided to automatically create relations.
They don't necessarily set tags like that, so they are harder to filter out.

> Then you have all the people who have used addr:street instead of
> the relation because it seems the more popular option, perhaps only
> because of those imports.

Then you have all the people who believe that "relations are easier to
use for computers" - after all, why would anyone use that confusing
concept otherwise? -, and therefore suffer through them because they
mistakenly believe that it makes their data "better".
Or they think that addr:street is outdated because relations as a whole
are newer than other elements. (I've encountered both of these beliefs.)

Imo, addr:street is more straightforward to understand, makes the common
beginner task of entering or fixing an address much more accessible and
is therefore preferable over relations. The number of uses is hard to
measure, but doesn't really affect these basic arguments anyway.

To me, it's associatedStreet which seems out of place in OSM tagging,
and that's not just because it uses camelCase for some reason. ;)

Tobias



Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-04 Thread Ed Loach
> Okay sorry. Worldwide we have about 16 million addr:housenumber tags
> and about 15 million addr:street tags. So there is no addr:street for
> about 1 mio housenumbers.  Presumably thats because they are members
> in an associatedStreet relation. (It could also be because it is easy
> to find the right street, because it is the one next to the house, but
> lets ignore those cases.) So its still less than 10%.

But how many of the 15 million are the results of imports taking the
easy way of using addr:street? Taginfo lists combinations and we
have 2.3 million that also have osak: tags, 0.8 million that also
have kms: tags, then lesser combinations such as uir_adr:ADRESA_KOD,
usar_addr:edit_date, mvdgis:cod_nombre, chicago:building_id and
surrey:addrid and that's only got me to page 5 of 519 of the
combinations.

Then you have all the people who have used addr:street instead of
the relation because it seems the more popular option, perhaps only
because of those imports.

Ed




Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-04 Thread Jochen Topf
On Thu, Oct 04, 2012 at 09:58:02AM +0200, Christian Quest wrote:
> 2012/10/4 Jochen Topf :
> > And, btw, you should not use the associatedStreet relation. It solves the
> > same problem as the addr:street tags on nodes and buildings but in a much
> > more complicated way. The overwhelming majority of all addresses are tagged
> > with addr:street (there are nearly 15 million addr:street tags vs. only
> > 18.000 associatedStreet relations).
> >
> 
> Direct comparison of number of addr:street tags and associatedStreet
> relations is not that simple.

Okay, sorry. Worldwide we have about 16 million addr:housenumber tags and about
15 million addr:street tags. So there is no addr:street for about 1 million
house numbers. Presumably that's because they are members of an associatedStreet
relation. (It could also be because it is easy to find the right street,
because it is the one next to the house, but let's ignore those cases.) So
it's still less than 10%.
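
Spelled out as a quick calculation, using only the rounded estimates given
above (so the result is an upper-bound guess, not a measured value):

```python
housenumber_tags = 16000000  # "about 16 million" addr:housenumber tags
addr_street_tags = 15000000  # "about 15 million" addr:street tags

# House numbers with no addr:street, assumed here to sit in relations.
without_addr_street = housenumber_tags - addr_street_tags
share = without_addr_street / housenumber_tags
print("share without addr:street: %.2f%%" % (share * 100))
```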

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298



Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-04 Thread Christian Quest
2012/10/4 Jochen Topf :
> And, btw, you should not use the associatedStreet relation. It solves the same
> problem as the addr:street tags on nodes and buildings but in a much more
> complicated way. The overwhelming majority of all addresses are tagged with
> addr:street (there are nearly 15 million addr:street tags vs. only 18.000
> associatedStreet relations).
>

Direct comparison of number of addr:street tags and associatedStreet
relations is not that simple.
How many addresses are behind the associatedStreet relations ?

For example in France, we currently have:
- 27730 associatedStreet relations
- 472941 members with the "house" role
- 395895 of 761051 nodes (52%) and 78541 of 187193 ways (42%) with
"addr:housenumber" are in these relations, so in total about 50% of
addresses are in associatedStreet relations.
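
For anyone who wants to check those percentages, a short calculation over
the same figures (nothing here is new data; the variable names are just for
illustration):

```python
# Objects with addr:housenumber in France, and how many of them are
# members of an associatedStreet relation (figures from the list above).
nodes_in_relations, nodes_total = 395895, 761051
ways_in_relations, ways_total = 78541, 187193

node_share = nodes_in_relations / nodes_total
way_share = ways_in_relations / ways_total
total_share = ((nodes_in_relations + ways_in_relations)
               / (nodes_total + ways_total))

print("nodes: %.0f%%, ways: %.0f%%, total: %.0f%%"
      % (node_share * 100, way_share * 100, total_share * 100))
```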

This is also due to the JOSM plugin we use to simplify creating addresses,
which automatically takes care of all the associatedStreet relation
stuff.
We also developed quality assurance analysis in our "osmose" tool to
make sure the addresses are coherent (unique addr:housenumber in one
relation, unique relation for one addr:street in a town, limited
distance between addr:housenumber nodes/ways and the street highway,
etc).
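
To give an idea of the first of those checks (unique house numbers within
one relation), here is a minimal sketch. It is not the osmose code; the
function name, the relation ids and the house numbers are invented for the
example.

```python
from collections import Counter

def duplicate_housenumbers(housenumbers):
    """Return the house numbers that occur more than once, sorted."""
    counts = Counter(housenumbers)
    return sorted(number for number, count in counts.items() if count > 1)

# Hypothetical input: relation id -> house numbers of its "house" members.
relation_members = {
    "100": ["1", "3", "3", "5"],
    "200": ["2", "4"],
}

report = {}
for relation_id, numbers in relation_members.items():
    duplicates = duplicate_housenumbers(numbers)
    if duplicates:
        report[relation_id] = duplicates
```

A real check would have to decide whether "3" and "3a", or differently
formatted numbers, count as duplicates; this sketch compares exact strings.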

-- 
Christian Quest - OpenStreetMap France - http://openstreetmap.fr/u/cquest



Re: [OSM-talk] Semi-automated edits - postal code database

2012-10-03 Thread Jochen Topf
Hi!

On Wed, Oct 03, 2012 at 11:10:05AM +, Svavar Kjarrval wrote:
> I'm trying to find a good method to maintain data from outside sources.
> The data in question is the Icelandic postal code database (which they
> say we may use freely). My searches on the OSM wiki have been fruitless
> so far.
> 
> The idea is to maintain the data in associatedStreet relations. Each
> relation has a tag called 'götuskrá:id' which value is a direct
> reference to the row ID in the files we retrieve from the postal
> company's website. The file formats available are CVS and XML 1.0. The
> script would presumably go ever each associatedStreet relation and make
> any changes (if appropriate) when a götuskrá:id tag is found. The output
> could be an OSM change file loaded into an editor like JOSM to be
> uploaded manually. Maybe an automated process later when we're confident
> that everything is done correctly, and of course after submitting the
> script(s) for review by the local community.

It is not a good idea to add some random ID of your favourite database to
OSM, because nobody except you can understand this ID and do useful things
with it. It just confuses mappers and makes it more difficult to edit the
data. For every change somebody makes to the data they have to know what this
tag means so that they can properly do their edit. And if they don't, people
will just mess up your data and you will not be able to use this ID for
syncing the data anyway.

And in this case I don't even see why you need it. You have street names and
postal codes in both OSM and the Icelandic postal code database. If something
changes you can find out which combinations changed and apply those changes
to OSM easily just based on the postal code and street name. There is no
need for those IDs.
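
A minimal sketch of what I mean, assuming you keep the previous and the
current version of the postal code data as (street name, postal code)
pairs. The function name and the sample pairs are made up for illustration.

```python
def diff_pairs(previous_pairs, current_pairs):
    """Return (added, removed) street/postcode combinations."""
    previous, current = set(previous_pairs), set(current_pairs)
    return sorted(current - previous), sorted(previous - current)

# Invented sample snapshots of (street name, postal code) combinations.
previous_snapshot = [("Laugavegur", "101"), ("Hverfisgata", "101")]
current_snapshot = [("Laugavegur", "101"), ("Hverfisgata", "105")]

added, removed = diff_pairs(previous_snapshot, current_snapshot)
```

A street that shows up in both lists with different postal codes has been
moved to another code; those are the objects to look up in OSM by name and
old postal code.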

And, btw, you should not use the associatedStreet relation. It solves the same
problem as the addr:street tags on nodes and buildings but in a much more
complicated way. The overwhelming majority of all addresses are tagged with
addr:street (there are nearly 15 million addr:street tags vs. only 18,000
associatedStreet relations).

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298



[OSM-talk] Semi-automated edits - postal code database

2012-10-03 Thread Svavar Kjarrval
Hi.

I'm trying to find a good method to maintain data from outside sources.
The data in question is the Icelandic postal code database (which they
say we may use freely). My searches on the OSM wiki have been fruitless
so far.

The idea is to maintain the data in associatedStreet relations. Each
relation has a tag called 'götuskrá:id' whose value is a direct
reference to the row ID in the files we retrieve from the postal
company's website. The file formats available are CSV and XML 1.0. The
script would presumably go over each associatedStreet relation and make
any changes (if appropriate) when a götuskrá:id tag is found. The output
could be an OSM change file loaded into an editor like JOSM to be
uploaded manually. Maybe an automated process later when we're confident
that everything is done correctly, and of course after submitting the
script(s) for review by the local community.
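
Roughly, the lookup step could look like the sketch below. Nothing here
exists yet: the CSV column names (id, street, postcode), the use of
addr:postcode on the relation and the sample data are all assumptions for
the sake of the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

ID_KEY = "götuskrá:id"

def load_postal_rows(csv_file):
    """Index the postal file by row ID (column names are assumed)."""
    return {row["id"]: row for row in csv.DictReader(csv_file)}

def propose_changes(osm_root, rows):
    """List (relation id, old postcode, new postcode) for mismatches."""
    proposals = []
    for relation in osm_root.iter("relation"):
        tags = {t.get("k"): t.get("v") for t in relation.findall("tag")}
        if tags.get("type") != "associatedStreet" or ID_KEY not in tags:
            continue
        row = rows.get(tags[ID_KEY])
        if row is None:
            continue  # ID not in the postal file; leave for manual review
        if tags.get("addr:postcode") != row["postcode"]:
            proposals.append((relation.get("id"),
                              tags.get("addr:postcode"),
                              row["postcode"]))
    return proposals

# Invented sample data standing in for the real files.
SAMPLE_CSV = "id,street,postcode\n42,Laugavegur,105\n"
SAMPLE_OSM = """<osm version="0.6">
  <relation id="10" version="1">
    <tag k="type" v="associatedStreet"/>
    <tag k="götuskrá:id" v="42"/>
    <tag k="addr:postcode" v="101"/>
  </relation>
  <relation id="11" version="1">
    <tag k="type" v="multipolygon"/>
  </relation>
</osm>"""

rows = load_postal_rows(io.StringIO(SAMPLE_CSV))
proposals = propose_changes(ET.fromstring(SAMPLE_OSM), rows)
```

The function only proposes changes; turning them into a change file and
reviewing it in an editor would be a separate step.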

I can write the script myself in Python if necessary, but decided to find
out first whether somebody has already done the work.

With regards,
Svavar Kjarrval


