Dear John,
I'm glad that you joined the discussion. The OpenStreetMap community has
reached some consensus decisions that look like outright nonsense from a
computer scientist's or programmer's usual point of view. So it is
helpful to explain every now and then what counts as common sense here,
and to check whether those decisions are still valid.
> Consistent data is useful and typos and mistakes are commonplace.
> Unifying these so they are machine readable so they are useful is, in
> fact, useful.
Just some examples:
We have streets with house numbers 3, 5, 9, 7, 11, 13.
Is it an obvious mistake? No, it's on purpose, because the house numbers
sometimes are in that order on the ground.
In Germany we have cities with a street named "Cäcilienstraße" and
others with a street named "Cecilienstraße" (both with exactly the same
pronunciation, and both variants of the same name).
The literal translation of "connecting way" into German is
"Verbindungsweg". This is also the official name of a living street in
Siegburg, Germany.
By contrast, these roads are, for good reason, not connected in the database:
http://blog.openstreetmap.de/blog/2013/05/wochennotiz-nr-147/
There was an automatic bot changing road names ending in "...strasse" to
"...straße" (both mean "... street" in German; the second is the standard
spelling). This failed both in Switzerland (where "...strasse" is the
authorized spelling) and on the name "Gleistrasse", which means "railway
track right of way" and only coincidentally contains the substring
"strasse".
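
To illustrate how easily such a rule goes wrong, here is a minimal sketch
in Python. I do not know how the bot was actually implemented; the
function naive_fix and the sample names other than "Gleistrasse" are
made up for illustration.

```python
import re


def naive_fix(name: str) -> str:
    """Replace a trailing 'strasse' with 'straße', as a naive rule would.

    Hypothetical helper, not the code of the actual bot.
    """
    return re.sub(r"strasse$", "straße", name)


# (name, country) pairs; the first two names are invented examples.
samples = [
    ("Hauptstrasse", "DE"),    # the case the rule was written for
    ("Bahnhofstrasse", "CH"),  # correct as written in Switzerland
    ("Gleistrasse", "DE"),     # "Gleis" + "trasse", not a street at all
]

for name, country in samples:
    # The rule rewrites all three names, although only the first is an error.
    print(country, name, "->", naive_fix(name))
```

The rule cannot tell the three cases apart, because the information it
would need (the country, the meaning of the word) is not in the string.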
There are probably more examples. They don't leave much room for
"obvious corrections" that are justified beyond doubt. That's why the
rule exists that mechanical edits are accepted unless somebody complains:
if nobody complains, then the edit was, a posteriori, a correction of the
obvious. We have no a priori criterion for an "obvious correction".
> The "rules" for mechanical edit are frankly ridiculous. Have you read them?
Our most valuable resource is not data but the people who curate their
share of the data. Changing data in a way that might be considered
harmful, or that is unintentionally outright wrong, may drive away those
who keep the data current.
The sometimes rude feedback has been identified as a probable cause of
OpenStreetMap having few contributing women.
So correcting those obvious errors requires communication with the
mappers (male, female, or otherwise) who have made these errors, in a way
that always first encourages them to carry on mapping (hopefully with
fewer mistakes).
On the other hand, a mechanical change of data can be performed as
easily during postprocessing as in the database. In programming this is
known as "don't store information when it is easier to recompute it".
You may earn real fame if you have a good filtering ruleset that
irons out all suspect data. If you publish this as a postprocessing
script, it is useful. If you apply it to iron out the database itself,
justified in 99% of the cases but hitting purposely crafted data in the
other 1%, then you will earn shame instead, because that same script
could be perceived as vandalism.
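
As a rough sketch of what I mean by a postprocessing script: the
correction is applied to a copy on the consumer's side, and the data as
mapped stays untouched. Everything here is an assumption for
illustration: the Street class, the KEEP_AS_IS exception list, and the
premise that each object's country is known.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Street:
    """Hypothetical, simplified stand-in for a named way."""
    name: str
    country: str  # country code, assumed to be known for each object


# Names that only look like misspellings; a consumer can extend this
# list whenever someone reports a wrongly "corrected" name.
KEEP_AS_IS = {"Gleistrasse"}


def normalized(street: Street) -> Street:
    """Return a corrected copy for the consumer; the input is not changed."""
    # This sketch only handles Germany; everything else passes through.
    if street.country != "DE" or street.name in KEEP_AS_IS:
        return street
    if street.name.endswith("strasse"):
        fixed = street.name[: -len("strasse")] + "straße"
        return replace(street, name=fixed)
    return street


mapped = [
    Street("Hauptstrasse", "DE"),
    Street("Bahnhofstrasse", "CH"),
    Street("Gleistrasse", "DE"),
]
for street in mapped:
    print(street.name, "->", normalized(street).name)
```

If the ruleset turns out to be wrong for some name, only the script and
its output need fixing, and no mapper finds their work overwritten.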
It's potentially feasible to postprocess data. It's hard to collect
data. So please don't make collecting data harder. Please make
postprocessing data easier instead.
Best regards,
Roland
_______________________________________________
talk mailing list
talk@openstreetmap.org
https://lists.openstreetmap.org/listinfo/talk