Re: [Talk-us] Imports information on the wiki

2012-01-02 Thread Serge Wroclawski
On Mon, Jan 2, 2012 at 6:22 PM, Frederik Ramm  wrote:

> I think that even imports that are well executed *technically* are usually
> bad because they worsen the ratio of "mapper hours available to maintain
> data" to "amount of data requiring maintenance".
>
> Imports should only be allowed if there is a realistic expectation that the
> presence of the imported data will lead to a growth in our community of
> about the number of people that would have been required to survey the
> imported data in the first place.

I think in this case, Martijn is reacting to the sheer number of half
completed imports and fixbots in the US that have left areas half
completed or half-right.

It would be easy to say "Manually fix the data", but I can tell you
from experience that going around and manually changing "Rd" to "Road"
is not fun. With the TIGER imports, the expansion can be automated
safely (by looking at other tags and being careful with the expansion
regex). This has been done for part of the country, but not the rest.
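
As a rough illustration of "looking at other tags", here is a minimal
Python sketch. It assumes the way still carries the tiger:name_type tag
from the TIGER import, and it uses a short, made-up whitelist; it is not
the code of any bot that was actually run.

```python
import re

# Illustrative whitelist only; a real run would need an agreed list.
SUFFIXES = {"Rd": "Road", "Ave": "Avenue", "Dr": "Drive", "Ln": "Lane"}


def expand_name(tags):
    """Return the expanded name, or None if the expansion is not clearly safe."""
    name = tags.get("name", "")
    name_type = tags.get("tiger:name_type")
    if name_type not in SUFFIXES:
        return None  # no TIGER hint, or a suffix we do not trust
    # Only touch the abbreviation when it is the final word of the name.
    pattern = re.compile(r"\b" + re.escape(name_type) + r"\.?$")
    if not pattern.search(name):
        return None  # name no longer ends with the TIGER suffix; leave it alone
    return pattern.sub(SUFFIXES[name_type], name)
```

A way whose name was edited by hand, or that has no TIGER tags at all,
simply comes back as None and is left for a human.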

Similarly, here in the mid-Atlantic region, we have several imports
which were both left incomplete and run twice. I've spent many hours
manually examining two polygons of the same geometry (some of which
share the same nodes, others of which do not) only to remove one.

Having users do these kinds of operations adds nothing but "busywork",
and is error-prone (in the second case, I've removed both sets of
polygons by accident, for example).
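
For what it's worth, finding candidate pairs like these can be sketched
in a few lines of Python. This is only an assumption about how one might
do it: the input format (way id mapped to a list of (lat, lon) pairs) and
the function names are invented here, and it assumes simple rings that do
not visit the same point twice. It compares coordinates rather than node
ids, so it catches both the shared-node and the separate-node cases.

```python
def canonical_ring(coords, precision=7):
    """Normalize a closed ring so that equal shapes compare equal.

    coords is a list of (lat, lon) pairs; the last point may repeat the first.
    """
    pts = [(round(lat, precision), round(lon, precision)) for lat, lon in coords]
    if len(pts) > 1 and pts[0] == pts[-1]:
        pts = pts[:-1]  # drop the repeated closing point
    if not pts:
        return ()

    def rotate_to_min(seq):
        # Start the ring at its smallest point so the start node does not matter.
        i = seq.index(min(seq))
        return tuple(seq[i:] + seq[:i])

    # Try both winding directions and keep the smaller tuple.
    return min(rotate_to_min(pts), rotate_to_min(pts[::-1]))


def find_duplicates(ways):
    """Return lists of way ids whose rings have identical geometry."""
    seen = {}
    for way_id, coords in ways.items():
        seen.setdefault(canonical_ring(coords), []).append(way_id)
    return [sorted(ids) for ids in seen.values() if len(ids) > 1]
```

The sketch only reports the groups; deciding which copy to delete is
deliberately left to a person.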

I think Martijn's focus on cleaning up the imports, especially in the
US, should be welcome and encouraged.

- Serge

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Imports information on the wiki

2012-01-02 Thread Frederik Ramm

Hi,

On 01/03/2012 12:01 AM, Mike N wrote:

> Mostly because of customary resistance to automatic imports, which is
> rooted in bad imports.


I think that even imports that are well executed *technically* are 
usually bad because they worsen the ratio of "mapper hours available to 
maintain data" to "amount of data requiring maintenance".


Imports should only be allowed if there is a realistic expectation that 
the presence of the imported data will lead to a growth in our community 
of about the number of people that would have been required to survey 
the imported data in the first place.


Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [Talk-us] Imports information on the wiki

2012-01-02 Thread Mike N



 > What in particular were the complaints made against the fixbot?


  Here's a link to the full thread:

http://lists.openstreetmap.org/pipermail/talk-us/2010-April/thread.html

 See the "Street Naming Conventions" threads.



Re: [Talk-us] Imports information on the wiki

2012-01-02 Thread Mike N

Continued from tagging...

On 1/2/2012 11:40 AM, Martijn van Exel wrote:
> What makes you think that the current common opinion is against
> automatic expansion?

  Mostly because of customary resistance to automatic imports, which is 
rooted in bad imports.   Josh has noted in Tagging that it could give a 
false impression of quality to others - I don't agree because I've never 
considered an expanded name a measure of quality or synonymous with the 
removal of the "tiger:reviewed=no" flag.


> What in particular were the complaints made against the fixbot?
  Some of the complaints came from the few obscure cases that don't yield 
to automation: the city of St Louis has several cases of

  "St Louis Street"
  "Saint Louis Street"
 style duplication.   The streets are physically not connected and you 
must use the correct naming convention to go to the right place.

Also -  "St Park Rd"; is this "Saint Park Road" or "State Park Road"?


> Can/should the fixbot be improved to take them into account?
   I believe we could let it run if everyone agrees on the safe expansions,
 for example "* Rd" -> "* Road".
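
 To make the idea of "safe expansions" concrete, here is a small Python
sketch. The whitelist and the function name are assumptions for
illustration only, not what the fixbot did: it expands a trailing suffix
from an agreed list and flags any name containing "St" for manual review
instead of guessing between Saint, State and Street.

```python
# Hypothetical agreed whitelist; "St" is deliberately not in it.
SAFE_SUFFIXES = {"Rd": "Road", "Ave": "Avenue", "Blvd": "Boulevard"}
# Abbreviations with more than one plausible meaning are never auto-expanded.
AMBIGUOUS = {"St": ("Saint", "State", "Street")}


def expand_safely(name):
    """Return (new_name, needs_review) for a street name."""
    words = name.split()
    if not words:
        return name, False
    # Flag the name for a human if any word is an ambiguous abbreviation.
    needs_review = any(w in AMBIGUOUS for w in words)
    # Expand only the last word, and only if it is on the whitelist.
    if words[-1] in SAFE_SUFFIXES:
        words[-1] = SAFE_SUFFIXES[words[-1]]
    return " ".join(words), needs_review
```

 With this, "St Park Rd" becomes "St Park Road" and is still flagged for
review, while "Saint Louis Street" passes through untouched.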

  And despite it being a bot, the author put in a significant block 
of time to review each upload for correctness and marked many errors 
that were clearly rooted in TIGER data.


 There is some minor discrepancy between TIGER abbreviations and common 
street sign / official USPS abbreviations also: Pky -> Pkwy = Parkway


> especially if a fixbot was
> only run on half the country, as was the case with the name expansion.

 Ironically, having a split-country situation forces data consumers to 
handle both the abbreviated and expanded cases (mainly Nominatim today). 
  Even if the entire US data is expanded, that situation will continue 
to exist as new mappers arrive who have never dealt with anything but 
abbreviations on maps, street signs, or addresses.   They may even 
re-abbreviate expanded names.   Otherwise, we would need bots to run 
behind them and clean up new contributions to make them usable, or 
waste other mappers' time to do it manually.   But manually entered road 
names cannot be automatically expanded, since those will very seldom if 
ever have directional hints like TIGER data has.


  Nominatim should still continue to handle both cases.  Although it 
has been said that map rendering is the place to abbreviate the full 
name, no map renderer currently does this as far as I know.  If there is 
ever a custom US style OSM renderer, that would be a logical feature.
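
  As a sketch of what "handling both cases" can mean for a data 
consumer, the Python below folds the TIGER, USPS and spelled-out forms 
into one comparison key.  The table and function name are made up for 
this example; this is not a description of how Nominatim works.

```python
# Hypothetical variant table: every known spelling maps to one canonical form.
VARIANTS = {
    "rd": "road", "road": "road",
    "ave": "avenue", "avenue": "avenue",
    "pky": "parkway", "pkwy": "parkway", "parkway": "parkway",
}


def match_key(name):
    """Return a lowercase key that is equal for abbreviated and expanded names."""
    words = name.lower().replace(".", "").split()
    # Only the final word is treated as a street type.
    if words and words[-1] in VARIANTS:
        words[-1] = VARIANTS[words[-1]]
    return " ".join(words)
```

  Two names match when their keys are equal, so "Blue Ridge Pky", 
"Blue Ridge Pkwy" and "Blue Ridge Parkway" all compare the same without 
changing anything in the database.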





Re: [Talk-us] Imports information on the wiki

2012-01-02 Thread Martijn van Exel
2012/1/2 Matthias Meißer :
> On 02.01.2012 20:05, Martijn van Exel wrote:
>
[...]
>>
> Hi Martijn,
>
> thanks for setting this up. Personally I would recommend using the Imports
> main page, so that everybody else can find them. Even /subpages aren't that wise
> (but I was a fan of them, too ;)) as they make it very hard to place links and
> try to build up hierarchies. But a wiki is an ontology and therefore works
> with categories :)

The reason for doing it on a separate page is that it becomes more
manageable. The imports page is huge and can be hard to both edit and
consume for that reason.
By categories, do you mean similar to the 'Users in ...' categories?
That would make sense to me, but would not require maintaining the
Imports page. We could then tag each import description page with
categories [Import] [Country] or [Fixbot] [Country]. I don't know if
MediaWiki knows about administrative boundary hierarchies - if so you
could have [Import] [San_Diego] or similar and still have the import
show up in the auto-generated list of 'Imports in the US'.

I like this approach in general, but just having an auto-generated
list of imports and fixbots without any context is not enough. The
page would need to provide some context, especially for those external
to OSM seeking information. I don't know if / how that could be
achieved?

-- 
martijn van exel
geospatial omnivore
1109 1st ave #2
salt lake city, ut 84103
801-550-5815
http://oegeo.wordpress.com



Re: [Talk-us] Imports information on the wiki

2012-01-02 Thread Matthias Meißer

On 02.01.2012 20:05, Martijn van Exel wrote:

[...]


Hi Martijn,

thanks for setting this up. Personally I would recommend using the 
Imports main page, so that everybody else can find them. Even /subpages 
aren't that wise (but I was a fan of them, too ;)) as they make it very 
hard to place links and try to build up hierarchies. But a wiki is an 
ontology and therefore works with categories :)


bye
Matthias
(user:!i!)





[Talk-us] Imports information on the wiki

2012-01-02 Thread Martijn van Exel
Hi,

There are a lot of imports / fixbots / scripts that have affected the US
data over the years. I am finding this out little by little, asking
questions here and on IRC sometimes. The Road Name Expansion script is
a recent example that was discussed on the tagging list and here.
Another is that there are actually two different GNIS imports. This
makes it hard for mappers to focus their efforts when they first start
out. Also (and just as importantly), external data consumers will have a
really hard time processing and using OSM data if information on
imports and fixbots is not consolidated - especially if a fixbot was
only run on half the country, as was the case with the name expansion.

I propose a wiki editing effort to consolidate information on scripts
and fixbots run in the US. I know that there is a global Import
Catalog page already, and this would partially duplicate the
information found there. I think it makes sense to do it anyway
because:
* The US is a specific and important market for many potential data
consumers, and they need this info in one place
* The US data is particularly strongly shaped by imports and automated edits.

I have made a start here:
http://wiki.openstreetmap.org/wiki/WikiProject_United_States/Imports_And_Automated_Edits
This page would replace or be integrated into the current Data page
http://wiki.openstreetmap.org/wiki/WikiProject_United_States/Data - I
did not want to overhaul that page just yet without engaging in this
discussion first. So:
* US imports and fixbots page: good idea?
* Format of the page: good? Maybe a table format? Include more
information? If so, which?
* Please add / complete / modify and most importantly discuss!

-- 
martijn van exel
geospatial omnivore
1109 1st ave #2
salt lake city, ut 84103
801-550-5815
http://oegeo.wordpress.com
