Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Nathan Edgars II
The process seems obvious to me: check that the name is still what it 
originally was (from the tiger:name_base etc. tags), and if so, use 
those tags to expand abbreviations. (Ignore any with semicolons/colons 
from joining.) If not, set it aside for semi-manual checking. The only 
false positives that are not errors in the TIGER data will be caused by 
someone changing the tiger tags, and if both these and the name were 
changed consistently, the editor probably knew what they were doing.
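
A minimal sketch of the check described above, in Python. Only tiger:name_base is named in this message; the other three tag keys, the lookup tables and the function name are assumptions made for illustration, and a way with joined values is simply set aside here rather than repaired.

```python
# Hypothetical lookup tables; a real run would use the full TIGER lists.
TYPES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive", "Blvd": "Boulevard"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}

# Assumed tag keys, in the order they appear in a name.
TIGER_KEYS = ("tiger:name_direction_prefix", "tiger:name_base",
              "tiger:name_type", "tiger:name_direction_suffix")


def expand_from_tiger_tags(tags):
    """Return the expanded name, or None if the way needs manual checking."""
    parts = [tags.get(key, "") for key in TIGER_KEYS]
    # Joined ways carry values such as "St; Ave" or "St:Ave": skip those.
    if any(";" in part or ":" in part for part in parts):
        return None
    prefix, base, road_type, suffix = parts
    original = " ".join(part for part in parts if part)
    if not base or tags.get("name") != original:
        return None  # the name no longer matches what the tiger tags imply
    expanded = [DIRECTIONS.get(prefix, prefix), base,
                TYPES.get(road_type, road_type),
                DIRECTIONS.get(suffix, suffix)]
    return " ".join(part for part in expanded if part)
```

Anything that comes back as None would go to the semi-manual pile.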


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Anthony
On Sat, May 12, 2012 at 4:47 PM, Anthony  wrote:
> On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski  wrote:
>> It checks suffixes starting from the
>> end, so if you have "St something St E" or "St something St East",
>> it'll only check "E" or "East" and then "St" and then stop because
>> "something" is not a known suffix.
>
> So "Calle Ave Maria" will be expanded to "Calle Avenue Maria"?

Never mind.  No.  It won't.  Because Maria is not a known suffix, right?



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Anthony
On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski  wrote:
> It checks suffixes starting from the
> end, so if you have "St something St E" or "St something St East",
> it'll only check "E" or "East" and then "St" and then stop because
> "something" is not a known suffix.

So "Calle Ave Maria" will be expanded to "Calle Avenue Maria"?



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Anthony
On Sat, May 12, 2012 at 4:21 PM, andrzej zaborowski  wrote:
> On 11 May 2012 22:17, Dale Puch  wrote:
>> I understand the script checks for only one instance of the abbreviation.
>> My point was what if someone manually expanded ONE of the abbreviations,
>> leaving "st something street"?  Is that checked for?  The question also
>> applies to "Dr something Dr" previously changed to "Dr something Drive", and
>> possibly directionals as well.  Serge seems to be doing a good job with
>> this, and this is just feedback so there aren't any incorrect expansions.
>
> The way the old script deals with those is that it has a list of
> abbreviations that come as a suffix and those that come as a prefix,
> from the TIGER documentation.

What about "Avenue", which comes as both?

> It checks suffixes starting from the
> end, so if you have "St something St E" or "St something St East",
> it'll only check "E" or "East" and then "St" and then stop because
> "something" is not a known suffix.

How does it handle "Avenue N" and "Avenue S"?



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread andrzej zaborowski
On 11 May 2012 22:17, Dale Puch  wrote:
> I understand the script checks for only one instance of the abbreviation.
> My point was what if someone manually expanded ONE of the abbreviations,
> leaving "st something street"?  Is that checked for?  The question also
> applies to "Dr something Dr" previously changed to "Dr something Drive", and
> possibly directionals as well.  Serge seems to be doing a good job with
> this, and this is just feedback so there aren't any incorrect expansions.

The way the old script deals with those is that it has a list of
abbreviations that come as a suffix and those that come as a prefix,
from the TIGER documentation.  It checks suffixes starting from the
end, so if you have "St something St E" or "St something St East",
it'll only check "E" or "East" and then "St" and then stop because
"something" is not a known suffix.

There are cases where something can be both a suffix and a prefix, but
those cases are known from the TIGER documentation.

Note that "St something St" can be "Saint something St", but it
can also be "State something St".  The script uses a list of names
that can be a saint's and a list of those that can be state-owned.
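
A rough Python sketch of the right-to-left suffix scan described above. The table is a small illustrative subset, not the list from the TIGER documentation, and the function name is made up for this example; the Saint/State distinction is left out.

```python
# Illustrative subset only; the real lists come from the TIGER documentation.
SUFFIXES = {
    "St": "Street", "Ave": "Avenue", "Dr": "Drive", "Ct": "Court",
    "N": "North", "S": "South", "E": "East", "W": "West",
}
KNOWN_FULL = set(SUFFIXES.values())


def expand_trailing(name):
    """Expand known suffix abbreviations, scanning from the end of the name."""
    words = name.split()
    i = len(words) - 1
    while i >= 0:
        word = words[i]
        if word in SUFFIXES:
            words[i] = SUFFIXES[word]
        elif word not in KNOWN_FULL:
            break  # the first word that is not a known suffix ends the scan
        i -= 1
    return " ".join(words)
```

With this sketch both "St something St E" and "St something St East" come out as "St something Street East", and the leading "St" is never looked at. The same naive loop would also turn "Avenue N" into "Avenue North", so letter-named streets would need separate handling.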

Cheers



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Werner Poppele

Dale Puch wrote:
Lots of weird ones from Florida.  Many should not give you an issue due
to how you're processing, but it is best to test them anyhow.  Also it
might be a good reference when looking at other expansions after this
runs.

way id="10761946"  "name" v="E 10th Ct E"
way id="10763539"  "name" v="E 10th St E"
way id="10759486"  "name" v="E 14th Pl E"
way id="11018453"  "name" v="E 1st Avenue Pl" <-- not really a 
problem, just... odd
way id="10763214"  "name" v="E 40th Pz  E" <-- Note the double space 
before E

way id="10966845"  "name" v="E Camp N Comfort Ln" <-- Non directional N
way id="11210989"  "name" v="E Canal St N"
way id="10967404"  "name" v="E Dr"
way id="10974755"  "name" v="E Dr Martin Luther King Jr Blvd"
way id="11278916"  "name" v="E H St E"
way id="10965707"  "name" v="E Ln"
way id="11242732"  "name" v="E Martin Luther King Jr Dr"
way id="11102139"  "name" v="E Pl"
way id="10959109"  "name" v="E St Andrews Dr"
way id="10827576"  "name" v="E St James Loop" <-- I guess Tiger did 
not abbreviate loop

way id="11272826"  "name" v="E St Johns St"
way id="11021472"  "name" v="E St Louis Ave"
way id="11065801"  "name" v="E W Reeves Rd"
way id="103599461"  "name" v="E. Watson Road" <-- Not a tiger import
way id="10983188"  "name" v="East North Street" <-- already expanded tiger
way id="11270447"  "name" v="East North St" <-- not expanded
way id="11274418"  "name" v="Edwin St N  E"
way id="10851149"  "name" v="Egret's Walk Cir S" <-- In case the 
's causes problems

way id="10808177"  "name" v="Ellesmere  E"
way id="10951424"  "name" v="Ave del Ctr"
way id="11288799"  "name" v="Avenue E  N"
way id="10939680"  "name" v="Avenue N"
way id="11285084"  "name" v="Avenue N  NW"
way id="11097378"  "name" v="Dr"
way id="10812824"  "name" v="Dr Faruqui Dr"
way id="11358527"  "name" v="Dr Joe Abal Dr"
way id="10919692"  "name" v="Dr Martin L King Jr Dr"
way id="11128816"  "name" v="N 14th St Pl"
way id="10982651"  "name" v="N 19th Cir SW"
way id="39488514"  "name" v="N 22nd St." <-- non tiger
way id="10885972"  "name" v="N 3rd Street Cir"
way id="10993673"  "name" v="N Blvd"
way id="10807124"  "name" v="N Cortez Dr Cir C"
way id="11371860"  "name_1" v="N Cswy" <-- "name" v="N Causway"
way id="11090351"  "name" v="N E 144th Avenue Rd"
way id="11080981"  "name" v="N E 238 Ave Rd"
way id="11089629"  "name" v="N E 62nd Ct Rd"
way id="10927659"  "name" v="N E St"
way id="11013343"  "name" v="N F S 595-2"
way id="10925619"  "name" v="N N St"
way id="11359562"  "name" v="N N Road"
way id="10921209"  "name" v="N S St"
way id="10880720"  "name" v="N St Andrews St"
way id="10765917"  "name" v="N St Clair St"
way id="10979914"  "name" v="N St Peter St"
way id="11302478"  "name" v="N Swan Ct NE"
way id="10243562"  "name" v="N W 34th St R"
way id="11092219"  "name" v="N W 51 St Ct"
way id="10927760"  "name" v="N W Ave F North"
way id="10763701"  "name" v="N de Gama Ave N"
way id="26630760"  "name" v="N orth22nd Street" <--bad manual edit
way id="27354570"  "name" v="N orthGarcia Avenue" <--bad manual edit
way id="10754189"  "name" v="N-Yellow Pine Cir" <-- "name_1" v="Yellow 
Pine North Cir"

way id="119723334"  "name" v="N. Shingle Lane" <-- non tiger
way id="10983026"  "name" v="N19th Ave" <--  "tiger:name_base" 
v="111th"  Probably due to edits
way id="11058140"  "name" v="NE 40 Ln" <--  "name_1" v="NE 1 St Ave"  
Version 1 tiger
way id="10806770"  "name" v="NE 16th Ter; NE 17th Ave" <-- double name 
possibly from edits

way id="11079312"  "name" v="NE 172 Ave Rd"
way id="11089303"  "name" v="NE 18th Ave; NE 9th St" <-- double name 
possibly from edits
way id="10800930"  "name" v="NE 19th Ter; NE 25th St" <-- double name 
possibly from edits

way id="11100990"  "name" v="NE 196 Ter Rd"
way id="11099492"  "name" v="NE 21st Ter W"
way id="11088248"  "name" v="NE 220th Ave Rd"
way id="11062349"  "name" v="NE 3 Rd Ave"
way id="11081124"  "name" v="NE 36th Av Rd"
way id="11070763"  "name" v="NE Mt Zion A M E Church Ave"
way id="11081908"  "name" v="NE226 Ter"
way id="28931406"  "name" v="NE31st Ave" <-- non tiger
way id="10789444"  "name" v="NE
way id="10788734"  "name" v="NW 10th St Access Rd"
way id="10788581"  "name" v="NW 126th Ave; NW 126th Way"
way id="10242655"  "name" v="NW 141st"
way id="10242241"  "name" v="NW 181 St"
way id="11128828"  "name" v="NW 181st St"
way id="11085308"  "name" v="NW 21st Street"
way id="11082282"  "name" v="NW 221st Street Rd"
way id="10765627"  "name" v="NW 231 St"
way id="11151648"  "name" v="NW 4th Avenue Cir E"
way id="10792992"  "name" v="NW 6th Ave; Blanch Ely Ave"
way id="10809778"  "name" v="NW 71st Pl; NW 71st St"
way id="10928777"  "name" v="NW Avenue G; Avenue G North; NW Avenue G"
way id="11273744"  "name" v="NW Dr"
way id="107757877"  "name" v="NW NW 125th Avenue" <-- non tiger
way id="10246730"  "name" v="NW30Ln" <-- name1 has spaces
way id="11065133"  "name" v="National Forest Rd 141A"
way id="11060010"  "name" v="Nf Rd 354"
way id="11083729"  "name" v="Nfr 75B"

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-12 Thread Nathan Edgars II

On 5/12/2012 2:16 AM, Dale Puch wrote:

way id="11013343" "name" v="N F S 595-2"


National Forest Service Road 595-2. There are so many different ways of 
abbreviating this, and I'm not sure which expansion is most correct, so 
I'd leave these alone.


One thing you won't find much of in Florida (since I've fixed most) is 
abbreviations of County Road/Route/Highway. Stranger ones include Creek 
(bad expansion of Cr, abbreviation for County Road) and Cord (typo for 
Co Rd). Both of these originate with TIGER, not subsequent expansions:


http://www.openstreetmap.org/browse/way/11071477/history name="SE Creek 255"
http://www.openstreetmap.org/browse/way/11071726/history name="NE Cord255"



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Dale Puch
Lots of weird ones from Florida.  Many should not give you an issue due to
how you're processing, but it is best to test them anyhow.  Also it might be
a good reference when looking at other expansions after this runs.
way id="10761946"  "name" v="E 10th Ct E"
way id="10763539"  "name" v="E 10th St E"
way id="10759486"  "name" v="E 14th Pl E"
way id="11018453"  "name" v="E 1st Avenue Pl"  <-- not really a problem,
just... odd
way id="10763214"  "name" v="E 40th Pz  E"  <-- Note the double space
before E
way id="10966845"  "name" v="E Camp N Comfort Ln"  <-- Non directional N
way id="11210989"  "name" v="E Canal St N"
way id="10967404"  "name" v="E Dr"
way id="10974755"  "name" v="E Dr Martin Luther King Jr Blvd"
way id="11278916"  "name" v="E H St E"
way id="10965707"  "name" v="E Ln"
way id="11242732"  "name" v="E Martin Luther King Jr Dr"
way id="11102139"  "name" v="E Pl"
way id="10959109"  "name" v="E St Andrews Dr"
way id="10827576"  "name" v="E St James Loop"  <-- I guess Tiger did not
abbreviate loop
way id="11272826"  "name" v="E St Johns St"
way id="11021472"  "name" v="E St Louis Ave"
way id="11065801"  "name" v="E W Reeves Rd"
way id="103599461"  "name" v="E. Watson Road" <-- Not a tiger import
way id="10983188"  "name" v="East North Street"  <-- already expanded tiger
way id="11270447"  "name" v="East North St" <-- not expanded
way id="11274418"  "name" v="Edwin St N  E"
way id="10851149"  "name" v="Egret's Walk Cir S"  <-- In case the
's causes problems
way id="10808177"  "name" v="Ellesmere  E"
way id="10951424"  "name" v="Ave del Ctr"
way id="11288799"  "name" v="Avenue E  N"
way id="10939680"  "name" v="Avenue N"
way id="11285084"  "name" v="Avenue N  NW"
way id="11097378"  "name" v="Dr"
way id="10812824"  "name" v="Dr Faruqui Dr"
way id="11358527"  "name" v="Dr Joe Abal Dr"
way id="10919692"  "name" v="Dr Martin L King Jr Dr"
way id="11128816"  "name" v="N 14th St Pl"
way id="10982651"  "name" v="N 19th Cir SW"
way id="39488514"  "name" v="N 22nd St." <-- non tiger
way id="10885972"  "name" v="N 3rd Street Cir"
way id="10993673"  "name" v="N Blvd"
way id="10807124"  "name" v="N Cortez Dr Cir C"
way id="11371860"  "name_1" v="N Cswy" <-- "name" v="N Causway"
way id="11090351"  "name" v="N E 144th Avenue Rd"
way id="11080981"  "name" v="N E 238 Ave Rd"
way id="11089629"  "name" v="N E 62nd Ct Rd"
way id="10927659"  "name" v="N E St"
way id="11013343"  "name" v="N F S 595-2"
way id="10925619"  "name" v="N N St"
way id="11359562"  "name" v="N N Road"
way id="10921209"  "name" v="N S St"
way id="10880720"  "name" v="N St Andrews St"
way id="10765917"  "name" v="N St Clair St"
way id="10979914"  "name" v="N St Peter St"
way id="11302478"  "name" v="N Swan Ct NE"
way id="10243562"  "name" v="N W 34th St R"
way id="11092219"  "name" v="N W 51 St Ct"
way id="10927760"  "name" v="N W Ave F North"
way id="10763701"  "name" v="N de Gama Ave N"
way id="26630760"  "name" v="N orth22nd Street"  <--bad manual edit
way id="27354570"  "name" v="N orthGarcia Avenue"  <--bad manual edit
way id="10754189"  "name" v="N-Yellow Pine Cir"  <-- "name_1" v="Yellow
Pine North Cir"
way id="119723334"  "name" v="N. Shingle Lane"  <-- non tiger
way id="10983026"  "name" v="N19th Ave"  <--  "tiger:name_base" v="111th"
Probably due to edits
way id="11058140"  "name" v="NE 40 Ln"  <--  "name_1" v="NE 1 St Ave"
Version 1 tiger
way id="10806770"  "name" v="NE 16th Ter; NE 17th Ave"  <-- double name
possibly from edits
way id="11079312"  "name" v="NE 172 Ave Rd"
way id="11089303"  "name" v="NE 18th Ave; NE 9th St"  <-- double name
possibly from edits
way id="10800930"  "name" v="NE 19th Ter; NE 25th St"  <-- double name
possibly from edits
way id="11100990"  "name" v="NE 196 Ter Rd"
way id="11099492"  "name" v="NE 21st Ter W"
way id="11088248"  "name" v="NE 220th Ave Rd"
way id="11062349"  "name" v="NE 3 Rd Ave"
way id="11081124"  "name" v="NE 36th Av Rd"
way id="11070763"  "name" v="NE Mt Zion A M E Church Ave"
way id="11081908"  "name" v="NE226 Ter"
way id="28931406"  "name" v="NE31st Ave"  <-- non tiger
way id="10789444"  "name" v="NE
way id="10788734"  "name" v="NW 10th St Access Rd"
way id="10788581"  "name" v="NW 126th Ave; NW 126th Way"
way id="10242655"  "name" v="NW 141st"
way id="10242241"  "name" v="NW 181 St"
way id="11128828"  "name" v="NW 181st St"
way id="11085308"  "name" v="NW 21st Street"
way id="11082282"  "name" v="NW 221st Street Rd"
way id="10765627"  "name" v="NW 231 St"
way id="11151648"  "name" v="NW 4th Avenue Cir E"
way id="10792992"  "name" v="NW 6th Ave; Blanch Ely Ave"
way id="10809778"  "name" v="NW 71st Pl; NW 71st St"
way id="10928777"  "name" v="NW Avenue G; Avenue G North; NW Avenue G"
way id="11273744"  "name" v="NW Dr"
way id="107757877"  "name" v="NW NW 125th Avenue"  <-- non tiger
way id="10246730"  "name" v="NW30Ln"  <-- name1 has spaces
way id="11065133"  "name" v="National Forest Rd 141A"
way id="11060010"  "name" v="Nf Rd 354"
way id="11083729"  "name" v="Nfr 75B"
way id="11034257"  "na

Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Serge Wroclawski
On Fri, May 11, 2012 at 4:17 PM, Dale Puch  wrote:
> I understand the script checks for only one instance of the abbreviation.

> My point was what if someone manually expanded ONE of the abbreviations,
> leaving "st something street"?  Is that checked for?

I have a number of thoughts here:

1.  Real world examples.

Many of the examples I've seen are contrived. I'm glad we're testing,
but testing needs to be based on actual data seen in the US dataset.

That said:

2. There are a couple of ways to handle this:

* One way (the most conservative way) would be to test for untouched
TIGER ways, that is, ways that are still at version 1. This
would be a real problem, though, since there are lots of examples where
someone may have fixed the geometry without touching the tags.

* The other way is a method I'm using in an experimental branch of the
code on my machine, which is to try to be a bit more selective about
the expansions of road types. If we assume that the road type always
appears after the base name, we can handle examples like the real-world
"St Marys St". The same would hold true for direction
tags, so we'd be able to expand "E E St" confidently as well.

But there's a catch. If someone had edited the name of the
above street from the original "St Marys St" to "St. Marys St" then
that test would fail, and the expansion would never occur, whereas in
the current version, it would.

So:

3. Any method used is going to produce some number of false positives
or false negatives. I contend that the number of
errors in either case will be so tiny that it will be lost in the
noise, but there's no way to promise it will always be 0. The best we
can do is toss out uncertain expansions and have them handled manually
(which is something I'm working to make better in the next version of
the code as well).

But:

4. I don't want us to rely on cleverness. I'd much rather rely on
people testing the code with real world inputs and checking the
outputs.
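
For anyone who wants something concrete to test against, the position-based idea from point 2 can be sketched in Python as below. This is not the code from the experimental branch; the tables, the function name and the punctuation check are assumptions made for illustration.

```python
# Small illustrative tables, not the real expansion lists.
ROAD_TYPES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive", "Rd": "Road"}
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def expand_by_position(name):
    """Return (expanded_name, None), or (None, reason) for manual review."""
    words = name.split()
    if len(words) < 2:
        return None, "too short to tell base name from road type"
    if any("." in word or ";" in word for word in words):
        return None, "punctuation suggests a manual edit"
    # Only the last word is treated as a road type.
    if words[-1] in ROAD_TYPES:
        words[-1] = ROAD_TYPES[words[-1]]
    # Only the first word is treated as a direction, and only when a
    # base name is left over between it and the road type.
    if words[0] in DIRECTIONS and len(words) >= 3:
        words[0] = DIRECTIONS[words[0]]
    return " ".join(words), None
```

"St Marys St" and "E E St" expand as described above, while the hand-edited "St. Marys St" is tossed to the manual pile, which is the false negative mentioned in the catch.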


I should have a new version of the code either tonight or tomorrow,
with the new expansion rules.

- Serge



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Mike N

On 5/11/2012 5:54 PM, Alan Mintz wrote:

>>   Mapping for the renderer has never been wrong or discouraged.
>> Tagging incorrectly for the renderer is another story...
>
> That is not my recollection.

 Here's the page, complete with history -

http://wiki.openstreetmap.org/wiki/Tagging_for_the_renderer




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz

At 2012-05-11 14:11, Mike N wrote:

> On 5/11/2012 1:36 PM, Alan Mintz wrote:
>>> Okay, so basically we're ignoring the on-the-ground rule in order to
>>> map for the renderer.
>>
>> Exactly :) Why that is ok, I don't know :(
>
>   Mapping for the renderer has never been wrong or discouraged. Tagging
> incorrectly for the renderer is another story...

That is not my recollection.

--
Alan Mintz 




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Mike N

On 5/11/2012 1:36 PM, Alan Mintz wrote:

>> Okay, so basically we're ignoring the on-the-ground rule in order to
>> map for the renderer.
>
> Exactly :) Why that is ok, I don't know :(

  Mapping for the renderer has never been wrong or discouraged.
Tagging incorrectly for the renderer is another story...




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Dale Puch
I understand the script checks for only one instance of the abbreviation.
My point was what if someone manually expanded ONE of the abbreviations,
leaving "st something street"?  Is that checked for?  The question also
applies to "Dr something Dr" previously changed to "Dr something Drive",
and possibly directionals as well.  Serge seems to be doing a good job with
this, and this is just feedback so there aren't any incorrect expansions.
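
To make the question concrete, here is a small Python sketch of a "one and only one abbreviation" rule. It is an assumption about how such a check could look, not the actual script; the table and the function name are invented for the example.

```python
# Illustrative table; position in the name is deliberately ignored.
ABBREVIATIONS = {"St": "Street", "Dr": "Drive", "Ave": "Avenue", "Rd": "Road"}


def expand_single(name):
    """Expand only when exactly one word is a known abbreviation."""
    words = name.split()
    hits = [i for i, word in enumerate(words) if word in ABBREVIATIONS]
    if len(hits) != 1:
        return None  # zero or several candidates: leave the way alone
    index = hits[0]
    words[index] = ABBREVIATIONS[words[index]]
    return " ".join(words)
```

In this sketch "St Marys St" is skipped because it has two candidates, but a half-expanded "St Marys Street" has exactly one left and comes out as "Street Marys Street". That is the kind of case I am asking about.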

On Fri, May 11, 2012 at 12:27 AM, Toby Murray  wrote:

> On Thu, May 10, 2012 at 10:52 PM, Dale Puch  wrote:
> > I think I came up with a rare possibility for error.
> > The original "st something st" was manually expanded to "st something
> > street".  You're checking for a single st, and there would be one.  Or am
> > I missing another check?
>
> It checks for one and ONLY one possible abbreviation to expand. If
> there are more than one it punts and ignores the way. This is a very
> conservative approach which is probably good at least for a first
> pass. Maybe if the first run goes well we can see how many problems
> are left and look at refining things for a second pass to catch more
> difficult ones. Or not...
>
> Toby
>
>



-- 
Dale Puch


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz

At 2012-05-11 10:20, David ``Smith'' wrote:
> Third, I suggest retaining the abbreviated form in a tag like abbr_name.
> Ideally, this should be the exact abbreviated form used on signs, if
> that's consistent.  Getting this right requires local knowledge, but
> TIGER's abbreviation might be better than nothing. I'm sure some will
> disagree with that point.

Better yet, since a proper expansion bot has to chop up the name into its
components, why not take the opportunity to advance the project by tagging
(and re-abbreviating if necessary) those individual components (e.g.
street:dir_prefix, street:name, street:type, street:dir_suffix)? That, I
could support. One field with the full name for the text-to-speech
consumers, and another set of fields to properly identify the street the
way others do.

> Fifth, renderers must take care in abbreviating street names. For example,
> Mapquest Open turns Lane Avenue into Ln Ave, where only the last word
> should be abbreviated. To eliminate guesswork, renderers can use the
> abbr_name tag, if present.

Wouldn't happen with street:name=Lane, street:type=Ave (since it would not
speak street:name verbatim)
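
A hedged Python sketch of the component split suggested above. The four street:* keys are the ones named in this message; the tables, the function name and the "always leave one word for street:name" rule are assumptions made for illustration.

```python
# Illustrative tables mapping full words to standard abbreviations.
DIRECTIONS = {"North": "N", "South": "S", "East": "E", "West": "W"}
TYPES = {"Street": "St", "Avenue": "Ave", "Lane": "Ln", "Boulevard": "Blvd"}


def split_name(full_name):
    """Split an expanded street name into component tags."""
    words = full_name.split()
    tags = {}
    # Peel off a trailing direction, then the type, then a leading
    # direction, always leaving at least one word for street:name.
    if len(words) > 2 and words[-1] in DIRECTIONS:
        tags["street:dir_suffix"] = DIRECTIONS[words.pop()]
    if len(words) > 1 and words[-1] in TYPES:
        tags["street:type"] = TYPES[words.pop()]
    if len(words) > 1 and words[0] in DIRECTIONS:
        tags["street:dir_prefix"] = DIRECTIONS[words.pop(0)]
    tags["street:name"] = " ".join(words)
    return tags
```

With this split, "Lane Avenue" yields street:name=Lane and street:type=Ave, so the base name is never mistaken for a type.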


--
Alan Mintz 




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz

At 2012-05-10 20:45, Dale Puch wrote:
> The issue with abbreviations is very muddy.  BUT it has been said many
> times that we do not want to abbreviate where possible.  There are
> several reasons.
>
> Clarity!  The abbreviations are just that, they mean the full
> word, and are spoken that way, but written and displayed as the
> abbreviation.

Since the cities and postal services publish the abbreviated names in
documents and on signs, and people use them routinely that way in their
own writing, and (in my experience) rarely speak the suffix except where
it matters (in UT, DC, Atlanta), I disagree that expanding them adds any
clarity. Everyone knows what they mean. If anything, we're making an
assumption by expanding them, since we don't "know" that a
leading Dr means Doctor, or St means Saint - we can only assume. I think
we should tag what's on the ground and leave it to the specific case of
accessibility and navigation tools to expand as necessary - this is the
way commercial tools already work, and it's not hard for them to
do.

> It is a LOT easier to abbreviate from the full word than to go the
> other way.  Otherwise this scripting expansion thing would be easy
> and error free.

Slightly easier, maybe. Significantly - I don't think so. I think people
aren't thinking about the problem very well, honestly. The fact that
"St Something St" keeps coming up as being ambiguous is silly
to me. Obviously, there's a difference between the meaning of St in front
and in back. To design any sort of expansion logic without that basic
concept is silly.

> As mentioned it makes use of the data easier, especially for
> searching, and text to speech.

Searching? Why? People are more likely to search for Something Blvd than
Something Boulevard. Text to speech, yes, per above.

There are reasons commercial databases and maps have separate fields for
number, direction prefix, name, type, and direction suffix, and use
standard abbreviations in all but the number and name field. I don't know
why we are trying to re-invent the wheel as an octagon (or maybe square
:) ).

> Yes there can be errors with going from abbreviations to the full
> words.  A reason for doing this as said, do it once with review
> instead of in every program that uses the data.

Why not just publish hashes of the correct expansions in each language
for the consumers to use? Soon after they start to be used, they will
rarely change.

> But those errors are small in comparison to the number of abbreviated
> way names and can be corrected later as found just like any other
> tagging error.  Most of those errors are going to be on names that are
> unclear to begin with.

They are _very_ hard to find. I still stumble across balrog-kun expansion
errors, years later. And I look at a _lot_ of street names in detail
compared to the average mapper.

--
Alan Mintz 





Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 1:35 PM, Alan Mintz
 wrote:
> At 2012-05-10 19:40, Anthony wrote:
>>
>> On Thu, May 10, 2012 at 10:25 PM, Mike N  wrote:
>>
>> >  The only question is what to do about those cases where it's only
>> > referred to locally as 'Ave', and the postal service would refuse
>> > letters addressed to 'Avenue'.
>>
>> The postal service would refuse letters addressed to "Avenue" in some
>> instances?
>
>
> Unless this quote is out of context, that seems ridiculous (in the US).

I very well may have misquoted Mike North.  I'm not sure what he was
trying to say.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 12:26 PM, Minh Nguyen  wrote:
> On 2012-05-11 6:45 AM, Anthony wrote:
>> The only way to capture the full information is to have additional
>> tags telling you what the base is.  And if you do that, abbreviating
>> or not abbreviating doesn't matter.
>
>
> That's similar to how the tiger:* tags are structured, and it's the subject
> of a proposal on the wiki:
>
> http://wiki.openstreetmap.org/wiki/Proposed_features/Directional_Prefix_%26_Suffix_Indication

Well, yes, we should find a way to separate out the parts of the name.
 In addition to facilitating abbreviation, it also facilitates
translation.  We also should include pronunciation information.

But first we need street relations.  It turns out some people already
did research into ways to structure a database which minimizes the
inconsistencies we currently have in the OSM database.  It's called
database normalization.

As it turns out, the current method of putting names on ways already
fails to even be in first normal form.  Some ways represent more than
one road.

The solution is to use relations.
http://wiki.openstreetmap.org/wiki/Relations/Proposed/Street is
somewhat of a good proposal, though I have a little bit of trouble
with the wording.  We shouldn't include "Any Tag that applies to all
parts of the road", but only to those tags which apply to the entire
road as a whole.  In other words, we'd include goods=no if there were
a law saying "no commercial vehicles are allowed on Whatever Parkway",
but not just because all the ways happen to have goods=no tags.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 12:33 PM, Nathan Edgars II  wrote:
> Ave is a bad example, since it is common to say 'Ave' rather than 'Avenue'.

That's what makes it a *good* example :).

> But I can't think of any other suffixes that are commonly abbreviated in
> speech. Maybe Cir?

If we're dealing with English (which not all US street names are in),
then "Ave" is the only major example I'm aware of.  It's also the only
example I'm aware of *at all*, but I haven't looked through any lists
beyond the most common English abbreviations.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz

At 2012-05-10 19:56, Anthony wrote:

> On Thu, May 10, 2012 at 10:45 PM, Mike N  wrote:
>>  But you wouldn't be confused if a stranger came in asking how to get to
>> "Whatever Avenue"?  If not, then there's no problem with the expansion.
>
> Okay, so basically we're ignoring the on-the-ground rule in order to
> map for the renderer.

Exactly :) Why that is ok, I don't know :(

--
Alan Mintz 




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Alan Mintz

At 2012-05-10 19:40, Anthony wrote:

> On Thu, May 10, 2012 at 10:25 PM, Mike N  wrote:
>>  The only question is what to do about those cases where it's only referred
>> to locally as 'Ave', and the postal service would refuse letters addressed
>> to 'Avenue'.
>
> The postal service would refuse letters addressed to "Avenue" in some
> instances?

Unless this quote is out of context, that seems ridiculous (in the US).

--
Alan Mintz 




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread David ``Smith''
First, I apologize for posting my opinion without reading the entire
thread. It's a long thread, so I hope you understand.

Second, I'm okay with automated expansion of unambiguous road-type trailing
words.  Pk can be either park or pike, so leave that to local humans.
Leading road-type words like Avd for Avenida might also make sense, but I'm
not particularly familiar with how this looks in TIGER data.  This is
different from my earliest expressed opinion on the subject; the next point
is part of my revised position.

Third, I suggest retaining the abbreviated form in a tag like abbr_name.
Ideally, this should be the exact abbreviated form used on signs, if that's
consistent.  Getting this right requires local knowledge, but TIGER's
abbreviation might be better than nothing. I'm sure some will disagree with
that point.

Fourth, "Ave" isn't pronounced as-is everywhere. In the midwest we expand
it to Avenue in speech. (My cousin in the Boston area once mentioned "Mass
Ave" referring to Massachusetts Avenue, and to me it sounded like a single
word "Massaf"...)

Fifth, renderers must take care in abbreviating street names. For example,
Mapquest Open turns Lane Avenue into Ln Ave, where only the last word
should be abbreviated. To eliminate guesswork, renderers can use the
abbr_name tag, if present.
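
As a sketch of the fifth point, a renderer-side label function in Python might look like the following. The abbr_name tag is the one suggested above; the table and the function name are assumptions, and no real renderer's API is implied.

```python
# Illustrative table of road types a renderer might shorten.
TYPE_ABBREVIATIONS = {"Avenue": "Ave", "Lane": "Ln", "Street": "St", "Drive": "Dr"}


def label_for(tags):
    """Pick a map label: prefer abbr_name, else shorten only the last word."""
    if "abbr_name" in tags:
        return tags["abbr_name"]
    words = tags.get("name", "").split()
    # A one-word name is left alone so a bare "Avenue" stays readable.
    if len(words) > 1 and words[-1] in TYPE_ABBREVIATIONS:
        words[-1] = TYPE_ABBREVIATIONS[words[-1]]
    return " ".join(words)
```

Here "Lane Avenue" becomes "Lane Ave" rather than "Ln Ave", and a locally verified abbr_name always wins when it is present.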


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Nathan Edgars II

On 5/11/2012 9:45 AM, Anthony wrote:
> On Thu, May 10, 2012 at 11:45 PM, Dale Puch  wrote:
>> Clarity!  The abbreviations are just that, they mean the full word, and are
>> spoken that way, but written and displayed as the abbreviation.  I also
>> disagree: I have never known anyone that said "whatever A V E".  They do not
>> spell it out, they say the word the abbreviation stands for.  Same for St,
>> Dr etc.
>
> What are you disagreeing with?  I've known streets that were called
> "Whatever Ave" (Rhymes with "Whatever Have").  Not "Whatever Avenue".
> And certainly not "Whatever A V E".

Ave is a bad example, since it is common to say 'Ave' rather than
'Avenue'. But I can't think of any other suffixes that are commonly
abbreviated in speech. Maybe Cir?


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Minh Nguyen

On 2012-05-11 6:45 AM, Anthony wrote:

Not really.  Is "1515 South West Shore Boulevard, Tampa" abbreviated
"1515 S West Shore Blvd, Tampa", or is it abbreviated "1515 S W Shore
Blvd, Tampa"?  If you want the answer, ask usps.com.

The only way to capture the full information is to have additional
tags telling you what the base is.  And if you do that, abbreviating
or not abbreviating doesn't matter.


That's similar to how the tiger:* tags are structured, and it's the 
subject of a proposal on the wiki:


http://wiki.openstreetmap.org/wiki/Proposed_features/Directional_Prefix_%26_Suffix_Indication

--
Minh Nguyen 
Jabber: m...@1ec5.org; Blog: http://notes.1ec5.org/




___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Fri, May 11, 2012 at 9:45 AM, Anthony  wrote:
> The only way to capture the full information is to have additional
> tags telling you what the base is.  And if you do that, abbreviating
> or not abbreviating doesn't matter.

And if you want to avoid tremendous redundancy, the way to that is
with some sort of street relations.

Each way should contain information about the way, the whole way, and
*nothing but the way*.  Including base_name information in every
instance of the way fails 3NF.
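
To illustrate the normalization argument, here is a rough sketch in which
each way only points at a shared street record. The Street and Way classes
and their fields are hypothetical and are not OSM's actual relation model;
they only show where the name information would live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Street:
    """Name details stored once per street (hypothetical fields)."""
    name: str
    base_name: str


@dataclass
class Way:
    """A way carries only its own data plus a reference to its street."""
    way_id: int
    street_id: int


streets = {1: Street(name="West Shore Boulevard", base_name="West Shore")}
ways = [Way(101, 1), Way(102, 1), Way(103, 1)]


def name_of(way):
    # The name is looked up, not copied onto every way.
    return streets[way.street_id].name


print([name_of(w) for w in ways])
```

Correcting the name then means changing one record instead of every way
that makes up the street.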

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Ian Dees
On Fri, May 11, 2012 at 8:44 AM, Kristian M Zoerhoff wrote:

> On Fri, May 11, 2012 at 04:47:37AM -0400, Serge Wroclawski wrote:
> > I've added direction expansion into a new version, and thrown it up as a
> > gist:
> >
> > https://gist.github.com/2656735
> >
> >
> > I don't treat direction prefixes and suffixes any differently- I
> > haven't seen an example where there is both a prefix and a suffix in
> > the name, and they're the same as the suffix.
>
> You might want to check Minneapolis/St Paul. They have some really bizarre
> directional combinations that could give you heartburn.


I live/lived there and will definitely be checking on what this script
does. Serge is aware of such road name oddities because it's pretty weird
where he lives, too.
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Anthony
On Thu, May 10, 2012 at 11:45 PM, Dale Puch  wrote:
> Clarity!  The abbreviations are just that, they mean the full word, and are
> spoken that way, but written and displayed as the abbreviation.  I also
> disagree: I have never known anyone who said "whatever A V E".  They do not
> spell it out, they say the word the abbreviation stands for.  Same for St,
> Dr etc.

What are you disagreeing with?  I've known streets that were called
"Whatever Ave" (Rhymes with "Whatever Have").  Not "Whatever Avenue".
And certainly not "Whatever A V E".

> It is a LOT easier to abbreviate from the full word than to go the other
> way.

Not really.  Is "1515 South West Shore Boulevard, Tampa" abbreviated
"1515 S West Shore Blvd, Tampa", or is it abbreviated "1515 S W Shore
Blvd, Tampa"?  If you want the answer, ask usps.com.

The only way to capture the full information is to have additional
tags telling you what the base is.  And if you do that, abbreviating
or not abbreviating doesn't matter.
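
A small sketch of why the base has to be tagged separately. The "prefix",
"base" and "type" keys and the two lookup tables are hypothetical stand-ins
for whatever tagging scheme is chosen; the point is only that two different
breakdowns produce the same full name but different abbreviations.

```python
# Hypothetical lookup tables, kept tiny for illustration.
DIRECTION_ABBR = {"North": "N", "South": "S", "East": "E", "West": "W"}
TYPE_ABBR = {"Boulevard": "Blvd", "Avenue": "Ave", "Street": "St"}


def full_name(parts):
    """Join the components into the unabbreviated name."""
    pieces = [parts.get("prefix"), parts["base"], parts.get("type")]
    return " ".join(p for p in pieces if p)


def abbreviated_name(parts):
    """Abbreviate directionals and the street type, never the base."""
    prefix = " ".join(
        DIRECTION_ABBR.get(w, w) for w in parts.get("prefix", "").split()
    )
    street_type = TYPE_ABBR.get(parts.get("type"), parts.get("type"))
    pieces = [prefix, parts["base"], street_type]
    return " ".join(p for p in pieces if p)


reading_a = {"prefix": "South", "base": "West Shore", "type": "Boulevard"}
reading_b = {"prefix": "South West", "base": "Shore", "type": "Boulevard"}

print(full_name(reading_a), "|", abbreviated_name(reading_a))
print(full_name(reading_b), "|", abbreviated_name(reading_b))
```

Both readings give the same full name, so the full name alone cannot tell
you which abbreviation is right.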

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Kristian M Zoerhoff
On Fri, May 11, 2012 at 04:47:37AM -0400, Serge Wroclawski wrote:
> I've added direction expansion into a new version, and thrown it up as a gist:
> 
> https://gist.github.com/2656735
> 
> 
> I don't treat direction prefixes and suffixes any differently- I
> haven't seen an example where there is both a prefix and a suffix in
> the name, and they're the same as the suffix.

You might want to check Minneapolis/St Paul. They have some really bizarre 
directional combinations that could give you heartburn.
 
-- 

Kristian M Zoerhoff


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-11 Thread Serge Wroclawski
I've added direction expansion into a new version, and thrown it up as a gist:

https://gist.github.com/2656735


I don't treat direction prefixes and suffixes any differently- I
haven't seen an example where there is both a prefix and a suffix in
the name, and they're the same as the suffix.

The next version I'll make will collect and print out statistics, so
we'll be able to see how often it encounters these odd edge cases in
reality.

- Serge

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Toby Murray
On Thu, May 10, 2012 at 10:52 PM, Dale Puch  wrote:
> I think I came up with a rare possibility for error.
> The original "st something st" was manually expanded to "st something
> street".  You're checking for a single st, and there would be one.  Or am I
> missing another check?

It checks for one and ONLY one possible abbreviation to expand. If
there is more than one, it punts and ignores the way. This is a very
conservative approach which is probably good at least for a first
pass. Maybe if the first run goes well we can see how many problems
are left and look at refining things for a second pass to catch more
difficult ones. Or not...
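
A minimal sketch of the "one and only one" rule as described here. The
suffix table and function name are assumptions for illustration; this is
not the code from the script under discussion.

```python
# Hypothetical lookup table; the script's own table is not shown in this
# thread.
SUFFIXES = {"St": "Street", "Ave": "Avenue", "Dr": "Drive", "Rd": "Road"}


def expand_if_unambiguous(name):
    """Expand a name only when exactly one word is a known abbreviation.

    Returns the expanded name, or None when the way should be skipped.
    """
    words = name.split()
    hits = [i for i, word in enumerate(words) if word in SUFFIXES]
    if len(hits) != 1:
        return None  # zero or several candidates: punt
    words[hits[0]] = SUFFIXES[words[hits[0]]]
    return " ".join(words)


print(expand_if_unambiguous("Main St"))
print(expand_if_unambiguous("St Something St"))
```

Taken by itself, this sketch would still turn a hand-edited "St Something
Street" into "Street Something Street", since only one candidate remains,
so a position check or a comparison against the original TIGER name would
be needed on top of it.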

Toby

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
I think I came up with a rare possibility for error.
The original "st something st" was manually expanded to "st something
street".  You're checking for a single st, and there would be one.  Or am I
missing another check?  I can't think of any other situations besides Saint
and Street like this.  Possibly check that it is at the end of the name or
only followed by N, S, E etc.

Or just save those for a second pass at expanding either by another script
or by hand.

On Thu, May 10, 2012 at 4:08 PM, Serge Wroclawski  wrote:

> I've been testing a script to do this.
>
> Here it is:
>
> http://www.emacsen.net/tiger.py
>
> It needs to be fed a file. I've been using the state files from geofabrik.
>
> the resulting files in expansions can then be fed to a script for upload.
>
> I welcome feedback on the script and the resulting output.
>
> - Serge
>



-- 
Dale Puch
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
The issue with abbreviations is very muddy.  BUT it has been said many times
that we want to avoid abbreviations where possible.  There are several
reasons.

   - Clarity!  The abbreviations are just that, they mean the full word,
   and are spoken that way, but written and displayed as the abbreviation.  I
   also disagree: I have never known anyone who said "whatever A V E".  They
   do not spell it out, they say the word the abbreviation stands for.  Same
   for St, Dr etc.
   - It is a LOT easier to abbreviate from the full word than to go the
   other way.  Otherwise this scripting expansion thing would be easy and
   error free.
   - As mentioned it makes use of the data easier, especially for
   searching, and text to speech.

Yes, there can be errors when going from abbreviations to the full words.
One reason for doing this, as said, is to do it once with review instead of
in every program that uses the data.  But those errors are small in
comparison to the number of abbreviated way names and can be corrected later
as found, just like any other tagging error.  Most of those errors are going
to be on names that are unclear to begin with.

People have gotten so gun-shy of any automation or imports that I feel they
are actively blocking people trying to do the right thing and a good job.
It is almost at the point where I wonder why someone would go through
talking it over on the list, only to get grief for it, when they could just
quietly start doing it.  Obviously that is not what we want.

On Thu, May 10, 2012 at 10:56 PM, Anthony  wrote:

> On Thu, May 10, 2012 at 10:45 PM, Mike N  wrote:
> >  But you wouldn't be confused if a stranger came in asking how to get to
> > "Whatever Avenue"?  If not, then there's no problem with the expansion.
>
> Okay, so basically we're ignoring the on-the-ground rule in order to
> map for the renderer.
>



-- 
Dale Puch
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Anthony
On Thu, May 10, 2012 at 10:45 PM, Mike N  wrote:
>  But you wouldn't be confused if a stranger came in asking how to get to
> "Whatever Avenue"?  If not, then there's no problem with the expansion.

Okay, so basically we're ignoring the on-the-ground rule in order to
map for the renderer.

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N

On 5/10/2012 10:40 PM, Anthony wrote:

Depends on what street you're talking about.  I've certainly lived in
places where the vast majority of the locals called it "Whatever Ave",
and not "Whatever Avenue".  Most of the US...wouldn't talk about the
street at all.


  But you wouldn't be confused if a stranger came in asking how to get
to "Whatever Avenue"?  If not, then there's no problem with the
expansion.   Presumably, a US-centric renderer would abbreviate names 
for display, while spoken directions would be no more confusing than 
this stranger.




___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Anthony
On Thu, May 10, 2012 at 10:25 PM, Mike N  wrote:
> On 5/10/2012 10:19 PM, Anthony wrote:
>>
>> What I'm questioning is why it doesn't apply.  If the people call it
>> "Whatever Ave", shouldn't the data read "Whatever Ave"?
>
>
>  Most of the US wouldn't call it 'Whatever Ave'; when spoken, it would be
> 'Avenue'.  Having it expanded makes programs with spoken directions much
> more accurate.

Depends on what street you're talking about.  I've certainly lived in
places where the vast majority of the locals called it "Whatever Ave",
and not "Whatever Avenue".  Most of the US...wouldn't talk about the
street at all.

>  The only question is what to do about those cases where it's only referred
> to locally as 'Ave', and the postal service would refuse letters addressed
> to 'Avenue'.

The postal service would refuse letters addressed to "Avenue" in some instances?

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N

On 5/10/2012 10:19 PM, Anthony wrote:

What I'm questioning is why it doesn't apply.  If the people call it
"Whatever Ave", shouldn't the data read "Whatever Ave"?


 Most of the US wouldn't call it 'Whatever Ave'; when spoken, it would 
be 'Avenue'.  Having it expanded makes programs with spoken directions 
much more accurate.


  The only question is what to do about those cases where it's only 
referred to locally as 'Ave', and the postal service would refuse 
letters addressed to 'Avenue'.


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Anthony
On Thu, May 10, 2012 at 9:52 PM, Mike N  wrote:
> On 5/10/2012 9:48 PM, Anthony wrote:
>>
>> You seem to be assuming all the changes are positive.
>
>  I didn't take it that way - it was just a quick test for orders of
> magnitude.

Followed by a comment of "Anyone arguing against scripting these
changes should spend a day or two trying to do that by hand and get
back to us how they feel afterwards."

>> What happened to the "on the ground" rule, anyway?
>
>  That already doesn't directly apply because most street signs are
> abbreviated to start with.

What I'm questioning is why it doesn't apply.  If the people call it
"Whatever Ave", shouldn't the data read "Whatever Ave"?

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N

On 5/10/2012 9:48 PM, Anthony wrote:

You seem to be assuming all the changes are positive.


  I didn't take it that way - it was just a quick test for orders of 
magnitude.   An actual script takes more review.



What happened to the "on the ground" rule, anyway?


  That already doesn't directly apply because most street signs are 
abbreviated to start with.   Local and regional knowledge will be 
helpful though.


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Anthony
On Thu, May 10, 2012 at 3:28 PM, Dale Puch  wrote:
> As a quick and dirty test I took Florida and Illinois road data from
> CloudMade.  A simple replace of the top 7 or so suffixes, matched at the end
> of the name with a space in front, resulted in over 700,000 name changes
> for those 2 states alone, and that did not include all the names with
> cardinals (prefix and suffix) that need expanding.  It was well over 80% of
> the names.  Anyone arguing against scripting these changes should spend a
> day or two trying to do that by hand and get back to us how they feel
> afterwards.

You seem to be assuming all the changes are positive.

What happened to the "on the ground" rule, anyway?

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
Sorry, I should have been clearer, the results I posted were from my quick
test.  I just wanted to report the abbreviations I saw as possible
additions to the list in Serge's script.  And to give an idea of which
showed up most either for scripting or if someone wanted to handle the
lesser used ones manually.

On Thu, May 10, 2012 at 6:25 PM, Serge Wroclawski  wrote:

> On Thu, May 10, 2012 at 6:08 PM, Mike N  wrote:
> > On 5/10/2012 4:08 PM, Serge Wroclawski wrote:
> >>
> >> I've been testing a script to do this.
> >>
> >> Here it is:
> >
> >
> >  Thanks for posting it.   I don't see where it expands directionals; I
> > don't see the same thing Dale saw:
> >  "5141 instances of E changed to East" for example.
>
> It doesn't expand directionals, and it only touches ways with tiger
> tags, so I suspect Dale is looking at the wrong code, or the wrong
> output, or something else.
>
> >  If it doesn't expand directionals, I believe that it should where the
> > TIGER hint is available "tiger:name_direction_prefix".   Otherwise we'll
> > still end up with endless nagging over
>
> Okay, let's talk about this then. It was originally outside the scope
> of our discussion, but I'm happy to add it. It won't be more than
> another few lines of code and another lookup table.
>
> >   Warning - abbreviation in 'E Pond Scum Street'
> >
> >  when uploading.
>
> Please do not upload the output!
>
> 1. We should all agree on the correct output.
> 2. We should organize the right account to upload with.
> 3. We should add tags to the uploaded changesets.
>
> - Serge
>



-- 
Dale Puch
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Serge Wroclawski
On Thu, May 10, 2012 at 6:08 PM, Mike N  wrote:
> On 5/10/2012 4:08 PM, Serge Wroclawski wrote:
>>
>> I've been testing a script to do this.
>>
>> Here it is:
>
>
>  Thanks for posting it.   I don't see where it expands directionals; I don't
> see the same thing Dale saw:
>  "5141 instances of E changed to East" for example.

It doesn't expand directionals, and it only touches ways with tiger
tags, so I suspect Dale is looking at the wrong code, or the wrong
output, or something else.

>  If it doesn't expand directionals, I believe that it should where the TIGER
> hint is available "tiger:name_direction_prefix".   Otherwise we'll still end
> up with endless nagging over

Okay, let's talk about this then. It was originally outside the scope
of our discussion, but I'm happy to add it. It won't be more than
another few lines of code and another lookup table.

>   Warning - abbreviation in 'E Pond Scum Street'
>
>  when uploading.

Please do not upload the output!

1. We should all agree on the correct output.
2. We should organize the right account to upload with.
3. We should add tags to the uploaded changesets.

- Serge

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Ian Dees
On Thu, May 10, 2012 at 5:08 PM, Mike N  wrote:

> On 5/10/2012 4:08 PM, Serge Wroclawski wrote:
>
>> I've been testing a script to do this.
>>
>> Here it is:
>>
>
>  Thanks for posting it.   I don't see where it expands directionals; I
> don't see the same thing Dale saw:
>  "5141 instances of E changed to East" for example.
>
>  If it doesn't expand directionals, I believe that it should where the
> TIGER hint is available "tiger:name_direction_prefix".   Otherwise we'll
> still end up with endless nagging over
>
>   Warning - abbreviation in 'E Pond Scum Street'
>
>  when uploading.
>

This should probably happen at the same time -- maybe Serge wouldn't mind
letting us put this in GitHub so we can collaborate on fixes like this?
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Nathan Edgars II

On 5/10/2012 4:08 PM, Serge Wroclawski wrote:

I've been testing a script to do this.

Here it is:

http://www.emacsen.net/tiger.py

It needs to be fed a file. I've been using the state files from geofabrik.

the resulting files in expansions can then be fed to a script for upload.

I welcome feedback on the script and the resulting output.


You could try running it on Orange County, FL, where I've expanded 
suffixes with a combination of Overpass API, JOSM, and Textpad. Any 
changes are either prefixes, ones I missed, or false positives.


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Mike N

On 5/10/2012 4:08 PM, Serge Wroclawski wrote:

I've been testing a script to do this.

Here it is:


 Thanks for posting it.   I don't see where it expands directionals; I 
don't see the same thing Dale saw:

 "5141 instances of E changed to East" for example.

  If it doesn't expand directionals, I believe that it should where the 
TIGER hint is available "tiger:name_direction_prefix".   Otherwise we'll 
still end up with endless nagging over


   Warning - abbreviation in 'E Pond Scum Street'

  when uploading.
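
A rough sketch of how the TIGER hint could drive the expansion, assuming
tags arrive as a plain dict. The lookup table and function name are
hypothetical and not taken from the posted script; only the
tiger:name_direction_prefix key comes from the TIGER import.

```python
# Hypothetical lookup table for directional abbreviations.
DIRECTIONS = {
    "N": "North", "S": "South", "E": "East", "W": "West",
    "NE": "Northeast", "NW": "Northwest",
    "SE": "Southeast", "SW": "Southwest",
}


def expand_direction_prefix(tags):
    """Expand a leading directional only when the TIGER tag confirms it."""
    name = tags.get("name", "")
    prefix = tags.get("tiger:name_direction_prefix")
    # Require both a known directional in the tag and the same token at
    # the start of the name, so a street literally named "E Street" with
    # no TIGER hint is left alone.
    if prefix in DIRECTIONS and name.startswith(prefix + " "):
        return DIRECTIONS[prefix] + name[len(prefix):]
    return name


print(expand_direction_prefix({
    "name": "E Pond Scum Street",
    "tiger:name_direction_prefix": "E",
}))
```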


___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
I am not that proficient with python, so if I misunderstand any code...

For reference here is the tally of what I got.  This was all ways, not just
tiger.

Florida
> 
> 138 instances of Aly changed to Alley
> 54091 instances of Ave changed to Avenue
> 11671 instances of Blvd changed to Boulevard
> 22 instances of Byp changed to Bypass
> 176 instances of Cswy changed to Causeway
> 16 instances of Ctr changed to Center
> 14300 instances of Cir changed to Circle
> 18 instances of Crt changed to Court
> 40543 instances of Ct changed to Court
> 773 instances of Cv changed to Cove
> 93 instances of Cres changed to Crescent
> 92 instances of Xing changed to Crossing
> 58322 instances of Dr changed to Drive
> 5141 instances of E changed to East
> 78 instances of Expy changed to Expressway
> 10 instances of Fwy changed to Freeway
> 137 instances of Gln changed to Glen
> 1268 instances of Hwy changed to Highway
> 36587 instances of Ln changed to Lane
> 13 instances of Lp changed to Loop
> 10 instances of Mnr changed to Manor
> 7975 instances of N changed to North
> 2475 instances of NE changed to Northeast
> 2102 instances of NW changed to Northwest
> 12 instances of Pk changed to Park
> 36 instances of Pkwy changed to Parkway
> 1493 instances of Pky changed to Parkway
> 2 instances of Pwy changed to Parkway
> 22 instances of Ps changed to Pass
> 16590 instances of Pl changed to Place
> 138 instances of Plz changed to Plaza
> 3 instances of Pnt changed to Point
> 6 instances of Pt changed to Point
> 51083 instances of Rd changed to Road
> 6 instances of Rte changed to Route
> 4801 instances of S changed to South
> 1900 instances of SE changed to Southeast
> 2456 instances of SW changed to Southwest
> 183 instances of Sq changed to Square
> 69116 instances of St changed to Street
> 15714 instances of Ter changed to Terrace
> 14 instances of Terr changed to Terrace
> 1 instances of Thfr changed to Thoroughfare
> 399 instances of Tr changed to Track
> 16 instances of Trk changed to Track
> 4227 instances of Trl changed to Trail
> 6 instances of Tpke changed to Turnpike
> 2 instances of Vlls changed to Villas
> 4849 instances of W changed to West
>

Illinois
> 
> 120 instances of Aly changed to Alley
> 25 instances of Av changed to Avenue
> 40563 instances of Ave changed to Avenue
> 2143 instances of Blvd changed to Boulevard
> 12 instances of Byp changed to Bypass
> 52 instances of Ctr changed to Center
> 3316 instances of Cir changed to Circle
> 6 instances of Cl changed to Close
> 28478 instances of Ct changed to Court
> 164 instances of Cv changed to Cove
> 16 instances of Cres changed to Crescent
> 3 instances of Crst changed to Crest
> 93 instances of Xing changed to Crossing
> 47796 instances of Dr changed to Drive
> 5812 instances of E changed to East
> 85 instances of Expy changed to Expressway
> 1 instances of Gdns changed to Gardens
> 1 instances of Grn changed to Green
> 1 instances of Hd changed to Head
> 3 instances of Hgts changed to Heights
> 13 instances of Hts changed to Heights
> 304 instances of Hwy changed to Highway
> 5 instances of Hls changed to Hills
> 1 instances of Intl changed to International
> 28315 instances of Ln changed to Lane
> 17 instances of Mnr changed to Manor
> 1 instances of Mdws changed to Meadows
> 6 instances of Mtwy changed to Motorway
> 6046 instances of N changed to North
> 39 instances of NE changed to Northeast
> 42 instances of NW changed to Northwest
> 10 instances of Pk changed to Park
> 109 instances of Pkwy changed to Parkway
> 775 instances of Pky changed to Parkway
> 6882 instances of Pl changed to Place
> 114 instances of Plz changed to Plaza
> 48777 instances of Rd changed to Road
> 1 instances of Rdwy changed to Roadway
> 13 instances of Rte changed to Route
> 972 instances of S changed to South
> 77 instances of SE changed to Southeast
> 112 instances of SW changed to Southwest
> 211 instances of Sq changed to Square
> 79813 instances of St changed to Street
> 1030 instances of Ter changed to Terrace
> 5 instances of Terr changed to Terrace
> 75 instances of Tr changed to Track
> 2133 instances of Trl changed to Trail
> 1460 instances of W changed to West
>




On Thu, May 10, 2012 at 4:08 PM, Serge Wroclawski  wrote:

> I've been testing a script to do this.
>
> Here it is:
>
> http://www.emacsen.net/tiger.py
>
> It needs to be fed a file. I've been using the state files from geofabrik.
>
> the resulting files in expansions can then be fed to a script for upload.
>
> I welcome feedback on the script and the resulting output.
>
> - Serge
>



-- 
Dale Puch
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Serge Wroclawski
I've been testing a script to do this.

Here it is:

http://www.emacsen.net/tiger.py

It needs to be fed a file. I've been using the state files from Geofabrik.

The resulting files in expansions can then be fed to a script for upload.

I welcome feedback on the script and the resulting output.

- Serge

___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread Dale Puch
As a quick and dirty test I took Florida and Illinois road data from
CloudMade.  A simple replace of the top 7 or so suffixes, matched at the end
of the name with a space in front, resulted in over 700,000 name changes
for those 2 states alone, and that did not include all the names with
cardinals (prefix and suffix) that need expanding.  It was well over 80% of
the names.  Anyone arguing against scripting these changes should spend a
day or two trying to do that by hand and get back to us how they feel
afterwards.
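
For anyone who wants to repeat this kind of quick test, a sketch along these
lines would do. The seven suffixes are an assumption picked from the largest
counts in the tally posted in this thread, since the exact list used is not
stated, and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumed "top 7" suffixes; not necessarily the ones used in the test.
TOP_SUFFIXES = {
    "St": "Street", "Dr": "Drive", "Ave": "Avenue", "Rd": "Road",
    "Ct": "Court", "Ln": "Lane", "Pl": "Place",
}

# Match a suffix only at the very end of the name, preceded by a space.
PATTERN = re.compile(r" (" + "|".join(TOP_SUFFIXES) + r")$")


def expand_trailing_suffix(name, tally):
    """Expand a trailing suffix and count it; return the name otherwise."""
    match = PATTERN.search(name)
    if not match:
        return name
    abbr = match.group(1)
    tally[abbr] += 1
    return name[:match.start(1)] + TOP_SUFFIXES[abbr]


tally = Counter()
for street in ["Main St", "St Johns Ave", "Dr", "Main Street"]:
    print(expand_trailing_suffix(street, tally))
print(tally)
```

Because the pattern is anchored to the end of the name, a leading "St" or
"Dr" is never touched, but names ending in a directional are skipped
entirely.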



On Thu, May 10, 2012 at 2:09 PM, stevea  wrote:

> I support this methodology in the sense of it being "Vet, then set." (Vet
> being a verb which my dictionary says means "make a careful and critical
> examination of something.")
>
> Sure, saying "reasonably simple grep search and replace" is a bit vague,
> but I'm not talking about the specifics of this, or any one particular,
> search, just that doing it to an offline copy and then vetting the results
> (having our community "discuss, agree, disagree, improve and finalize")
> sounds like more of the sort of "community consensus workflow steps" that I
> know are going to produce both harmony and great results.
>
> THEN upload (set).
>
> Does this mean I suggest precluding individual edit contributions that
> have not been more-widely vetted?  Of course not:  we do this all the time.
>  But as individuals, we just do it on the small scale. It is when we do it
> on the large scale (as in massive TIGER search and replaces) that I'm
> saying "Vet, then set" should be done.
>
> This project, its data, and its interaction amongst us as individual
> contributors in achieving harmonious consensus can only get better. We do a
> fair-to-good job now, let's make that "largely a great job" more so in the
> future.
>
> SteveA
> California
>
>
>
>  The error rate is directly related to how much testing and review is
>> done.  1/1,000 is by no means a set error rate for either manual or bot
>> edits.
>>
>> Reasonably simple grep search and replace will correctly expand the
>> example.  The default should and can be to not expand unless it meets
>> specific requirements.
>> Dr is only expanded to Drive if it is at the end of the name, or second
>> to last and followed by a cardinal direction (S, E, W, N etc.), but left
>> alone (or set to Doctor) if nothing is in front of it.  Let the bot get the
>> easy stuff, and then report on the unknowns for manual edits.
>>
>> Run the grep on a copy of the DB, and do reports on the changes. Review
>> just the changed street names before and after for quality control.  Let
>> others review it as well.  Once it is ironed out make the changes in the
>> live DB.  I would guess the error rate after that would be well over
>> 1/1,000,000.
>>
>> Either way you can get an idea about the edits without doing anything to
>> the live database.
>> Dale Puch
>>
>
>


-- 
Dale Puch
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-10 Thread stevea
I support this methodology in the sense of it being "Vet, then set." 
(Vet being a verb which my dictionary says means "make a careful and 
critical examination of something.")


Sure, saying "reasonably simple grep search and replace" is a bit 
vague, but I'm not talking about the specifics of this, or any one 
particular, search, just that doing it to an offline copy and then 
vetting the results (having our community "discuss, agree, disagree, 
improve and finalize") sounds like more of the sort of "community 
consensus workflow steps" that I know are going to produce both 
harmony and great results.


THEN upload (set).

Does this mean I suggest precluding individual edit contributions 
that have not been more-widely vetted?  Of course not:  we do this 
all the time.  But as individuals, we just do it on the small scale. 
It is when we do it on the large scale (as in massive TIGER search 
and replaces) that I'm saying "Vet, then set" should be done.


This project, its data, and its interaction amongst us as individual 
contributors in achieving harmonious consensus can only get better. 
We do a fair-to-good job now, let's make that "largely a great job" 
more so in the future.


SteveA
California



The error rate is directly related to how much testing and review is 
done.  1/1,000 is by no means a set error rate for either manual or 
bot edits.


Reasonably simple grep search and replace will correctly expand the 
example.  The default should and can be to not expand unless it 
meets specific requirements.
Dr is only expanded to Drive if it is at the end of the name, or
second to last and followed by a cardinal direction (S, E, W, N etc.),
but left alone (or set to Doctor) if nothing is in front of it.  Let
the bot get the easy stuff, and then report on the unknowns for
manual edits.


Run the grep on a copy of the DB, and do reports on the changes. 
Review just the changed street names before and after for quality 
control.  Let others review it as well.  Once it is ironed out make 
the changes in the live DB.  I would guess the error rate after that 
would be well over 1/1,000,000.


Either way you can get an idea about the edits without doing 
anything to the live database.

Dale Puch



___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-09 Thread Dale Puch
The error rate is directly related to how much testing and review is done.
1/1,000 is by no means a set error rate for either manual or bot edits.

Reasonably simple grep search and replace will correctly expand the
example.  The default should and can be to not expand unless it meets
specific requirements.
Dr is only expanded to Drive if it is at the end of the name, or second to
last and followed by a cardinal direction (S, E, W, N etc.), but left alone
(or set to Doctor) if nothing is in front of it.  Let the bot get the easy
stuff, and then report on the unknowns for manual edits.
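
The rule for Dr can be sketched as follows. This is only an illustration of
the positional check, written in Python rather than grep; the set of
cardinals and the function name are assumptions.

```python
# Cardinal and intercardinal abbreviations that may follow the suffix.
CARDINALS = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}


def expand_dr(name):
    """Expand Dr to Drive only in suffix position; leave a leading Dr."""
    words = name.split()
    last = len(words) - 1
    for i, word in enumerate(words):
        if word != "Dr":
            continue
        at_end = i == last
        before_cardinal = i == last - 1 and words[last] in CARDINALS
        # i > 0 means something is in front of it, so it cannot be Doctor.
        if i > 0 and (at_end or before_cardinal):
            words[i] = "Drive"
    return " ".join(words)


for street in ["Sunset Dr", "Sunset Dr N", "Dr Martin Luther King Blvd"]:
    print(expand_dr(street))
```

Anything the function returns unchanged but that still contains Dr would go
on the report for manual edits.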

Run the grep on a copy of the DB, and produce reports on the changes.  Review
just the changed street names before and after for quality control.  Let
others review it as well.  Once it is ironed out, make the changes in the
live DB.  I would guess the error rate after that would be better than
1 in 1,000,000.
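
A dry run of that kind might look something like the sketch below. It assumes the names have already been pulled out of a copy of the database into a plain list; `toy_expand` is a hypothetical stand-in for whatever expansion logic is actually under review, and nothing here writes to any database.

```python
def dry_run(names, expand):
    """Apply `expand` to each name and collect (before, after) pairs that differ."""
    changed = []
    for old in names:
        new = expand(old)
        if new != old:
            changed.append((old, new))
    return changed

def toy_expand(name):
    # Hypothetical stand-in: expands only a trailing 'St' that has a word before it.
    words = name.split()
    if len(words) > 1 and words[-1] == "St":
        words[-1] = "Street"
    return " ".join(words)

sample = ["Main St", "St Marks Pl", "Oak Avenue"]
report = dry_run(sample, toy_expand)
for old, new in report:
    print(f"{old!r} -> {new!r}")
```

Only the changed names end up in the report, so reviewers read the before/after pairs rather than the whole dataset.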

Either way you can get an idea about the edits without doing anything to
the live database.

On Tue, May 8, 2012 at 11:34 PM, Anthony  wrote:

> On Tue, May 8, 2012 at 11:31 PM, Anthony  wrote:
> > "Doctor Martin Luther King Bolevard" is one thing.  "Drive Martin Luther King
> > Boulevard" is another.
>
> And if we're going to make so many mistakes (1/1000 means thousands of
> mistakes), I'd rather it just be left as "Dr Martin Luther King Blvd".
>
> Yes, we can't stop people from making mistakes.  But we can refuse to
> allow thousands of mistakes to be added, for the sake of removing
> abbreviations which aren't hurting anyone in the first place.
>



-- 
Dale Puch


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-08 Thread Anthony
On Tue, May 8, 2012 at 11:31 PM, Anthony  wrote:
> "Doctor Martin Luther King Bolevard" is one thing.  "Drive Martin Luther King
> Boulevard" is another.

And if we're going to make so many mistakes (1/1000 means thousands of
mistakes), I'd rather it just be left as "Dr Martin Luther King Blvd".

Yes, we can't stop people from making mistakes.  But we can refuse to
allow thousands of mistakes to be added, for the sake of removing
abbreviations which aren't hurting anyone in the first place.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-08 Thread Anthony
On Wed, May 2, 2012 at 12:01 PM, Serge Wroclawski  wrote:
> 2) My human error rate estimation of 1/1000 seems entirely reasonable.
> Think typos, or misreading. I'm sure we see error rates that high now
> in OSM and we find them acceptable. A computer that's acting
> conservatively will actually produce far lower error rates!

But its error rates are potentially much more annoying.  "Doctor
Martin Luther King Bolevard" is one thing.  "Drive Martin Luther King
Boulevard" is another.  The latter will be much more difficult to find
at a later date and flag for review.

Maybe you won't make that particular error, but you're going to have
to be really careful to avoid making any errors like it.  And relying
on the TIGER tags may or may not help.  I wouldn't be surprised if
many of the TIGER tags themselves are screwed up, based on the kinds
of mistakes I've seen in TIGER data.

Anthony



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-08 Thread Anthony
On Wed, May 2, 2012 at 7:28 AM, Mike N  wrote:
> On 5/1/2012 11:49 PM, Anthony wrote:
>>>
>>>  That assumes that the TIGER tags will always be present to assist with
>>>  proper automatic expansion.
>>
>> I'm not sure what you mean, because I am not making that assumption at
>> all.
>
>
>  You mentioned use of the history to access the TIGER tags.

Yes, I said if a bot is smart enough to go through the history tags,
then this *does* provide an advantage over data consumers doing it
themselves.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Toby Murray
On Wed, May 2, 2012 at 10:08 AM, Chris Lawrence  wrote:
> ISTM this might be a good "mechanical turk" application if there is
> genuine concern that there will be a substantial error rate (my
> point-of-view as a social scientist is that a hypothesized 1/1000
> error rate is pretty darn low, but I can appreciate that some might
> have more exacting standards), either implemented on the web or as a
> JOSM plugin.  Anything beats tedious, manual ad-hoc editing whenever
> there's a slight geometry change with the accompanying JOSM nags.

Well to put a scope on things, Ian and I were playing around with some
queries on IRC yesterday. Here is the result. This is the breakdown of
the values for the tiger:name_type tag for all version 1 ways. So most
of them will have last been touched by the original TIGER upload
although way splitting means there will be a few in here from other
users and may already be un-abbreviated. But I expect that to be a
pretty small fraction.

http://pastebin.com/LWyejSMr

Obviously the values at the bottom are all silly. Probably the result
of merging ways. This isn't going to help anyone and definitely needs
manual cleanup: Ln; Rd; Dr; Dr

But the top few values are obviously what we would want to focus on
with this bot.
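
A breakdown like the one in the pastebin could be produced along these lines. This is a sketch over an in-memory list of tag dictionaries, not the actual query run on IRC; the sample data is made up for illustration.

```python
from collections import Counter

def name_type_breakdown(ways):
    """Count tiger:name_type values, keeping merged (semicolon) values separate."""
    clean, merged = Counter(), Counter()
    for tags in ways:
        value = tags.get("tiger:name_type")
        if value is None:
            continue
        if ";" in value:
            merged[value] += 1   # joined ways: needs manual cleanup
        else:
            clean[value] += 1    # candidate for automated expansion
    return clean, merged

# Made-up sample data.
ways = [
    {"tiger:name_type": "St"},
    {"tiger:name_type": "St"},
    {"tiger:name_type": "Rd"},
    {"tiger:name_type": "Ln; Rd; Dr; Dr"},
    {"highway": "residential"},
]
clean, merged = name_type_breakdown(ways)
print(clean.most_common())
print(list(merged))
```

Keeping the semicolon values in a separate counter makes it easy to hand them to people while the bot only sees the clean ones.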

Toby



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Serge Wroclawski
On Wed, May 2, 2012 at 11:08 AM, Chris Lawrence  wrote:
> ISTM this might be a good "mechanical turk" application if there is
> genuine concern that there will be a substantial error rate (my
> point-of-view as a social scientist is that a hypothesized 1/1000
> error rate is pretty darn low, but I can appreciate that some might
> have more exacting standards), either implemented on the web or as a
> JOSM plugin.

I'm already working on a revised script, but:

1) We're not talking about a small number of ways; we're talking about
over a million ways. If we assume it takes 20 seconds per way to
correct (which I think is actually low when you add in factors like
upload times), then it will take over five and a half thousand man hours.
This would be a very large undertaking.

2) My human error rate estimation of 1/1000 seems entirely reasonable.
Think typos, or misreading. I'm sure we see error rates that high now
in OSM and we find them acceptable. A computer that's acting
conservatively will actually produce far lower error rates!

3) I'm seeing very little resistance to the idea of an expansion
script on this list. There's pretty much universal support for
expansions, especially since it's half done already. The concern seems
to be about the script and error rates. We can (and should) test that.
I suspect we'll find very low error rates, and we can correct the
errors, either in the script or, if they're one-offs, in a post-script
process.
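
The figures in points 1 and 2 can be checked with a few lines of arithmetic. The sketch takes "over a million ways" as exactly one million, which is a simplifying assumption.

```python
ways = 1_000_000        # "over a million ways", rounded down to one million
seconds_per_way = 20    # the per-way estimate from point 1

hours = ways * seconds_per_way / 3600
expected_errors = ways // 1000   # a 1/1000 error rate, as in point 2

print(round(hours))        # about five and a half thousand hours
print(expected_errors)     # mistakes expected at 1 in 1000
```

So the same million ways that would cost roughly 5,556 hours by hand would also carry about 1,000 mistakes at the 1/1000 rate, which is the number the rest of the thread argues about.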

- Serge



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Chris Lawrence
ISTM this might be a good "mechanical turk" application if there is
genuine concern that there will be a substantial error rate (my
point-of-view as a social scientist is that a hypothesized 1/1000
error rate is pretty darn low, but I can appreciate that some might
have more exacting standards), either implemented on the web or as a
JOSM plugin.  Anything beats tedious, manual ad-hoc editing whenever
there's a slight geometry change with the accompanying JOSM nags.


Chris



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Mike N

On 5/2/2012 4:21 AM, Richard Finegold wrote:
> balrog-kun's bot didn't limit itself to highway=* ways; some buildings
> were changed from "E" to "East", "N" to "North", for example. I don't
> know if that bug remains.

 It has long been fixed, from my reading of the script.   Of course 
others may review it before it is applied again.




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Mike N

On 5/1/2012 11:49 PM, Anthony wrote:
>> That assumes that the TIGER tags will always be present to assist with
>> proper automatic expansion.
>
> I'm not sure what you mean, because I am not making that assumption at all.

  You mentioned use of the history to access the TIGER tags.   If that 
was your assumption, it is also not a good solution.  It is an arbitrary 
barrier to data consumers.  They already need a 128GB RAM, 
multi-terabyte SSD array to create a tile server.  Now they need to add 
to that setup a multi-terabyte Planet-History database store and processing?




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-02 Thread Richard Finegold
On Tue, May 1, 2012 at 9:14 AM, Mike N  wrote:
>  I think the one that Andrzej / balrog-kun was running makes the best use of
> hints in the TIGER data.
>
> https://trac.openstreetmap.org/browser/applications/utils/import/tiger2osm/expand/expand.py
>
>  It's notable that even this script requires some significant manual labor /
> review associated with each batch processed.

balrog-kun's bot didn't limit itself to highway=* ways; some buildings
were changed from "E" to "East", "N" to "North", for example. I don't
know if that bug remains.
http://www.openstreetmap.org/browse/way/44797519/history
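
A guard against that kind of bug could be as small as the sketch below. It is a hypothetical illustration, not the check in expand.py: it simply refuses to touch anything that is not a named highway=* way.

```python
def is_street_candidate(tags):
    """Only named ways tagged highway=* are eligible for name expansion."""
    return "highway" in tags and "name" in tags

# A building named "E" must not be rewritten to "East".
print(is_street_candidate({"building": "yes", "name": "E"}))
print(is_street_candidate({"highway": "residential", "name": "Main St E"}))
```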



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:41 PM, Ian Dees  wrote:
> On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II 
> wrote:
>>
>> The TIGER tags are not exactly standard OSM tags that belong in the
>> database. Better that we get rid of them at the same time as we expand
>> abbreviations.
>
> Although the tiger:* keys aren't standard, the information they store is
> very useful. There are plenty of people that might want to know the
> different parts of a road name, so we should simply rename these tags
> instead of completely blowing the data away.

I guess that's okay too, though personally I get so annoyed by the
redundant data (*) that I couldn't be bothered.  Why street relations
never caught on is beyond me.

(*) I.E. adding base_name=Main to the 100 different ways that Main
Street is split up into.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:36 PM, Mike N  wrote:
> On 5/1/2012 1:21 PM, Anthony wrote:
>
>> The preprocessing step between downloading the data from OSM and doing
>> something with it.
>
>  That assumes that the TIGER tags will always be present to assist with
> proper automatic expansion.

I'm not sure what you mean, because I am not making that assumption at all.

>  And I'd rather have the US data in line with the world-wide OSM data where
> it makes sense.   That way the US can consume OSM US data with tools
> developed worldwide, without the tool writers needing to implement
> US-specific rules.
>
>  After analysis, most of the US opinions fall on the side of no
> abbreviations.

I don't think anyone in this thread is arguing against expanding
abbreviations.  The question is whether or not it's okay for a bot to
expand abbreviations.  And to a large extent that depends on how
accurate the bot will be.

If the bot is sure to be 100% accurate, then hey, no problem.  But I
don't believe that is the case.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Gregory Arenius
On Tue, May 1, 2012 at 10:41 AM, Ian Dees  wrote:

> On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II wrote:
>>
>> The TIGER tags are not exactly standard OSM tags that belong in the
>> database. Better that we get rid of them at the same time as we expand
>> abbreviations.
>
>
> Although the tiger:* keys aren't standard, the information they store is
> very useful. There are plenty of people that might want to know the
> different parts of a road name, so we should simply rename these tags
> instead of completely blowing the data away.
>
>
I agree with this completely.  The name tag should have the full street
name but having the name base and street type in there as well is great.




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Ian Dees
On Tue, May 1, 2012 at 12:26 PM, Nathan Edgars II wrote:
>
> The TIGER tags are not exactly standard OSM tags that belong in the
> database. Better that we get rid of them at the same time as we expand
> abbreviations.


Although the tiger:* keys aren't standard, the information they store is
very useful. There are plenty of people that might want to know the
different parts of a road name, so we should simply rename these tags
instead of completely blowing the data away.


Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Mike N

On 5/1/2012 1:21 PM, Anthony wrote:

> The preprocessing step between downloading the data from OSM and doing
> something with it.


  That assumes that the TIGER tags will always be present to assist 
with proper automatic expansion.


  And I'd rather have the US data in line with the world-wide OSM data 
where it makes sense.   That way the US can consume OSM US data with 
tools developed worldwide, without the tool writers needing to implement 
US-specific rules.


  After analysis, most of the US opinions fall on the side of no 
abbreviations.




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:31 PM, Anthony  wrote:
> And actually, if the bot is going to be smart enough to look at the
> history, to find deleted TIGER tags, then maybe there is some
> advantage to doing this during the preprocessing step (which would
> often not have access to history data).

What I mean is that, if the bot is going to look at the history, then
there would be an advantage to letting the bot run.

But I am assuming this could be done with much less than a 1/1000
error rate.  1/10,000 would maybe be acceptable.  1/100,000 would be
okay.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:26 PM, Nathan Edgars II  wrote:
> On 5/1/2012 1:23 PM, Anthony wrote:
>>
>> On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II
>>  wrote:
>>>
>>> On 5/1/2012 12:59 PM, Anthony wrote:


>>>> Automatically expanding abbreviations is a terrible idea.  If an
>>>> abbreviation is unambiguous, then it can be expanded during the
>>>> preprocessing step.  If, on the other hand, it is ambiguous, then you
>>>> are turning ambiguous data into incorrect data, which certainly
>>>> diminishes the data.
>>>
>>>
>>>
>>> Not quite. We have various TIGER tags that break the name into pieces,
>>> and allow automated expansion where the name field may be ambiguous.
>>> (Though occasionally these tags are wrong.)
>>
>>
>> I'm not sure what you're disagreeing with.  Either it is unambiguous
>> (due to TIGER tags or whatever), and therefore can be done during the
>> preprocessing step.  Or it is ambiguous, and needs human
>> intervention/review.
>>
> The TIGER tags are not exactly standard OSM tags that belong in the
> database. Better that we get rid of them at the same time as we expand
> abbreviations.

On that point, I strongly agree.

And actually, if the bot is going to be smart enough to look at the
history, to find deleted TIGER tags, then maybe there is some
advantage to doing this during the preprocessing step (which would
often not have access to history data).



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Nathan Edgars II

On 5/1/2012 1:23 PM, Anthony wrote:
> On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II  wrote:
>> On 5/1/2012 12:59 PM, Anthony wrote:
>>>
>>> Automatically expanding abbreviations is a terrible idea.  If an
>>> abbreviation is unambiguous, then it can be expanded during the
>>> preprocessing step.  If, on the other hand, it is ambiguous, then you
>>> are turning ambiguous data into incorrect data, which certainly
>>> diminishes the data.
>>
>> Not quite. We have various TIGER tags that break the name into pieces, and
>> allow automated expansion where the name field may be ambiguous. (Though
>> occasionally these tags are wrong.)
>
> I'm not sure what you're disagreeing with.  Either it is unambiguous
> (due to TIGER tags or whatever), and therefore can be done during the
> preprocessing step.  Or it is ambiguous, and needs human
> intervention/review.

The TIGER tags are not exactly standard OSM tags that belong in the 
database. Better that we get rid of them at the same time as we expand 
abbreviations.




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:18 PM, Nathan Edgars II  wrote:
> On 5/1/2012 12:59 PM, Anthony wrote:
>>
>> Automatically expanding abbreviations is a terrible idea.  If an
>> abbreviation is unambiguous, then it can be expanded during the
>> preprocessing step.  If, on the other hand, it is ambiguous, then you
>> are turning ambiguous data into incorrect data, which certainly
>> diminishes the data.
>
>
> Not quite. We have various TIGER tags that break the name into pieces, and
> allow automated expansion where the name field may be ambiguous. (Though
> occasionally these tags are wrong.)

I'm not sure what you're disagreeing with.  Either it is unambiguous
(due to TIGER tags or whatever), and therefore can be done during the
preprocessing step.  Or it is ambiguous, and needs human
intervention/review.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:17 PM, Mike N  wrote:
> On 5/1/2012 12:59 PM, Anthony wrote:
>>
>> I'm not sure what you're saying.
>>
>> Automatically expanding abbreviations is a terrible idea.  If an
>> abbreviation is unambiguous, then it can be expanded during the
>> preprocessing step.  If, on the other hand, it is ambiguous, then you
>> are turning ambiguous data into incorrect data, which certainly
>> diminishes the data.
>
>
>  What preprocessing step?

The preprocessing step between downloading the data from OSM and doing
something with it.

What that is differs between applications.  In many applications it is
called osm2pgsql.

> The types
> of errors I'm referring to are where you go to upload from JOSM, then decide
> to slavishly submit to the validator's warnings about abbreviated street
> names.  What person manually types 2 - 3 dozen versions of Street, Avenue,
> Boulevard, Point, Circle without any typos?

Typos are much easier to fix than improper "abbreviation expansions".



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Nathan Edgars II

On 5/1/2012 12:59 PM, Anthony wrote:

> Automatically expanding abbreviations is a terrible idea.  If an
> abbreviation is unambiguous, then it can be expanded during the
> preprocessing step.  If, on the other hand, it is ambiguous, then you
> are turning ambiguous data into incorrect data, which certainly
> diminishes the data.


Not quite. We have various TIGER tags that break the name into pieces, 
and allow automated expansion where the name field may be ambiguous. 
(Though occasionally these tags are wrong.)
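
As a rough sketch of how those pieces can be used: rebuild the abbreviated name from the TIGER tags and compare it with the current name tag, so that only ways whose name is still what the import wrote are expanded automatically. The keys tiger:name_base and tiger:name_type appear in this thread; the two direction keys are an assumption about how the import named them, and the helper names are made up.

```python
# Direction keys are assumed; base and type keys are the ones named in the thread.
PIECES = (
    "tiger:name_direction_prefix",
    "tiger:name_base",
    "tiger:name_type",
    "tiger:name_direction_suffix",
)

def name_from_tiger(tags):
    """Join the non-empty TIGER name pieces in their usual order."""
    return " ".join(tags[k] for k in PIECES if tags.get(k))

def name_untouched(tags):
    """True if the name tag still matches what the TIGER pieces produce."""
    return tags.get("name") == name_from_tiger(tags)

way = {
    "name": "N Main St",
    "tiger:name_direction_prefix": "N",
    "tiger:name_base": "Main",
    "tiger:name_type": "St",
}
print(name_from_tiger(way))
print(name_untouched(way))
print(name_untouched(dict(way, name="North Main Street")))
```

A way that fails the comparison has been edited by someone since the import and is better set aside for a person to look at.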


Not that it's that hard to do a small area (county-sized) using Overpass 
API, JOSM, and Textpad.




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Mike N

On 5/1/2012 12:59 PM, Anthony wrote:

> I'm not sure what you're saying.
>
> Automatically expanding abbreviations is a terrible idea.  If an
> abbreviation is unambiguous, then it can be expanded during the
> preprocessing step.  If, on the other hand, it is ambiguous, then you
> are turning ambiguous data into incorrect data, which certainly
> diminishes the data.


  What preprocessing step?  TIGER data has already been imported.  The 
types of errors I'm referring to are where you go to upload from JOSM, 
then decide to slavishly submit to the validator's warnings about 
abbreviated street names.  What person manually types 2 - 3 dozen 
versions of Street, Avenue, Boulevard, Point, Circle without any typos?




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Tue, May 1, 2012 at 1:06 PM, Serge Wroclawski  wrote:
> The other point that's being missed is that we as a community already
> accept an error rate in our data that's far larger than any potential
> mistake rate on a well written script. If the script makes one error
> in 1000 streets, it will be doing a better job than a vast majority of
> manual mappers, and like manual mappers, they can be corrected.

If someone manually expanded 1,000,000 street name abbreviations, and
made 1,000 mistakes, it would not be acceptable.

If they were doing something more useful than expanding street name
abbreviations, fine.  But expanding street name abbreviations,
according to a very simple heuristic which can easily be done at the
preprocessing stage, is not very useful.

If this is going to be done, I hope the error rate is much smaller
than 1 in 1,000.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Mon, Apr 30, 2012 at 11:28 PM, Serge Wroclawski  wrote:
> On Mon, Apr 30, 2012 at 8:14 PM, Paul Johnson  wrote:
>> There have been some limited automated expansions, though they can be
>> problematic, because abbreviations can mean many possible things.  Expanding
>> abbreviations requires a bit of a human touch.  Creating abbreviations in
>> the renderer when so desired, not so much.
>
> This is true, but if one is talking about the TIGER data, there are a
> number of hints that can make this problem virtually nil.
>
> There's a tiger:name_type key that contains the value of the
> expandable name section, e.g. St or Ln or Pky. AFAIK these are always
> expandable to Street, Lane and Parkway.
>
> And of course one must only expand the name_type value if it's the last
> component of the name string, e.g. Ln Ln should be Ln Lane. This should
> be fairly easy to construct in a regex, but one should be careful with
> it.
>
> Those two rules should eliminate a vast majority of expansion issues.
> If we only expand TIGER data, then it should be a fairly
> straightforward process.
>
> Of course such a script should be peer reviewed and tested, but I'm
> confident that the error rate will be very low.

I guess this would be okay, so long as it gets peer reviewed and
tested by a group including you.

> And for those few exceptions where the expansion is wrong, a human
> review process will turn this up and make it fairly correctable. In
> fact, I'd argue that the problems won't be subtle, making them easy to
> spot and fix.

How would the human review process work?  Isn't it better to do the
review *before* editing the database?

> In return, we'll save hundreds, maybe thousands of man hours doing expansions.

Useless expansions, though.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Serge Wroclawski
On Tue, May 1, 2012 at 9:10 AM, Richard Weait  wrote:

> The previous bots were shouted down and all took the approach of
> finding things to change, then changing them.  This sounds like a
> similar approach.

That is not the case. In one of the bots, one error was found, and the
author decided not to continue. I don't think either of us knows about
the others.

> Is there any benefit to finding the subtle, problematic abbreviations
> and highlighting them for manual intervention?

We're not talking about all abbreviations, but rather just:

1. Abbreviations in the TIGER dataset (not ones entered manually).

2. A limited set of abbreviations in the dataset that describe the
road "type", that is Road, Lane, Parkway, etc. We're not talking about
expanding other possible abbreviations or making other automated
corrections.

>  Sort of an error
> overlay in the OSMI style, that highlights future expansion problems?

There are things that computers are bad at that people are good at,
and there are things computers are good at which people are bad at.
Computers are good at making set changes to data. They're better and
faster at it than any human.

The other point that's being missed is that we as a community already
accept an error rate in our data that's far larger than any potential
mistake rate on a well written script. If the script makes one error
in 1000 streets, it will be doing a better job than a vast majority of
manual mappers, and like manual mappers, they can be corrected.

The only reason to introduce an overlay is when the problem is not
easily solvable by computer; then we introduce augmented tools. But
it's a waste of human effort to have people like me spending hours (as
I have done) going through and manually typing "Road" over ways.

Add to that that we're not talking about a small number of streets,
but in fact the vast majority (>99%) of streets in the densest part of
the US, the East Coast, moving toward the Center (since the West Coast
has already been fixed).

> Many contributors wrote:
>
> "Yay!  I can haz 'Bots pleez!?!?"  :-)

Richard, I think you're capable of making your point without being
demeaning, please prove me right in your next email.

- Serge



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Anthony
On Mon, Apr 30, 2012 at 10:35 PM, Mike N  wrote:
> On 4/30/2012 10:24 PM, Toby Murray wrote:
>>
>>  I believe It was stopped after some
>> complaints about it not handling some situations correctly. But I
>> would probably be in favor of trying to complete it.
>
>
>  I would agree - there's no point in asserting that we have to spend time
> manually expanding everything, it's not adding value to the map data.   And
> the bot is probably more accurate than a human, limited only by the accuracy
> of the base TIGER data - think of all the possible typos on streeet, avenve,
> and boulavard.

I'm not sure what you're saying.

Automatically expanding abbreviations is a terrible idea.  If an
abbreviation is unambiguous, then it can be expanded during the
preprocessing step.  If, on the other hand, it is ambiguous, then you
are turning ambiguous data into incorrect data, which certainly
diminishes the data.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Mike N

On 5/1/2012 9:10 AM, Richard Weait wrote:

> The previous bots were shouted down and all took the approach of
> finding things to change, then changing them.  This sounds like a
> similar approach.


  I think the one that Andrzej / balrog-kun was running makes the best 
use of hints in the TIGER data.


https://trac.openstreetmap.org/browser/applications/utils/import/tiger2osm/expand/expand.py

  It's notable that even this script requires some significant manual 
labor / review associated with each batch processed.



> Is there any benefit to finding the subtle, problematic abbreviations
> and highlighting them for manual intervention?  Sort of an error
> overlay in the OSMI style, that highlights future expansion problems?
> If we find and fix the problem cases first, surely fixing the last rd
> --> Road batch will be easier and less error prone.


  I don't know that it needs to be done ahead of time.   These cases 
are very few and far between.   And how does automatic analysis find 
that the streets

   St MikeN Street
   Saint MikeN Street

   are really 2 different streets, and must retain their spelling? 
But it would still be good to have a tag that identifies those few cases 
so that they remain untouched by any future bot activities.  In some 
cases, new contributors will change the full name back to abbreviation 
when editing because that's what they're used to seeing.





Re: [Talk-us] Fixing TIGER street name abbreviations

2012-05-01 Thread Richard Weait
Many contributors wrote:

"Yay!  I can haz 'Bots pleez!?!?"  :-)

Deckard replied, " 'Bots are like any other machine - they're either a
benefit or a hazard. If they're a benefit, it's not my problem."  Then
he resignedly took a swig from his futuristic, Los Angeles 2023, drink
and got into his flying car.

Serge added:
> And for those few exceptions where the expansion is wrong, a human
> review process will turn this up and make it fairly correctable. In
> fact, I'd argue that the problems won't be subtle, making them easy to
> spot and fix.
>
> In return, we'll save hundreds, maybe thousands of man hours doing expansions.

The previous bots were shouted down and all took the approach of
finding things to change, then changing them.  This sounds like a
similar approach.

Is there any benefit to finding the subtle, problematic abbreviations
and highlighting them for manual intervention?  Sort of an error
overlay in the OSMI style, that highlights future expansion problems?
If we find and fix the problem cases first, surely fixing the last rd
--> Road batch will be easier and less error prone.



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Serge Wroclawski
On Mon, Apr 30, 2012 at 8:14 PM, Paul Johnson  wrote:
>
> On Apr 30, 2012 5:00 PM, "David Litke"  wrote:
>>
>> I just did a few manual TIGER reviews in JOSM and got a validation warning
>> that words like Street and Avenue were abbreviated as St and Ave. So I
>> wonder if this is considered something that needs to be fixed?
>
> Yes.  Rule one of abbreviations:  Don't do it!
>
>> If so, shouldn't it be easy to somehow do a batch global update?
>
> There have been some limited automated expansions, though they can be
> problematic, because abbreviations can mean many possible things.  Expanding
> abbreviations requires a bit of a human touch.  Creating abbreviations in
> the renderer when so desired, not so much.

This is true, but if one is talking about the TIGER data, there are a
number of hints that can make this problem virtually nil.

There's a tiger:name_type key that contains the value of the
expandable name section, e.g. St or Ln or Pky. AFAIK these are always
expandable to Street, Lane and Parkway.

And of course one must only expand the name_type value if it's the last
component of the name string, e.g. Ln Ln should be Ln Lane. This should
be fairly easy to construct in a regex, but one should be careful with
it.

Those two rules should eliminate a vast majority of expansion issues.
If we only expand TIGER data, then it should be a fairly
straightforward process.
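
A minimal sketch of those two rules, assuming only the three expansions named above; the `EXPANSIONS` table and the function name are made up for illustration, and a real table would need many more entries and review.

```python
import re

# Only the three expansions mentioned above; anything else is left alone.
EXPANSIONS = {"St": "Street", "Ln": "Lane", "Pky": "Parkway"}

def expand_suffix(name, name_type):
    """Expand name_type only where it is the last word of the name."""
    full = EXPANSIONS.get(name_type)
    if full is None:
        return name
    # The abbreviation must be preceded by whitespace and end the string,
    # so a name consisting of the abbreviation alone is not touched.
    pattern = r"(?<=\s)" + re.escape(name_type) + r"$"
    return re.sub(pattern, full, name)

print(expand_suffix("Ln Ln", "Ln"))
print(expand_suffix("St James St", "St"))
print(expand_suffix("Main St E", "St"))
```

With the anchor at the end of the string, the first "Ln" in "Ln Ln" and the "St" in "St James St" are left alone, and "Main St E" is not changed at all because "St" is not the last component.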

Of course such a script should be peer reviewed and tested, but I'm
confident that the error rate will be very low.

And for those few exceptions where the expansion is wrong, a human
review process will turn this up and make it fairly correctable. In
fact, I'd argue that the problems won't be subtle, making them easy to
spot and fix.

In return, we'll save hundreds, maybe thousands of man hours doing expansions.

- Serge



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Richard Welty

On 4/30/12 10:35 PM, Mike N wrote:
> On 4/30/2012 10:24 PM, Toby Murray wrote:
>> I believe it was stopped after some
>> complaints about it not handling some situations correctly. But I
>> would probably be in favor of trying to complete it.
>
> I would agree - there's no point in asserting that we have to spend
> time manually expanding everything, it's not adding value to the map
> data.   And the bot is probably more accurate than a human, limited
> only by the accuracy of the base TIGER data - think of all the
> possible typos on streeet, avenve, and boulavard.

from what i gather, there is more than one expansion bot, and they're
not all the same. at least, i saw some incorrectly expanded names in
North-Central Iowa last year, and everyone involved in the bots that
spoke up disclaimed knowledge of that particular example of bad
expansions.

so there are bots and there are bots, and i'd feel happier about them if
i sensed more devotion to quality assurance. after all, if you don't
expand automagically, you have unwanted abbreviations that may not get
expanded for years; if you do expand, you may introduce errors in place
of abbreviations that may go undetected for years.

richard




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread John F. Eldredge
David Litke  wrote:

> I just did a few manual TIGER reviews in JOSM and got a validation
> warning that words like Street and Avenue were abbreviated as St and
> Ave. So I wonder if this is considered something that needs to be
> fixed? If so, shouldn't it be easy to somehow do a batch global
> update?
>

This has been discussed, and tried, before.  Unfortunately, some abbreviations 
can stand for more than one thing, and it takes local knowledge to be sure what 
is the right choice.

-- 
John F. Eldredge --  j...@jfeldredge.com
"Reserve your right to think, for even to think wrongly is better than not to 
think at all." -- Hypatia of Alexandria



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Mike N

On 4/30/2012 10:24 PM, Toby Murray wrote:
> I believe it was stopped after some
> complaints about it not handling some situations correctly. But I
> would probably be in favor of trying to complete it.

I would agree - there's no point in asserting that we have to spend
time manually expanding everything, it's not adding value to the map
data.   And the bot is probably more accurate than a human, limited only
by the accuracy of the base TIGER data - think of all the possible typos
on streeet, avenve, and boulavard.




Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Toby Murray
On Mon, Apr 30, 2012 at 7:14 PM, Paul Johnson  wrote:
>
> On Apr 30, 2012 5:00 PM, "David Litke"  wrote:
>>
>> I just did a few manual TIGER reviews in JOSM and got a validation warning
>> that words like Street and Avenue were abbreviated as St and Ave. So I
>> wonder if this is considered something that needs to be fixed?
>
> Yes.  Rule one of abbreviations:  Don't do it!
>
>> If so, shouldn't it be easy to somehow do a batch global update?
>
> There have been some limited automated expansions, though they can be
> problematic, because abbreviations can mean many possible things.  Expanding
> abbreviations requires a bit of a human touch.  Creating abbreviations in
> the renderer when so desired, not so much.


If by "limited" you mean "half the country" :)

Yes, there was a TIGER name expansion bot that ran from the west coast
to about the Mississippi. I believe it was stopped after some
complaints about it not handling some situations correctly. But I
would probably be in favor of trying to complete it.

Related: http://ksmapper.blogspot.com/2011/05/main-attraction.html

Toby



Re: [Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread Paul Johnson
On Apr 30, 2012 5:00 PM, "David Litke"  wrote:
>
> I just did a few manual TIGER reviews in JOSM and got a validation
> warning that words like Street and Avenue were abbreviated as St and Ave.
> So I wonder if this is considered something that needs to be fixed?

Yes.  Rule one of abbreviations:  Don't do it!

> If so, shouldn't it be easy to somehow do a batch global update?

There have been some limited automated expansions, though they can be
problematic, because abbreviations can mean many possible things.
Expanding abbreviations requires a bit of a human touch.  Creating
abbreviations in the renderer when so desired, not so much.


[Talk-us] Fixing TIGER street name abbreviations

2012-04-30 Thread David Litke
I just did a few manual TIGER reviews in JOSM and got a validation warning that 
words like Street and Avenue were abbreviated as St and Ave. So I wonder if 
this is considered something that needs to be fixed? If so, shouldn't it be 
easy to somehow do a batch global update?


Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Dion Dock

> Message: 1
> Date: Tue, 07 Jul 2009 08:14:31 -0400
> From: Paul Fox 
> Subject: Re: [Talk-us] Fixing TIGER
> To: Talk-us 
> Message-ID: <7945.1246968...@foxharp.boston.ma.us>
> Content-Type: text/plain; charset="us-ascii"
>
> stevec wrote:
>> Hi
>>
>> Just want to highlight to people here some neat things to help fixing
>> TIGER. There's now a map layer highlighting what needs help:
>>
>>  http://www.opengeodata.org/?p=626
>>
>> as well as some HOWTOs on what needs fixing
>>
>>  http://wiki.openstreetmap.org/wiki/TIGER_fixup
>
> my pet goal is missing from that list (it's only sort of a part
> of "Wrong road classification"), and that is, missing "surface"
> keys, to indicate whether a road is paved or not.  this is a key
> piece of information for many kinds of navigation (bicycling,
> motorcycling, and for many others in the case of foul weather),
> and it's completely missing (or mostly incorrect) on all other
> free and non-free maps.

I agree. I'm not particularly interested in how rough a surface is;
generally gravel vs paved vs dirt is good enough.  There are lots of
roads that look appealing until you actually ride out three miles and
then see they turn to gravel.  It's fine for cars but usually a
non-starter for road bicycles.  Having five different smoothness ratings
is too much detail.


>
> unfortunately, there's not much incentive to add the information
> to OSM, because none of the renderers will display it anyway.  :-)
>
> (i started a thread on this earlier this year, and then dropped
> the ball by not continuing to lobby for changes to the renderers
> to fix this problem.)
>
> paul
> =-
> paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 58.1  
> degrees)

-Dion




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Apollinaris Schoell
it's just not sufficient to have one tag to tell if data is good or  
not. with yahoo it's easy to check the worst errors but is that enough  
to say it's reviewed?
- is it tagged correctly
- is it within an (undefined?) accuracy limit
- 
there isn't any agreement on other tags either, but that's osm :)





On 7 Jul 2009, at 19:37 , Dave Hansen wrote:

> On Tue, 2009-07-07 at 21:53 -0400, Mike N. wrote:
>> Keep in mind for the change detection algorithm, that a few users  
>> do not
>> realize that it is good to remove the tiger:reviewed=no after  
>> verifying or
>> correcting a way.
>
> Heh, that's why I didn't mention tiger:reviewed=no: it's not
> consistently useful.
>
> -- Dave
>




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Dave Hansen
On Tue, 2009-07-07 at 21:53 -0400, Mike N. wrote:
> Keep in mind for the change detection algorithm, that a few users do not 
> realize that it is good to remove the tiger:reviewed=no after verifying or 
> correcting a way.

Heh, that's why I didn't mention tiger:reviewed=no: it's not
consistently useful.

-- Dave




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Mike N.
Keep in mind for the change detection algorithm, that a few users do not 
realize that it is good to remove the tiger:reviewed=no after verifying or 
correcting a way.

--
> I saved all the original source data and the id mappings from the source
> data to the OSM database.
>
> For each of the source ways and nodes, I can look up the OSM version and
> download it.  Then, compare what I downloaded to what was originally
> uploaded.  If they're the same, add a tag, or just make a trivial update
> to the way/node to get it moved to another userid.
 




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Dave Hansen
On Tue, 2009-07-07 at 10:37 -0700, Apollinaris Schoell wrote:
> > It would probably be smart for any navigation system using OSM to check
> > for tiger:reviewed=no.  Perhaps I should go through and retag the TIGER
> > data which has so far been unmodified.
> 
> how would you do that? is there any info that was dropped and can be used
> to classify things better?

I saved all the original source data and the id mappings from the source
data to the OSM database.  

For each of the source ways and nodes, I can look up the OSM version and
download it.  Then, compare what I downloaded to what was originally
uploaded.  If they're the same, add a tag, or just make a trivial update
to the way/node to get it moved to another userid.
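
A minimal sketch of that comparison step, assuming each way is held as a plain dict with "tags" and "nodes" entries. The dict layout, the helper names and the tiger:untouched tag are all hypothetical and only for illustration; fetching the current version from the API and writing the change back are left out.

```python
def is_untouched(original, current):
    """True if the current object still matches what the import uploaded."""
    return (
        original.get("tags", {}) == current.get("tags", {})
        and original.get("nodes", []) == current.get("nodes", [])
    )


def mark_if_untouched(original, current, tag=("tiger:untouched", "yes")):
    """Return a new tag dict for unedited ways, or None if someone edited it.

    The default tag is a made-up placeholder, not an agreed OSM tag.
    """
    if not is_untouched(original, current):
        return None
    updated = dict(current.get("tags", {}))
    updated[tag[0]] = tag[1]
    return updated


uploaded = {"tags": {"name": "Main St", "highway": "residential"}, "nodes": [1, 2, 3]}
edited = {"tags": {"name": "Main Street", "highway": "residential"}, "nodes": [1, 2, 3]}

print(mark_if_untouched(uploaded, uploaded))
print(mark_if_untouched(uploaded, edited))
```

Comparing only tags and node references would miss a way whose nodes were moved without being replaced, so a fuller check would probably compare node positions as well.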

-- Dave




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Apollinaris Schoell
>
>
> If I had it to do all over again, I'd make a unique user-id for each
> upload instance.  The upload-uuid helps, but it's not volatile upon
> edits.
>
now it's in there and it's much better than nothing. redoing the whole
thing isn't really an option.

> It would probably be smart for any navigation system using OSM to check
> for tiger:reviewed=no.  Perhaps I should go through and retag the TIGER
> data which has so far been unmodified.
>

how would you do that? is there any info that was dropped and can be used
to classify things better?

> -- Dave
>




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Paul Fox
apollinaris wrote:
 > > [ aside:  google streetview astonishes me sometimes. ]
 > 
 > they test their cameras under rough conditions ;-) but still their map  
 > doesn't render different either.

exactly.  this is a case where OSM could definitely one-up the
"competition".  no paper maps that i've seen get it right either.

paul
=-
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 62.2 degrees)



Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Apollinaris Schoell
>
> you're referring to this?
>http://wiki.openstreetmap.org/wiki/Proposed_features/grade1-5

yes

>
> the last time i raised this i was told that "surface=" was the way
> it should be done.  in the cases i'm thinking of, tracktype doesn't
> seem appropriate -- i'm thinking of numbered routes whose surface isn't
> paved -- i.e., public roadways.  like route 58 in vermont:
>

agreed, a public roadway is not a track. surface is the best we have.

>
> http://maps.google.com/maps?f=q&source=s_q&hl=en&q=Hazen+Notch+Rd,+Lowell,+Vermont+05847&sll=44.276917,-72.418957&sspn=0.078168,0.128918&ie=UTF8&cd=1&geocode=FVf0qwIdWuOt-w&split=0&ll=44.823612,-72.490883&spn=0.019359,0.032229&z=15&iwloc=A&layer=c&cbll=44.823619,-72.491039&panoid=s_6TgG1fewPjFSxdBFTA_A&cbp=12,267.24,,0,5
>
> [ aside:  google streetview astonishes me sometimes. ]

they test their cameras under rough conditions ;-) but still their map  
doesn't render different either.






Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Dave Hansen
On Tue, 2009-07-07 at 10:23 -0700, Apollinaris Schoell wrote:
> it's bad if you don't have a 4WD but your navi thinks it's perfect to use :(
> but even tomtom, google, ... have the same errors in these areas.
> if we want osm to be better there is no easy way to get it right
> without checking the reality.
> mass imports give a good starting point but it's really hard to remove
> the old invalid data and there are not many mappers in remote areas to
> verify it.

Amen to that!

If I had it to do all over again, I'd make a unique user-id for each
upload instance.  The upload-uuid helps, but it's not volatile upon
edits.

It would probably be smart for any navigation system using OSM to check
for tiger:reviewed=no.  Perhaps I should go through and retag the TIGER
data which has so far been unmodified.
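
One way a router might act on that tag is to make unreviewed ways more expensive rather than forbidden. The sketch below is only an assumption about how such a check could look: edge_cost and the 1.5 multiplier are invented for illustration and do not come from any existing navigation software.

```python
# Hypothetical multiplier. A real router would have to tune this,
# or expose it as a user preference.
UNREVIEWED_PENALTY = 1.5


def edge_cost(length_m, tags):
    """Routing cost for one way segment, penalising unreviewed TIGER data."""
    cost = float(length_m)
    if tags.get("tiger:reviewed") == "no":
        cost *= UNREVIEWED_PENALTY
    return cost


print(edge_cost(100, {"highway": "residential", "tiger:reviewed": "no"}))
print(edge_cost(100, {"highway": "residential"}))
```

Since not everyone removes tiger:reviewed=no after checking a way, a penalty of this kind will also hit some roads that are actually fine.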

-- Dave




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Apollinaris Schoell
it's bad if you don't have a 4WD but your navi thinks it's perfect to use :(
but even tomtom, google, ... have the same errors in these areas.
if we want osm to be better there is no easy way to get it right
without checking the reality.
mass imports give a good starting point but it's really hard to remove
the old invalid data and there are not many mappers in remote areas to
verify it. yahoo images don't help either in forests. You can't tell
the type of a track if it's barely visible because of the trees.

On 7 Jul 2009, at 9:10 , Dave Hansen wrote:

> On Tue, 2009-07-07 at 09:04 -0700, Apollinaris Schoell wrote:
>>  but agree tiger data is very bad in this regard. nearly all tracks
>> are defined as highway=residential.
>
> Well, you can call it good or bad, but the fact of the matter is that
> TIGER was never really intended to be a road map.  highway=residential
> was the fallback for things for which we didn't have a better type.
> They were all smaller roads and that seemed to make them render in the
> most reasonable way.  I think of it like meaning, "small unimportant
> road which doesn't matter unless you're close to it".  You'd never use
> these for long-distance travel, for instance.
>
> -- Dave
>




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Paul Fox
apollinaris wrote:
 > >
 > > my pet goal is missing from that list (it's only sort of a part
 > > of "Wrong road classification"), and that is, missing "surface"
 > > keys, to indicate whether a road is paved or not.  this is a key
 > > piece of information for many kinds of navigation (bicycling,
 > > motorcycling, and for many others in the case of foul weather),
 > > and it's completely missing (or mostly incorrect) on all other
 > > free and non-free maps.
 > >
 > 
 > this is not tiger specific. you can use tags like tracktype or
 > surface, or come up with better ones.
 >   but agree tiger data is very bad in this regard. nearly all tracks  
 > are defined as highway=residential. even the ones which are 4WD roads  
 > in the USGS maps.
 > 
 > > unfortunately, there's not much incentive to add the information
 > > to OSM, because none of the renderers will display it anyway.  :-)
 > >
 > > (i started a thread on this earlier this year, and then dropped
 > > the ball by not continuing to lobby for changes to the renderers
 > > to fix this problem.)
 > 
 > use tracktype and this is rendered differently in osmarender. for me it

you're referring to this?
http://wiki.openstreetmap.org/wiki/Proposed_features/grade1-5

the last time i raised this i was told that "surface=" was the way
it should be done.  in the cases i'm thinking of, tracktype doesn't
seem appropriate -- i'm thinking of numbered routes whose surface isn't
paved -- i.e., public roadways.  like route 58 in vermont:


http://maps.google.com/maps?f=q&source=s_q&hl=en&q=Hazen+Notch+Rd,+Lowell,+Vermont+05847&sll=44.276917,-72.418957&sspn=0.078168,0.128918&ie=UTF8&cd=1&geocode=FVf0qwIdWuOt-w&split=0&ll=44.823612,-72.490883&spn=0.019359,0.032229&z=15&iwloc=A&layer=c&cbll=44.823619,-72.491039&panoid=s_6TgG1fewPjFSxdBFTA_A&cbp=12,267.24,,0,5

[ aside:  google streetview astonishes me sometimes. ]

 > seems counter intuitive how tracktype and path,sac_scale are rendered  
 > but it's a start. it's a chicken-egg situation. if there is no data  
 > there is no reason to change the renderer 

right.

paul
=-
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 61.3 degrees)



Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Dave Hansen
On Tue, 2009-07-07 at 09:04 -0700, Apollinaris Schoell wrote:
>   but agree tiger data is very bad in this regard. nearly all tracks  
> are defined as highway=residential.

Well, you can call it good or bad, but the fact of the matter is that
TIGER was never really intended to be a road map.  highway=residential
was the fallback for things for which we didn't have a better type.
They were all smaller roads and that seemed to make them render in the
most reasonable way.  I think of it like meaning, "small unimportant
road which doesn't matter unless you're close to it".  You'd never use
these for long-distance travel, for instance.

-- Dave




Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Apollinaris Schoell
>
> my pet goal is missing from that list (it's only sort of a part
> of "Wrong road classification"), and that is, missing "surface"
> keys, to indicate whether a road is paved or not.  this is a key
> piece of information for many kinds of navigation (bicycling,
> motorcycling, and for many others in the case of foul weather),
> and it's completely missing (or mostly incorrect) on all other
> free and non-free maps.
>

this is not tiger specific. you can use tags like tracktype or
surface, or come up with better ones.
but i agree tiger data is very bad in this regard. nearly all tracks
are defined as highway=residential, even the ones which are 4WD roads
in the USGS maps.

> unfortunately, there's not much incentive to add the information
> to OSM, because none of the renderers will display it anyway.  :-)
>
> (i started a thread on this earlier this year, and then dropped
> the ball by not continuing to lobby for changes to the renderers
> to fix this problem.)

use tracktype and this is rendered differently in osmarender. for me it
seems counterintuitive how tracktype and path, sac_scale are rendered
but it's a start. it's a chicken-egg situation. if there is no data
there is no reason to change the renderer

>
> paul
> =-
> paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 58.1  
> degrees)
>


Re: [Talk-us] Fixing TIGER

2009-07-07 Thread Paul Fox
stevec wrote:
 > Hi
 > 
 > Just want to highlight to people here some neat things to help fixing  
 > TIGER. There's now a map layer highlighting what needs help:
 > 
 >  http://www.opengeodata.org/?p=626
 > 
 > as well as some HOWTOs on what needs fixing
 > 
 >  http://wiki.openstreetmap.org/wiki/TIGER_fixup

my pet goal is missing from that list (it's only sort of a part
of "Wrong road classification"), and that is, missing "surface"
keys, to indicate whether a road is paved or not.  this is a key
piece of information for many kinds of navigation (bicycling,
motorcycling, and for many others in the case of foul weather),
and it's completely missing (or mostly incorrect) on all other
free and non-free maps.

unfortunately, there's not much incentive to add the information
to OSM, because none of the renderers will display it anyway.  :-)

(i started a thread on this earlier this year, and then dropped
the ball by not continuing to lobby for changes to the renderers
to fix this problem.)

paul
=-
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 58.1 degrees)



[Talk-us] Fixing TIGER

2009-07-07 Thread SteveC
Hi

Just want to highlight to people here some neat things to help fixing  
TIGER. There's now a map layer highlighting what needs help:

http://www.opengeodata.org/?p=626

as well as some HOWTOs on what needs fixing

http://wiki.openstreetmap.org/wiki/TIGER_fixup

Best

Steve

