Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Ian Dees
On Thu, Jun 4, 2009 at 1:36 AM, Ted Percival t...@midg3t.net wrote:

 Paul Johnson wrote:
  4) Remove abbreviations TIGER imported.  Sometimes, I really wonder if
  TIGER was such a hot dataset to import...
 
 http://wiki.openstreetmap.org/wiki/Key:name#Abbreviation_.28don.27t_do_it.29

 I wrote a script to expand TIGER abbreviations into full words. It's an
 add-on for the change_tags.py script
 (http://svn.openstreetmap.org/applications/utils/change_tags/change_tags.py),
 which unfortunately doesn't work with API 0.6 yet - at least last time I
 tried. It does a few other things too, particularly for areas that use
 the grid system.


While most of the time this would work well, I can think of some cases where
making these assumptions with TIGER data is a bad idea.


 Its functions are:
 - Strip St suffix from grid-named streets (eg. South 500 West)
 - Collapse multiple spaces into a single space (lots of TIGER)
 - Expand abbreviated directions (eg. S 500 E to South 500 East)
 - Expand abbreviated suffixes (Rd - Road, St - Street, etc)
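A minimal sketch of the four functions listed above, for illustration only. The script's actual code is not shown in this thread, so the lookup tables, the function names, and the order of the steps here are all assumptions.

```python
import re

# Hypothetical lookup tables; the real script's tables are not shown in the thread.
DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}
SUFFIXES = {"Rd": "Road", "St": "Street", "Ave": "Avenue", "Dr": "Drive"}


def collapse_spaces(name):
    """Collapse runs of whitespace into a single space."""
    return re.sub(r"\s+", " ", name).strip()


def strip_grid_suffix(name):
    """Drop a trailing 'St' from grid-style names such as 'S 500 W St'."""
    return re.sub(r"^([NSEW] \d+ [NSEW]) St$", r"\1", name)


def expand_directions(name):
    """Expand single-letter directions that stand alone as a word."""
    return " ".join(DIRECTIONS.get(word, word) for word in name.split(" "))


def expand_suffix(name):
    """Expand an abbreviated suffix, but only when it is the last word."""
    words = name.split(" ")
    if words[-1] in SUFFIXES:
        words[-1] = SUFFIXES[words[-1]]
    return " ".join(words)


def tidy(name):
    """Apply all four steps in one (assumed) order."""
    name = collapse_spaces(name)
    name = strip_grid_suffix(name)
    name = expand_directions(name)
    return expand_suffix(name)


print(tidy("S  500 W St"))    # South 500 West
print(tidy("Main   Rd"))      # Main Road
print(tidy("6th Avenue SE"))  # 6th Avenue SE
```

Because this sketch only expands whole single-letter words, a two-letter suffix such as SE is left alone, but a lettered street such as "N St" would still come out as "North Street".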


- Strip St.: is that recommended somewhere? It seems silly to remove data
like that...
- Collapse spaces: Ok, that makes sense.
- Expand abbreviated dirs: This is the one that I have the most problems
with. In my neighborhood in Minneapolis, the official names for roads
actually end in SE. For example, I live on 6th Avenue SE. I've seen several
different representations of this, but when I asked several different mail
carriers and some GIS folks at the University there, they all said that SE
is the official name, not southeast.
___
Talk-us mailing list
Talk-us@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk-us


Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Paul Johnson
Ian Dees wrote:

 Its functions are:
 - Strip St suffix from grid-named streets (eg. South 500 West)
 - Collapse multiple spaces into a single space (lots of TIGER)
 - Expand abbreviated directions (eg. S 500 E to South 500 East)
 - Expand abbreviated suffixes (Rd - Road, St - Street, etc)
 
 
 - Strip St.: is that recommended somewhere? It seems silly to remove
 data like that...

Until you go out to pretty much any city out in the desert or originally
built by Mormons.  In such cities, 90%+ of the streets are not named to
begin with; locations are purely Cartesian.  The only two streets I know
of that have a name in Salt Lake City are State Street and Temple Square,
and I'm not sure Temple Square counts (I'd rather not get too close, to be
honest).  All the other ways are referred to by address: 450 S 700 E, for
example, means that the address is located four and a half blocks south of
the Mormon temple, on the even side of the street 7 blocks east of the
temple.
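The arithmetic behind such a grid address can be sketched as follows. It assumes 100 address units per block, which is what the 450 S 700 E example implies; the function name and the returned structure are made up for illustration.

```python
import re

# Matches grid addresses of the form "<number> <dir> <number> <dir>".
_GRID_RE = re.compile(r"^(\d+) ([NSEW]) (\d+) ([NSEW])$")

UNITS_PER_BLOCK = 100  # assumption implied by the example in the text


def grid_blocks(address):
    """Return the offset from the grid origin in blocks, keyed by direction.

    Returns None when the address is not in grid form.
    """
    match = _GRID_RE.match(address)
    if match is None:
        return None
    first, first_dir, second, second_dir = match.groups()
    return {
        first_dir: int(first) / UNITS_PER_BLOCK,
        second_dir: int(second) / UNITS_PER_BLOCK,
    }


print(grid_blocks("450 S 700 E"))  # {'S': 4.5, 'E': 7.0}
```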

Interestingly enough, if you navigate to cities that have a lack of
street names, you'll see stuff like E 2100 S St in TIGER, even though
this is wrong!

 - Collapse spaces: Ok, that makes sense.
 - Expand abbreviated dirs: This is the one that I have the most
 problems with. In my neighborhood in Minneapolis, the official names for
 roads actually end in SE. For example, I live on 6th Avenue SE. I've
 seen several different representations of this, but when I asked several
 different mail carriers and some GIS folks at the University there, they
 all said that SE is the official name, not southeast.

I could be wrong on this, but I've been making an exception for
cardinals myself, using the same logic behind NOT using abbreviations
for everything else.  I honestly can't think of any other common
abbreviations that would prevent a





Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Adam Schreiber
Also in Atlanta, there's N St.  I got directions from Google and
thought I was looking for North St.  Man, was that a big mistake.

Cheers,
Adam

On 6/4/09, Paul Johnson ba...@ursamundi.org wrote:
 Ian Dees wrote:

 Its functions are:
 - Strip St suffix from grid-named streets (eg. South 500 West)
 - Collapse multiple spaces into a single space (lots of TIGER)
 - Expand abbreviated directions (eg. S 500 E to South 500 East)
 - Expand abbreviated suffixes (Rd - Road, St - Street, etc)


 - Strip St.: is that recommended somewhere? It seems silly to remove
 data like that...

 Until you go out to pretty much any city out in the desert or originally
 built by Mormons.  In such cities, 90%+ of the streets are not named to
 begin with; locations are purely Cartesian.  The only two streets I know
 of that have a name in Salt Lake City are State Street and Temple Square,
 and I'm not sure Temple Square counts (I'd rather not get too close, to be
 honest).  All the other ways are referred to by address: 450 S 700 E, for
 example, means that the address is located four and a half blocks south of
 the Mormon temple, on the even side of the street 7 blocks east of the
 temple.

 Interestingly enough, if you navigate to cities that have a lack of
 street names, you'll see stuff like E 2100 S St in TIGER, even though
 this is wrong!

 - Collapse spaces: Ok, that makes sense.
 - Expand abbreviated dirs: This is the one that I have the most
 problems with. In my neighborhood in Minneapolis, the official names for
 roads actually end in SE. For example, I live on 6th Avenue SE. I've
 seen several different representations of this, but when I asked several
 different mail carriers and some GIS folks at the University there, they
 all said that SE is the official name, not southeast.

 I could be wrong on this, but I've been making an exception for
 cardinals myself, using the same logic behind NOT using abbreviations
 for everything else.  I honestly can't think of any other common
 abbreviations that would prevent a





Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Dave Hansen
On Thu, 2009-06-04 at 00:36 -0600, Ted Percival wrote:
 Its functions are:
 - Strip St suffix from grid-named streets (eg. South 500 West)
 - Collapse multiple spaces into a single space (lots of TIGER)
 - Expand abbreviated directions (eg. S 500 E to South 500 East)
 - Expand abbreviated suffixes (Rd - Road, St - Street, etc)

So, I looked at doing this when I originally converted the TIGER data.
The issue is that I'm too dumb to come up with anything that worked
universally across the entire country.

This kind of script is useful for small areas that you've looked at
manually, but please don't apply it too widely.  It does the right
actions for sanely-named things, but TIGER is full of goofy stuff.

Consider: "St. Helens St."  There are also plenty of semi-mistakes or
weird abbreviations in TIGER that appear to be mistakes.  I wouldn't be
surprised to see "Saint Street" entered somewhere as

name: St.
type: St.

We don't want to make that "Street Street".  That makes it even
worse. :)

Again, these can work in limited areas where the naming is nice and
consistent, but it's really really hard to make it work on a large scale
where things are *NOT* consistent.
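To illustrate the "St. Helens St." problem, here is a small sketch comparing a blanket replacement with a more conservative rule that only touches the last word. Both functions are hypothetical and are not taken from any existing script.

```python
import re


def naive_expand(name):
    """Blanket rule: every standalone 'St' or 'St.' becomes 'Street'."""
    return re.sub(r"\bSt\b\.?", "Street", name)


def last_word_expand(name):
    """Conservative rule: expand 'St'/'St.' only as the final word of a
    multi-word name, and leave everything else untouched."""
    words = name.split()
    if len(words) > 1 and words[-1].rstrip(".") == "St":
        words[-1] = "Street"
    return " ".join(words)


print(naive_expand("St. Helens St."))      # Street Helens Street
print(last_word_expand("St. Helens St."))  # St. Helens Street
print(last_word_expand("St."))             # St.
```

Even the conservative rule only avoids the worst output. It cannot tell whether a leading "St." means Saint, so names like these still need a human to look at them.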

-- Dave




Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Ted Percival
Paul Johnson wrote:
 Ian Dees wrote:
 - Collapse spaces: Ok, that makes sense.
 - Expand abbreviated dirs: This is the one that I have the most
 problems with. In my neighborhood in Minneapolis, the official names for
 roads actually end in SE. For example, I live on 6th Avenue SE. I've
 seen several different representations of this, but when I asked several
 different mail carriers and some GIS folks at the University there, they
 all said that SE is the official name, not southeast.

The script only does word-bounded cardinal directions, so SE remains
SE. That said, it *does* currently bust a few lettered streets in Salt
Lake City (E Street, N Street, etc.). I'll fix that up in the next
version by requiring at least three words in the name.
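A rough sketch of what word-bounded expansion with a three-word minimum could look like. This is not the script's actual code; the regular expression, the table, and the min_words parameter are assumptions made for illustration.

```python
import re

DIRECTIONS = {"N": "North", "S": "South", "E": "East", "W": "West"}

# \b on both sides means only a letter standing alone as a word matches,
# so the "S" and "E" inside "SE" are never touched.
_DIRECTION_RE = re.compile(r"\b([NSEW])\b")


def expand_directions(name, min_words=3):
    """Expand standalone N/S/E/W, skipping names shorter than min_words."""
    if len(name.split()) < min_words:
        return name
    return _DIRECTION_RE.sub(lambda m: DIRECTIONS[m.group(1)], name)


print(expand_directions("S 500 E"))        # South 500 East
print(expand_directions("6th Avenue SE"))  # 6th Avenue SE
print(expand_directions("E Street"))       # E Street
```

The word-count rule is only a heuristic. A lettered street with a prefix, such as "North E Street", has three words and would still be turned into "North East Street".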

 I could be wrong on this, but I've been making an exception for
 cardinals myself, using the same logic behind NOT using abbreviations
 for everything else.

I'm not sure why the logic is inverted. While it is common notation to
abbreviate the cardinal directions, the street signs actually say "300
West", and I would prefer voice navigation software to say "Turn on
Three Hundred West" rather than "Turn on Three Hundred Double-U", for
instance. I think the usual principles apply: it's easy enough for
renderers to abbreviate full words when it's appropriate, and routing
software to understand how users might abbreviate their input.
Maintaining unnecessary ambiguity in the database should be avoided.
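As a sketch of the renderer side of this argument: going from full words to abbreviations is a plain table lookup, whereas the reverse direction has to guess. The table and function name below are hypothetical and do not come from any particular renderer.

```python
# Hypothetical abbreviation table a renderer might apply to labels.
ABBREVIATIONS = {
    "North": "N",
    "South": "S",
    "East": "E",
    "West": "W",
    "Street": "St",
    "Road": "Rd",
    "Avenue": "Ave",
    "Saint": "St",
}


def abbreviate_for_label(name):
    """Shorten a fully spelled-out name for display on a map label."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in name.split())


print(abbreviate_for_label("South 500 East"))      # S 500 E
print(abbreviate_for_label("Saint Helens Street")) # St Helens St
```

Both "Saint" and "Street" map to "St" here, which is harmless on a label but shows why the expanded form is the one worth storing: the abbreviated form cannot be reversed without guessing.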



Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Paul Johnson
Adam Schreiber wrote:
 Also in Atlanta, there's N St.  I got directions from Google and
 thought I was looking for North St.  Man, was that a big mistake.

OK, so expand cardinals or not?





Re: [Talk-us] Tidying up TIGER data

2009-06-04 Thread Paul Johnson
Ted Percival wrote:
 I'm not sure why the logic is inverted. While it is common notation to
 abbreviate the cardinal directions, the street signs actually say "300
 West", and I would prefer voice navigation software to say "Turn on
 Three Hundred West" rather than "Turn on Three Hundred Double-U", for
 instance. I think the usual principles apply: it's easy enough for
 renderers to abbreviate full words when it's appropriate, and routing
 software to understand how users might abbreviate their input.
 Maintaining unnecessary ambiguity in the database should be avoided.

Well, you hit the nail on the head. I figured expanding cardinals is
about as trivial as creating abbreviations for particular words
automatically as needed.  If the general consensus is that we should
expand those, I will.


