At 2010-04-21 17:12, andrzej zaborowski wrote:
On 22 April 2010 01:18, Apollinaris Schoell ascho...@gmail.com wrote:
On Wed, Apr 21, 2010 at 3:36 PM, andrzej zaborowski balr...@gmail.com
wrote:
Where's damage in that -- is it in that you can now read the name out
without checking the documentation for what that funny string means in
that particular database that is TIGER?
I just had a machine crash as I was trying to find stats, but I'll bet that
at least 90% of the cases are St, Ave/Av, and Blvd/Bl, with the
occasional Ln and Cir/Cr thrown in. When there's a lone N, S, E, or W
as a prefix to a street name, it's clear to everyone what that means. These
are the same abbreviations that _everyone_ uses every day - children,
adults, businesses, governments, etc.
Even when travelling to another country, it takes me very little time to
understand what common abbreviations are used for in addresses.
there is damage by doing it wrong, others have pointed to it already.
And I will do so again. My problem is mostly that this was done without a
safety net. You clobbered existing data with no easy way to walk it back.
The existing name value should have been put in a foo_name tag so we could
at least see what used to be. I would at least encourage that a bot be run
to find these edits, find the previous version in history, and do this, if
we can't soon agree on a better schema to split the name up into components
at the same time.
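
The bot step suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the per-way history is assumed to already be available as a plain list of versions (oldest first), the `history` shape, the `backup_previous_name` helper, and the `expander_bot` user name are all hypothetical, and `foo_name` is just the placeholder key from the paragraph above, not an agreed tag. It only handles the simple case where the bot's edit is still the latest version.

```python
from typing import Optional

BACKUP_KEY = "foo_name"  # placeholder key from the post, not an agreed tag


def backup_previous_name(history: list[dict], bot_user: str) -> Optional[dict]:
    """Return tags for a new version that preserves the pre-bot name, or None.

    `history` holds the versions of one way, oldest first. Each version is a
    dict with "user" and "tags" keys (a simplified, hypothetical shape).
    """
    if len(history) < 2:
        return None  # no earlier version to recover a name from
    current, previous = history[-1], history[-2]
    if current["user"] != bot_user:
        return None  # latest edit is not the bot's; leave it alone
    old_name = previous["tags"].get("name")
    new_name = current["tags"].get("name")
    if old_name is None or old_name == new_name:
        return None  # nothing was changed, so nothing to back up
    tags = dict(current["tags"])
    tags.setdefault(BACKUP_KEY, old_name)  # never overwrite an existing backup
    return tags


history = [
    {"user": "surveyor", "tags": {"name": "N Main St", "highway": "residential"}},
    {"user": "expander_bot", "tags": {"name": "North Main Street", "highway": "residential"}},
]
print(backup_previous_name(history, "expander_bot"))
```

Ways edited by someone else after the bot would need a manual look, since the sketch deliberately skips them rather than guessing.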
I am not deep enough into the history of the abbreviations used and who
defined them. But I am pretty sure there are a lot of errors.
Errors that I, and a lot of other mappers, painstakingly fixed by hand,
based on ground surveys and research into public records. In particular,
I'm worried about the cases where I spelled out North because it was
actually part of the name, as opposed to a cardinal direction related to
addresses, which I left alone, hoping to later move the latter directions
to an addr:direction_prefix tag, while leaving the former alone. I can no
longer distinguish between the two.
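
To make the lost distinction concrete, here is a minimal sketch of the convention described above, assuming that a lone abbreviated N, S, E, or W marks an address direction while a spelled-out word is part of the name proper. The `split_direction_prefix` helper is hypothetical, and it only covers the four single-letter directions.

```python
DIRECTION_ABBREVIATIONS = {"N", "S", "E", "W"}


def split_direction_prefix(name: str) -> dict:
    """Move a lone abbreviated direction into addr:direction_prefix.

    A spelled-out first word such as "North" is treated as part of the
    name itself and left untouched.
    """
    first, _, rest = name.partition(" ")
    if first in DIRECTION_ABBREVIATIONS and rest:
        return {"name": rest, "addr:direction_prefix": first}
    return {"name": name}


print(split_direction_prefix("N Main St"))          # direction is split off
print(split_direction_prefix("North Shore Dr"))     # name is kept whole
print(split_direction_prefix("North Main Street"))  # after expansion: kept whole
```

Once every prefix has been expanded, both kinds of street start with "North", so a rule like this can no longer tell them apart.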
I don't know who defined the ones used in TIGER but this is not the
only way to abbreviate the names, that is proven by USPS having their
own list that is not identical. The most popular words will be the
same in both lists but some are really cryptic and arbitrary, could as
well be numeric codes. Then TIGER also includes Spanish names and the
list has abbreviations for those too, which hardly anyone in the US can
read, while they can cope with the unabbreviated forms fine.
I don't agree. Much of the US speaks Spanish. Many more possess the
tremendous brainpower and enough grade-school Spanish required to know that
Cl. in front of a street name might mean Calle or Cam. might mean Camino,
or that S means Sur and N means Norte.
- in the city where I live there is no street sign with street, avenue, or
boulevard, and even more surprising there are no abbreviations either. The OSM
principle is to map what's on the ground. So the TIGER import is definitely
wrong and expanding the names is also wrong. On the other hand, postal
addresses usually use it in one form or the other, so it's not completely
fiction.
Exactly. Many places in Orange County have the bad habit of leaving the
suffix off the large street signs at intersections, perhaps as a way of
saving space to reduce sign size and cost. Just because the big sign says
only Orange doesn't mean that the street's real name isn't Orange Street, nor
that it shouldn't be entered into any reasonable database or map that way.
map what's on the ground is the wrong thing to do so often that I don't
really understand why it was decided upon, nor why people continue to hold it
up on a pedestal, despite continuing problems with it.
For the record street signs on different ends of the same street often
use different forms and you'll sometimes find really strange
conventions, so while I agree mapping what's on the ground is good
because stuff can be confirmed, in this case it's not a solution. In
many places you'll find the names are all caps on the signs but in a
local newspaper they're capitalized the usual way.
And the signs are sometimes wrong. In the thousands of streets I've
photographed and mapped, I've corrected hundreds of signage
errors/inconsistencies, often requiring substantial research into records,
and resulting in notification of the appropriate authority to fix the
records and/or signs (for free :( ).
- many geocoding engines do not find expanded names. Even Google doesn't in
many cases. To me it looks like hardly anyone uses the expanded name
at all. So my question is: is the expanded name really the correct name?
Exactly! Sounds like its only useful purpose is text-to-speech. Here's what
I'd like to see:
name: The pre-balrog name
name_direction_prefix: The 1-2 char cardinal direction before the root
use_name_direction_prefix: {yes|no} Yes indicates that the
name_direction_prefix