Clifford,

I think the thing to do right now is to really understand the problem in
depth. The community's complaints about automated edits come up when
the edits come out wrong. For example, the expansion that was run
earlier turned all E's into East. That's very bad, and we need to make
sure it doesn't happen again.

We have to make some tradeoffs between what we can reasonably assume
and what we know absolutely to be true. For example, if I see "Rd" at
the end of a highway=residential way, I'm pretty sure that it is an
abbreviation for Road. Of course, if there's a street somewhere that
really is named "Main Saint" rather than Main Street, expanding "St" to
Street will be wrong, and that will be bad. Hopefully this kind of
problem can be minimized if we, for example, match the roads up with
newer TIGER road names and the TIGER metadata, or with any local street
address data we can validate against.
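
To make that tradeoff concrete, here is a rough Python sketch of the
kind of check I have in mind. It is not the bot, only an illustration.
The suffix table and the reference_names set are assumptions made up
for this example; in practice the reference names would come from
TIGER or a local address dataset. Only the last word is touched, so a
leading "E" is never expanded, and "St" is left out of the table
because it is ambiguous.

```python
# Hypothetical, deliberately small table of trailing suffixes that are
# assumed to be unambiguous. "St" is omitted on purpose, since it can
# mean either Street or Saint.
SUFFIXES = {
    "Rd": "Road",
    "Ave": "Avenue",
    "Blvd": "Boulevard",
    "Ln": "Lane",
}


def expand_suffix(name, reference_names=None):
    """Return the expanded name, or None if expansion is not considered safe.

    name: the street name as tagged, e.g. "Main Rd".
    reference_names: optional set of known-good full names (for example
        taken from TIGER or local address data). When given, the expanded
        name must appear in it, otherwise nothing is returned.
    """
    parts = name.split()
    if len(parts) < 2:
        # A bare "Rd" with nothing in front of it is not worth guessing at.
        return None

    last = parts[-1].rstrip(".")
    full = SUFFIXES.get(last)
    if full is None:
        # Unknown or ambiguous suffix: leave it for a human.
        return None

    candidate = " ".join(parts[:-1] + [full])
    if reference_names is not None and candidate not in reference_names:
        # The guess disagrees with the reference data, so skip it.
        return None
    return candidate
```

With this, "Main Rd" becomes "Main Road", "Main St" is left alone, and
"Oak Ave" is only expanded when the reference set actually contains
"Oak Avenue".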

This is why I want to map this problem visually, to see if there are
localized clusters of problems and to see if we can reduce the
problems by using local data sources alongside the software's
"educated guesses".
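
Before any tiles exist, a crude way to look for clusters would be to
bin the suspect features into a grid and count them per cell. The
sketch below assumes we already have (lat, lon) pairs for features
whose names look abbreviated; the function name and the 0.01 degree
cell size are arbitrary choices for illustration.

```python
import math
from collections import Counter


def cluster_counts(points, cell_deg=0.01):
    """Count suspected abbreviations per grid cell.

    points: iterable of (lat, lon) pairs for features whose name looks
        abbreviated.
    cell_deg: grid cell size in degrees (an arbitrary value for this sketch).
    Returns a Counter keyed by (row, col) cell index.
    """
    counts = Counter()
    for lat, lon in points:
        # Floor so that negative longitudes fall into the correct cell.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up sample points, for illustration only.
sample = [
    (32.7153, -117.1612),
    (32.7158, -117.1617),
    (32.5345, -117.0382),
]
busiest_cells = cluster_counts(sample).most_common(5)
```

Cells with unusually high counts would be the first places to check
against local data sources.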

We also need to realize, as a community, that automated edits, like
manual edits, will never be 100% correct. I'd be happy with 99.5%
correct. That's better than the rate of typos and other problems we
see with our normal mappers. The remaining 0.5% will be something that
either local community members or some further iteration of a tool
will fix.

So if you have expertise in TileMill, I'd love your help in setting up
some tiles that show probable abbreviations.

- Serge



On Wed, Jul 30, 2014 at 3:44 PM, Clifford Snow <cliff...@snowandsnow.us> wrote:
>
> On Wed, Jul 30, 2014 at 9:48 AM, Serge Wroclawski <emac...@gmail.com> wrote:
>>
>> Thanks for the pointer. I think you're right that in this case
>> especially, there's no reason to admonish anyone, but perhaps we can
>> examine the data and see if there's a safe way to expand it, like we
>> did the TIGER data.
>>
>> That may also explain some large portion of the contractions I found.
>> Maybe it's worth trying to map them....
>
>
> I agree with Serge and Toby's post - this is old data that should be fixed.
>
> I pulled the data from a Mapzen city extract (San Diego & Tijuana). There
> are 488,571 address points. Looking at OSMI, the whole area has addresses
> with abbreviated street names.
>
> Serge, anything I can do to help you with the expansion bot, just let me
> know. Knowing how to do this would help me down the road as we try to figure
> out how to update address info from my local county.
>
> Clifford
>
>
> --
> @osm_seattle
> osm_seattle.snowandsnow.us
> OpenStreetMap: Maps with a human touch
