Hi everybody,

I’m looking for a geocoder for OSM that is tolerant of misspelled search 
terms.

The company I work for runs projects for customers in the logistics field. 
From every customer we receive several hundred thousand address records, which 
we have to geocode in order to run various calculations. I started using 
Nominatim for that (on our own installation), but it seems that Nominatim has 
little tolerance for misspelled street and city names. Our last project, in 
Russia, showed in particular that street and city names often include 
abbreviations in varying forms (like „street“, „str.“, „s“, …). Since we 
receive the address information from our customers, we have little influence 
on the quality of the data. So there are not just these valid abbreviations, 
but also real spelling errors. Nevertheless, we have to geocode as many of 
these addresses as possible. 

But right now, Nominatim rejects around 40% of the addresses, finding nothing, 
although the address is in OSM and could be found (it is just spelled slightly 
differently). What I would expect is that a geocoder returns some kind of 
answer for every query, be it an exact match on the city or on the street, or 
only a „similar“ match. If there was no 100% match, it should tell me that 
several records were found that match my street or my city at, e.g., 80% down 
to 50%. Then I can decide later which records I consider a match and which 
not. In any case, the first row returned should be the best match available.
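
To make the expected behaviour concrete, here is a minimal sketch in Python. 
It is only an illustration of the kind of answer I have in mind, not how any 
existing geocoder works. The function name rank_candidates, the min_score 
threshold and the sample street names are made up for this example, and the 
score comes from Python's standard difflib module rather than from anything 
in Nominatim.

```python
from difflib import SequenceMatcher


def rank_candidates(query, candidates, min_score=0.5):
    """Return (name, score, is_exact) tuples, best match first.

    Hypothetical helper: the score is difflib's similarity ratio (0.0 to 1.0),
    and candidates scoring below min_score are dropped.
    """
    q = query.strip().lower()
    scored = []
    for name in candidates:
        n = name.strip().lower()
        score = SequenceMatcher(None, q, n).ratio()
        if score >= min_score:
            scored.append((name, round(score, 2), n == q))
    # Highest score first, so the first row is the best match available.
    scored.sort(key=lambda row: row[1], reverse=True)
    return scored


# Made-up sample data, not taken from OSM.
streets = ["Lenina street", "Lesnaya street", "Pushkina street"]
results = rank_candidates("Lenina stret", streets)
for name, score, is_exact in results:
    print(name, score, "exact" if is_exact else "similar")
```

The caller gets every candidate above the threshold together with its score 
and an exact/similar flag, and can decide afterwards where to draw the line.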

So I have a couple of questions here: 

Does anybody know of a geocoder for OSM-data that does this already? 
Besides Nominatim, I found several other geocoders, but I cannot test them 
all. Maybe some of them already work this way.

There is a PostgreSQL module that seems to do just what I want: pg_trgm. It 
does not look like Nominatim uses it right now.
Is anybody already working on implementing this (or anything similar)?
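
For readers who do not know pg_trgm: as far as I understand its 
documentation, it splits each word into three-character groups (trigrams), 
padding every word with two leading spaces and one trailing space, and scores 
two strings by how many trigrams they share. The Python sketch below only 
approximates that idea for illustration. It is not the module's code, and the 
function names trigrams and similarity here are my own Python helpers, not 
calls into PostgreSQL.

```python
import re


def trigrams(text):
    """Set of trigrams of text, each word padded with two leading spaces
    and one trailing space (an approximation of the pg_trgm scheme)."""
    grams = set()
    for word in re.findall(r"\w+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# "stret" shares 5 of the 8 distinct trigrams with "street".
print(similarity("street", "stret"))
```

A misspelling that drops or swaps one letter only disturbs a few trigrams, so 
the score stays fairly high, which is the kind of tolerance I am after.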

If not, I would be willing to invest further time and effort into this, but I 
need some help with the internals of Nominatim, which I’m not familiar with. 
Where would be the right place to integrate this into Nominatim? 
Does it make sense to try to put this into Nominatim?
Or would it be easier to just use osm2pgsql and build a new query interface 
on top of that?


Thanks a lot to anybody who can help me move forward with this issue!

Best regards,

Tom


PS: I posted this to both mailing lists, 'dev' and 'geocoding', since I’m not 
sure where it fits better. Please excuse me if this is wrong!
