Re: [mkgmap-dev] Address city country name assignment.
On 16.02.2011 11:32, Christian Steins wrote:
> On Wed, 16 Feb 2011 08:00:48 +0200, Du Plessis, Bennie wrote:
>> I don't understand which tags are used to find the country (and other address data) for a street, or city to use in address search.
>
> I think we should have a wiki page describing all this.
>
> - how do the different location-autofill options work?
> - what is the algorithm for getting the region for a place?
> - how are streets assigned to cities?
> - what is the purpose of the LocatorConfig.xml file?

Don't forget the important point: how to use it. :-)

Henning
___
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
Re: [mkgmap-dev] Address city country name assignment.
> I don't understand which tags are used to find the country (and other address data) for a street, or city to use in address search.

[First I should say that I did not write any of the code in question and haven't looked at it much at all, so I don't really know how it works in any detail.]

It should use the is_in tags where they are present. If they are not, it tries to find something nearby that has tags and uses those; exactly how depends on the autofill option.

It is fundamentally unreliable though, since this happens after tiles are split, and tiles are processed one at a time. So even in cases where the closest town is correct, if it happens to be on another tile it will not be found.

> I understand the address search function is still shaky, but can you say where a point / way will get its country info from?

Ha! I think it was shaky, but now it is solid enough to expose the lack of good data, or of processing of the data that exists.

> With the city nodes containing the wrong country name, I suppose any street near it gets assigned that incorrect country name.

Yes, probably.

> Will it help to play with is_in tags with the style files, or is the index created before the style file changes are done?

It might help. The index is created last, but all the information that it requires is in the tiles themselves.

> Is boundary information considered at all? Because boundaries in my map is still faulty. A lot of boundaries are not rendered by mkgmap. Some

No, boundary data isn't used for this.

> I know nothing of code language, but would like to read through it anyway. Where can I see the code that assigns the address info to the street / cities?

There is code in

  StyledConverter.java
  Locator.java
  MapBuilder.java

It is quite likely that there are bugs or that it doesn't quite work in the intended way. I found a bug just by looking at the code while researching this answer!

It would probably be best to start with a tiny OSM file containing only cities, then add a few streets and POIs as you work out what is happening.

..Steve
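To make the tile limitation described in this message concrete, here is a minimal Python sketch. It is not mkgmap's code (that is Java, in the files listed above) and it does not claim to reproduce any autofill option; the `City` type, the `nearest_city` and `tile_of` helpers, the coordinates and the tile boundary at x = 10 are all invented for illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence


@dataclass(frozen=True)
class City:
    name: str
    country: str
    x: float
    y: float


def nearest_city(x: float, y: float, cities: Sequence[City]) -> Optional[City]:
    """Return the closest city in the given list, or None if the list is empty."""
    return min(cities, key=lambda c: hypot(c.x - x, c.y - y), default=None)


def tile_of(x: float) -> int:
    """Hypothetical split into two tiles at x = 10."""
    return 0 if x < 10 else 1


# Invented data: Town A sits just across the tile boundary from the street.
cities = [
    City("Town A", "Country A", 10.5, 0.0),
    City("Town B", "Country B", 2.0, 0.0),
]
street = (9.5, 0.0)

# Looking at the whole map, the closest city is Town A.
whole_map = nearest_city(*street, cities)

# Looking only at cities on the street's own tile, Town A is invisible,
# so the street inherits the country of the more distant Town B.
same_tile = [c for c in cities if tile_of(c.x) == tile_of(street[0])]
tile_only = nearest_city(*street, same_tile)
```

With this data the whole-map lookup picks Town A while the tile-only lookup picks Town B, which is the kind of wrong-country assignment reported in this thread.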
Re: [mkgmap-dev] Address city country name assignment.
On 16/02/11 10:42, Henning Scholland wrote:
>> - how do the different location-autofill options work?
>> - what is the algorithm for getting the region for a place?
>> - how are streets assigned to cities?
>> - what is the purpose of the LocatorConfig.xml file?
>
> Don't forget the important point: how to use it. :-)

We could also consider having a different file format. It may be better to have a file for each country, to make it easy for people to contribute for their country. I've nothing against XML, but for this kind of configuration a list of key=value pairs works just as well, e.g.:

  full-name = United Kingdom
  abbreviation = GBR
  etc...

They can be combined during the build if that makes sense.

..Steve
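As a rough illustration of the per-country key=value idea, here is a minimal Python sketch that parses such files and combines them into one table. Only the `full-name` and `abbreviation` keys come from the message; the `#` comment syntax, the function names and the second country entry are assumptions, and nothing here reflects how mkgmap actually reads LocatorConfig.xml.

```python
def parse_country(text: str) -> dict:
    """Parse one country's 'key = value' lines into a dict."""
    entry = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # '#' comments are an assumption
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {raw!r}")
        entry[key.strip()] = value.strip()
    return entry


def combine(texts) -> dict:
    """Combine several per-country texts into one table keyed by abbreviation."""
    combined = {}
    for text in texts:
        entry = parse_country(text)
        combined[entry["abbreviation"]] = entry
    return combined


# In a real build these strings would be read from one file per country.
uk = """
full-name = United Kingdom
abbreviation = GBR
"""
za = """
# illustrative second entry, not from the original message
full-name = South Africa
abbreviation = ZAF
"""

countries = combine([uk, za])
```

Keying the combined table by abbreviation is only one possible choice; the point is that each contributor edits a small, flat file and the build step does the merging.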
Re: [mkgmap-dev] Address city country name assignment.
It's amusing and not particularly surprising how, as soon as we have searchable maps, we discover the importance of having better addressing information about locations. So far a lot of fundamental principles have been mentioned:

* That using is_in information is easy, but not satisfactory, since it's often missing, inconsistent, poorly maintained and hard to use to infer a hierarchy of belonging (arbitrary bits of streets usually don't have it set, so how do you make a best guess of which nearby element should own them?).
* That boundary polygons are increasingly present on our map, that they can solve most of the problems of is_in, that they are already superseding is_in for other address-sensitive applications in OSM, but that they are very hard to process as part of how mkgmap processes the map.

I am convinced that is_in is never going to give us satisfactory results, that we cannot trust the values entered in that field by mappers and that, the more boundary polygons are used to solve other problems, the less is_in will even be maintained. I have not been entering is_in in my mapping for at least two years; at most I will correct entries by others.

At those parts of the process where address hierarchy information is currently inferred, mkgmap needs to be capable of querying an external source to find the required information. Because at least some of my ideas for a possible source are a little cumbersome, it would probably be ideal if a number of options were permitted, rather like how drawing the sea is managed. One of the address lookup plugins would probably be the existing simple one based on is_in, for users who want to avoid extra prerequisites.

So if that's what a simple, poorly-functioning address plugin looks like, what would the best one look like? Right now, the ultimate OSM geocoder is Nominatim. It is capable of consuming a place name or co-ordinate (of a road segment, say) and deducing an address hierarchy. It already uses the best clues available to do this, including both boundary polygons and is_in tags. And because an entire hierarchy is deduced, it offers us the flexibility to index locations under more than one hierarchy element, as many commercial Garmin maps seem to. For instance, my current location might reasonably be searched for under any of the following names in the city field:

  Dublin (city of which my location is a suburb)
  Dublin (historical county where I am located)
  Dublin 15 (postal district)
  Blanchardstown (historical village and focus of modern suburb)

and there are even sub-parts of Blanchardstown, typically corresponding to old rural townlands, that might be searched for: Corduff, Ongar, Carpenterstown. Only the most disciplined maintainer of is_in will capture enough information to permit matching on all of these elements, and there is no way sufficient consistency will exist. So a Nominatim lookup is the way to go, as we export all of the problems to an externally maintained tool.

The snag: even though Mapquest, who currently host the biggest public Nominatim instance, are very generous with the level of API lookups they allow, there will be trouble if every mkgmap user performs thousands of Nominatim lookups when refreshing their Garmin maps. It will also be slow and bandwidth-intensive. This can be solved somewhat by having one's own instance of Nominatim, possibly containing only an interesting subset of the map.

It would very likely prove worthwhile to define a cache file format into which to stuff those results of the query that mkgmap will require. If these cache files were maintained by country or bbox, they could be calculated centrally by people with sufficient hardware or expertise, then made available for download by normal users. This is a lot like what Steve suggests above, but without the expectation that mappers maintain the address file (because they just plain won't, and the required information is already available from Nominatim, so it would be a waste anyway).

I'm interested in your comments on this. While doing what I describe certainly requires some hard work, it's all front-loaded: once we can find a working framework, we never have to worry about it again. Well, not until Nominatim is superseded by an even more awesome geocoder.

Dermot

--
--
Igaühel on siin oma laul
ja ma oma ei leiagi üles
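The idea of indexing one location under several hierarchy elements can be sketched in a few lines of Python. The place names come from the Dublin example in this message; the role labels, the street name and the `build_city_index` function are hypothetical, and this says nothing about how a Garmin index or a Nominatim response is actually structured.

```python
from collections import defaultdict

# One deduced hierarchy for a single (hypothetical) street.
# Role labels are assumptions chosen for illustration.
hierarchy = [
    ("city", "Dublin"),
    ("county", "Dublin"),
    ("postal_district", "Dublin 15"),
    ("suburb", "Blanchardstown"),
    ("townland", "Corduff"),
]


def build_city_index(locations: dict) -> dict:
    """Map each searchable name (lower-cased) to the streets filed under it."""
    index = defaultdict(set)
    for street, elements in locations.items():
        for _role, name in elements:
            index[name.lower()].add(street)
    return index


locations = {"Example Street": hierarchy}
index = build_city_index(locations)

# The same street is now findable under every element of its hierarchy.
names_for_street = sorted(n for n, streets in index.items() if "Example Street" in streets)
```

Because the full hierarchy is available, the duplicate "Dublin" (city and county) collapses into a single search key while the roles remain known on the input side.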
Re: [mkgmap-dev] Address city country name assignment.
On Feb 16, 2011, at 15:20, Dermot McNally wrote:
> On 16 February 2011 13:28, Robert Vollmert rvollmert-li...@gmx.net wrote:
>> My suggestion would be to move region (country, city) detection into a preprocessing step, outside of mkgmap. That is, some other tool preprocesses and normalizes the OSM data and assigns consistent is_in tags. Then mkgmap looks only at the is_in tags.
>
> That was actually my first thought on how to improve things. But the problem is that you would potentially be doing this:
>
> * Obtain a hierarchy where the role of each element is known
> * Flatten this into an ordered list of elements, still respecting a hierarchy as best you can, but discarding the knowledge of roles
> * Bring this flattened version of the data into mkgmap and attempt to use the existing hints to extrapolate the knowledge of the roles that you only just threw away in the last step.

So instead of writing is_in tags, write the actual data as relations? One relation per street, one relation per city, etc., with all streets as members in the corresponding city relation?

Cheers
Robert
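The loss of role information in the flattening step quoted above can be shown with a small Python sketch. The role names, their ordering and the `flatten_to_is_in` helper are assumptions made for the example; the comma-separated form follows the is_in values discussed in this thread.

```python
# Hypothetical role ordering, smallest area first.
ROLE_ORDER = ("townland", "village", "suburb", "city", "county", "country")


def flatten_to_is_in(hierarchy: dict) -> str:
    """Flatten a role -> name mapping into an ordered, comma-separated is_in value."""
    return ", ".join(hierarchy[role] for role in ROLE_ORDER if role in hierarchy)


# Two different readings of the same place, with different roles.
as_suburb_of_city = {"suburb": "Blanchardstown", "city": "Dublin", "country": "Ireland"}
as_village_in_county = {"village": "Blanchardstown", "county": "Dublin", "country": "Ireland"}

flat_a = flatten_to_is_in(as_suburb_of_city)
flat_b = flatten_to_is_in(as_village_in_county)

# Both flatten to the same string, so a consumer of is_in cannot tell
# whether "Dublin" was meant as the city or the county.
roles_are_recoverable = flat_a != flat_b
```

Both hierarchies produce the same is_in string, so whatever reads that tag later has to guess the roles again.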
Re: [mkgmap-dev] Address city country name assignment.
On 16 February 2011 14:40, Robert Vollmert rvollmert-li...@gmx.net wrote:
> So instead of writing is_in tags, write the actual data as relations? One relation per street, one relation per city, etc., with all streets as members in the corresponding city relation?

I wasn't thinking of trying to force the address data into OSM format at all, but rather to store it away in a crude but persistent hash of a sort that mkgmap could learn to consult. Part of my reason for this is that I think the actual lookup of address data will be expensive, so you won't want to refresh these data as often as you will your planet extract. Or at least, you'll want to confine your lookups to places that either aren't in your cache because they are new, or for which you have reason to suspect that a refresh is needed. If the hash is keyed on a combination of OSM ID and version, a missing key could be the trigger for a fresh lookup from Nominatim or whatever service proves most useful.

Dermot
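A cache keyed on OSM ID and version, as proposed here, might look roughly like the following Python sketch. The `AddressCache` class, its JSON file format and the injected `lookup` callable are all assumptions; the lookup is a stand-in for whatever service (Nominatim or otherwise) would really be queried, and no network call is made.

```python
import json
from pathlib import Path


class AddressCache:
    """Persistent hash of address hierarchies keyed on OSM type, ID and version."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable(osm_type, osm_id) -> dict; hypothetical
        self._entries = {}

    @staticmethod
    def _key(osm_type: str, osm_id: int, version: int) -> str:
        return f"{osm_type}/{osm_id}/v{version}"

    def get(self, osm_type: str, osm_id: int, version: int) -> dict:
        """Return the cached hierarchy; do a fresh lookup only on a cache miss."""
        key = self._key(osm_type, osm_id, version)
        if key not in self._entries:
            self._entries[key] = self._lookup(osm_type, osm_id)
        return self._entries[key]

    def save(self, path) -> None:
        Path(path).write_text(json.dumps(self._entries, indent=2), encoding="utf-8")

    def load(self, path) -> None:
        self._entries = json.loads(Path(path).read_text(encoding="utf-8"))


calls = []


def fake_lookup(osm_type, osm_id):
    """Stand-in for an expensive geocoder query; records each call."""
    calls.append((osm_type, osm_id))
    return {"city": "Dublin"}


cache = AddressCache(fake_lookup)
cache.get("way", 1, 1)  # miss: triggers a lookup
cache.get("way", 1, 1)  # hit: served from the cache
cache.get("way", 1, 2)  # new version: triggers a fresh lookup
```

Here only two lookups are made for three requests. Entries for superseded versions stay in the table in this sketch; pruning them would be a separate design decision.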
Re: [mkgmap-dev] Address city country name assignment.
Hi,

I don't understand which tags are used to find the country (and other address data) for a street or city to use in address search.

In my build of Southern Africa, the streets around Nelspruit get assigned the country Botswana, although Nelspruit is in South Africa and the streets are close to a place=city node that contains is_in data: is_in = province, region, country, continent and also is_in:country = South Africa.

Similarly strange: the Johannesburg city node falls in Lesotho, while Johannesburg North, also a point and not 5 km from there, falls in Botswana. Both have is_in data that shows them in South Africa. And Gaborone, the capital of Botswana, falls in South Africa. All this despite the fact that these cities have is_in tags. All of these are also in the same tile, but I have since assumed that the country info is not uniform throughout a tile.

I understand the address search function is still shaky, but can you say where a point / way will get its country info from? Should mkgmap not first inspect the tags on the entity itself? That doesn't seem to happen, as cities get assigned new country names although they contain is_in info.

With the city nodes containing the wrong country name, I suppose any street near them gets assigned that incorrect country name.

Maybe I am doing something wrong, and not mkgmap. What is_in information does mkgmap expect for a city? Is only the nearest city considered for a street's address data, or will a town / village that is closer be taken?

Will it help to play with is_in tags in the style files, or is the index created before the style file changes are applied?

Is boundary information considered at all? I ask because the boundaries in my map are still faulty. A lot of boundaries are not rendered by mkgmap. In some cases rivers are used as boundaries as well, and other cases are simply pure mysteries.

I know nothing of code language, but would like to read through it anyway. Where can I see the code that assigns the address info to the streets / cities?

Regards
Bennie