Re: [mkgmap-dev] Address search and index.

2011-02-18 Thread Steve Hosgood
On 16/02/11 19:39, WanMil wrote:
 I think you assume that boundary information is stored as single
 polygon. 

That was true, but...

 This won't be accepted by the OSM community and is not how 
 boundaries are tagged at the moment. Larger structures are realized as 
 multipolygons. This makes sense because a national border is always a 
 border of a county and a city etc.

OK - I can accept that maybe that makes it a bit more difficult to
check, but my earlier comments about how things nest ought still to be
true, and OSM already has multipolygon solving algorithms.

   There is no substantial difference between coastline and 
 boundary processing.


Fair enough.
Steve

___
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev


Re: [mkgmap-dev] Address search and index.

2011-02-16 Thread Steve Hosgood
On 15/02/11 18:16, WanMil wrote:

 1st problem: Splitter (as you already mentioned)
 The tiles do not contain the full information for multipolygons that
 exceed the tile bounds
 It may be possible to pre-process the planet file to divide the world
 into (say) 1° by 1° squares and pre-tag each one with any outer
 bounding-polygon information which applies to the whole of that tile.
 When a splitter produces an output file it will know that level of
 information instantly...
 I like the idea of creating smaller tiles with consistent information. I 
 would not implement this in splitter. I have implemented the 
 multipolygon algorithms for mkgmap and that was hard work until it 
 reached the current quality. Handling boundary data is nothing else than 
 multipolygon handling. We don't need to reinvent the wheel and can use 
 mkgmap (at least the mkgmap codebase) to do that.

Excellent. After further thought I suspect that my idea of 1° x 1° tiles
would be too big - you'd want to choose a tile size so that several
would be used to create a typical user's request from the splitter.
 By the way: The coastline file processing has no such mechanism. It 
 simply loads all coastline data from one file and keeps that in memory. 
 That's not a good choice for large areas. This could be tuned.


Yeah - I'm surprised it doesn't do that already.


 I would propose a suite of sanity-checker programs that should be 
 periodically run over the planet
 file looking for broken polygons.
 Ok, that's not mkgmaps task. There are already some fine tools like the 
 WayCheck suite. Maybe they could be tuned to do that. In the end people 
 will set more value on valid boundary data if mkgmap makes use of it.


That's true - and remember it won't just be used by mkgmap. If (when?)
there is ever a mktomtommap project, they would need it too. So would
the slippymap gazetteer and any route-planner software.

 It should be easier to maintain than
 the coastline data because the political (or administrative??) bounding
 polygons should obey certain rules that could be checked automatically.
 Can you give an example? I don't see why it should be easier than 
 coastline checking.


1) A bounding polygon ought to form a closed way. Coastlines seem to be
composed of many open ways that happen to share end-points (more like
roads). That latter fact makes it harder to check a coastline since a
way flagged as coastline of France will at some point share
endpoints with a way flagged as coastline of Spain, and that's true
but not checkable automatically.

2) A bounding polygon's name ought to be unique within its own parent
bounding-polygon. If for some reason you really want two
bounding-polygons to be disjoint, but refer to the same logical place,
you'd put them both in a relation.

3) There should not be an intersection of two bounding polygons at the
same level. In other words, two towns can't intersect.

4) There are such things as cities that span multiple counties (like
London, England for instance). There needs to be rules to allow that,
maybe where the bounding-box for the majority of the county of Middlesex
is required to be entirely outside the bounding box for London, but a
second bounding-box (also flagged as 'county of Middlesex') is placed
inside the bounding-box for London, but a member of a relation with
the other Middlesex to prove that they are supposed to be the same. If
the 'relation' was omitted, the auto-checkers would barf because you'd
have two disjoint places called Middlesex inside the same parent
bounding box (England). (You might choose to relax that rule if the two
Middlesex bounding-boxes shared one or more periphery vectors, as they
would here.)

5) The political:layer tags should increase in value as the
bounding-boxes nest. In the unusual case of enclaves that I mentioned in
my earlier post, you allow the layer value to break that rule (within
the enclave the inherited knowledge of the parent bounding boxes is
forgotten), but there has to be a tag to say political:enclave=yes or
something to let it happen without the auto-checker barfing.

6) Er - there must be more!
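
A minimal sketch of how rules 1, 2 and 5 above might be checked automatically. Everything here is an assumption made for illustration: the Boundary record, its field names and the check_boundaries function are hypothetical and are not part of mkgmap or of any existing OSM checker. The relation exemption from rule 2 and the intersection test from rule 3 are left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Boundary:
    """Hypothetical record for one bounding polygon."""
    id: int
    name: str
    level: int                    # value of the proposed political:layer tag
    nodes: List[int]              # ordered node ids of the outline
    parent: Optional[int] = None  # id of the enclosing boundary, if any
    enclave: bool = False         # proposed political:enclave=yes


def check_boundaries(boundaries):
    """Return a list of human-readable rule violations."""
    problems = []
    by_id = {b.id: b for b in boundaries}

    # Rule 1: the outline must be a closed way (first node == last node).
    for b in boundaries:
        if len(b.nodes) < 4 or b.nodes[0] != b.nodes[-1]:
            problems.append(f"{b.name}: outline is not a closed way")

    # Rule 2: a name must be unique within its parent boundary.
    counts = Counter((b.parent, b.name) for b in boundaries)
    for (_parent, name), n in counts.items():
        if n > 1:
            problems.append(f"{name}: {n} boundaries share this name in one parent")

    # Rule 5: the level must increase with nesting, unless flagged as enclave.
    for b in boundaries:
        parent = by_id.get(b.parent)
        if parent and not b.enclave and b.level <= parent.level:
            problems.append(
                f"{b.name}: level {b.level} does not exceed parent level {parent.level}"
            )
    return problems
```

A checker of this shape could be run periodically over extracted boundary data, with the returned list fed back to mappers.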


 The outer boundary for a land-locked country will consist of a LOT of
 data, agreed. For island nations it will be vastly less because you can
 draw a crude polygon off-shore to encompass all the land area. And you'd
 have to do that, because you want to include all the minor off-shore
 islands. You want an extreme example, look at Greece! But the outer
 polygon for Greece might not be all that detailed yet still do the job
 correctly - except for the northern land-border of course which will be
 crazy.
 Sounds easy although I have no idea how to put that into a good 
 algorithm that does not exhaust common memory and processor configurations.
 Example:
 * How do you want to detect island nations?


Sorry - I didn't make myself clear. No-one has to detect island nations.
Some mapper draws the polygons! But the mapper will have an easy time

Re: [mkgmap-dev] Address search and index.

2011-02-16 Thread Colin Smale

 4) There are such things as cities that span multiple counties (like
 London, England for instance). There needs to be rules to allow that,
 maybe where the bounding-box for the majority of the county of Middlesex
 is required to be entirely outside the bounding box for London, but a
 second bounding-box (also flagged as 'county of Middlesex') is placed
 inside the bounding-box for London, but a member of a relation with
 the other Middlesex to prove that they are supposed to be the same. If
 the 'relation' was omitted, the auto-checkers would barf because you'd
 have two disjoint places called Middlesex inside the same parent
 bounding box (England). (You might choose to relax that rule if the two
 Middlesex bounding-boxes shared one or more periphery vectors, as they
 would here.)

Middlesex is not a county any more, it's a historic county which doesn't 
have a place in any administrative hierarchy. Same with Berkshire and 
Avon, for example. A polygon to represent Middlesex doesn't need to bear 
any relationship to the boundary of Greater London, various London 
Boroughs, Hertfordshire etc.





Re: [mkgmap-dev] Address search and index.

2011-02-16 Thread WanMil
 It should be easier to maintain than
 the coastline data because the political (or administrative??) bounding
 polygons should obey certain rules that could be checked automatically.
 Can you give an example? I don't see why it should be easier than
 coastline checking.


 1) A bounding polygon ought to form a closed way. Coastlines seem to be
 composed of many open ways that happen to share end-points (more like
 roads). That latter fact makes it harder to check a coastline since a
 way flagged as coastline of France will at some point share
 endpoints with a way flagged as coastline of Spain, and that's true
 but not checkable automatically.

 2) A bounding polygon's name ought to be unique within its own parent
 bounding-polygon. If for some reason you really want two
 bounding-polygons to be disjoint, but refer to the same logical place,
 you'd put them both in a relation.

 3) There should not be an intersection of two bounding polygons at the
 same level. In other words, two towns can't intersect.

 4) There are such things as cities that span multiple counties (like
 London, England for instance). There needs to be rules to allow that,
 maybe where the bounding-box for the majority of the county of Middlesex
 is required to be entirely outside the bounding box for London, but a
 second bounding-box (also flagged as 'county of Middlesex') is placed
 inside the bounding-box for London, but a member of a relation with
 the other Middlesex to prove that they are supposed to be the same. If
 the 'relation' was omitted, the auto-checkers would barf because you'd
 have two disjoint places called Middlesex inside the same parent
 bounding box (England). (You might choose to relax that rule if the two
 Middlesex bounding-boxes shared one or more periphery vectors, as they
 would here.)

 5) The political:layer tags should increase in value as the
 bounding-boxes nest. In the unusual case of enclaves that I mentioned in
 my earlier post, you allow the layer value to break that rule (within
 the enclave the inherited knowledge of the parent bounding boxes is
 forgotten), but there has to be a tag to say political:enclave=yes or
 something to let it happen without the auto-checker barfing.

 6) Er - there must be more!

I don't understand, sorry.
I think you assume that boundary information is stored as a single 
polygon. This won't be accepted by the OSM community and is not how 
boundaries are tagged at the moment. Larger structures are realized as 
multipolygons. This makes sense because a national border is always a 
border of a county and a city etc. And editors like JOSM don't need to 
download the complete border of two countries just because you want to 
edit a small street crossing the border.
Please have a look at the 
http://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative and 
http://wiki.openstreetmap.org/wiki/Relation:boundary pages. Most of your 
ideas are already described there and can be found in the OSM data.

As a result you have a bunch of lines and some polygons both for 
coastlines and for boundaries. You need to connect the open lines 
(boundaries use the multipolygon information to do this). In the end you 
have incomplete data where you have to close some polygons 
automatically. There is no substantial difference between coastline and 
boundary processing.
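
The "connect the open lines" step can be sketched as follows. This is a simplified greedy join written only for illustration, not the multipolygon code in mkgmap; the function name join_ways and the representation of a way as a list of node ids are assumptions.

```python
def join_ways(ways):
    """Greedily join ways (lists of node ids) that share end-points.

    Returns (closed_rings, open_chains). Chains that cannot be closed are
    the incomplete data that would need automatic closing or a manual fix.
    """
    pending = [list(w) for w in ways]
    rings, open_chains = [], []
    while pending:
        chain = pending.pop()
        extended = True
        while chain[0] != chain[-1] and extended:
            extended = False
            for i, w in enumerate(pending):
                if w[0] == chain[-1]:        # w continues the chain
                    chain = chain + w[1:]
                elif w[-1] == chain[-1]:     # w continues it, reversed
                    chain = chain + w[-2::-1]
                elif w[-1] == chain[0]:      # w leads into the chain
                    chain = w[:-1] + chain
                elif w[0] == chain[0]:       # w leads into it, reversed
                    chain = w[:0:-1] + chain
                else:
                    continue
                del pending[i]
                extended = True
                break
        if chain[0] == chain[-1]:
            rings.append(chain)
        else:
            open_chains.append(chain)
    return rings, open_chains
```

The same routine works whether the input ways are tagged as coastline or as boundary members, which is the point made above.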

WanMil


Re: [mkgmap-dev] Address search and index.

2011-02-15 Thread Steve Hosgood
On 14/02/11 18:09, WanMil wrote:
 On 14.02.2011 13:20, Steve Hosgood wrote:
 On 14/02/11 10:21, Steve Ratcliffe wrote:
 If there are faults in the data, they should be fixed.

 It's really an issue to be debated at OSM level, not mkgmap level, but I
 have considered for quite a while now that the is_in tag should be
 entirely deprecated in favour of a concept of boundary polygons. Is_in
 is fraught with problems, some illustrated above.
 Steve, that was my first idea and I think that's the only solution to 
 the problem. But it's not that easy to realize.
I never said it would be easy! But in the long run, it would solve many
problems. Mkgmap already illustrates the problems inherent in the
existing system. But any navigator software using OSM data has the same
problem - just look at the search tab of the main OSM slippy-map, or any
of the web-hosted route-planners. Their street searching systems are
better than what we have currently on Garmin (even with Steve
Ratcliffe's latest and best efforts), but even so they are a bit dodgy.

 I propose two solutions.
 1. Quick fix
 Use the Locator.xml to merge different notations of the same country / 
 region name. This won't be perfect but will probably fix the most 
 obvious problems for a first release of the index branch.


This will probably have to be human-driven to have any chance of
success. But yes, it could be done. It would get rid of the
inconsistencies, but the problem of working out to which village a given
street belongs, when there are two villages nearly touching - that will
still remain.

 2. OSM boundary data (the general solution)
 It sounds great to use the OSM boundary data but there are some pitfalls 
 we need to go around. I'll list the pitfalls here. Maybe someone finds 
 an easy solution for them.

 1st problem: Splitter (as you already mentioned)
 The tiles do not contain the full information for multipolygons that 
 exceed the tile bounds. I don't think that this will be easily fixable. 
 You would need to implement complete multipolygon handling in splitter 
 to decide which data must additionally be added to a single tile. That's a 
 big deal and will consume lots of resources.
It may be possible to pre-process the planet file to divide the world
into (say) 1° by 1° squares and pre-tag each one with any outer
bounding-polygon information which applies to the whole of that tile.
When a splitter produces an output file it will know that level of
information instantly and will only have to work hard to shrink any
bounding-polygon whose border actually crosses the area of interest.

But we must be doing this (or something like this) already for
coastlines, yes?
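
A rough sketch of the pre-tagging idea: classify each grid cell as wholly inside, wholly outside, or crossed by a boundary polygon, so that only crossed cells need real clipping work. This is an illustration under stated assumptions, not splitter code. It treats longitude and latitude as planar coordinates, ignores the antimeridian, and ignores degenerate cases where the boundary only touches a cell corner or edge. All function names are hypothetical.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def _ccw(a, b, c):
    """Twice the signed area of triangle abc."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p, q, r, s):
    """True if segments pq and rs properly cross (touching is ignored)."""
    return (_ccw(p, q, r) * _ccw(p, q, s) < 0
            and _ccw(r, s, p) * _ccw(r, s, q) < 0)


def classify_cell(lon, lat, poly, size=1.0):
    """Return 'inside', 'outside' or 'crossing' for one grid cell."""
    corners = [(lon, lat), (lon + size, lat),
               (lon + size, lat + size), (lon, lat + size)]
    cell_edges = list(zip(corners, corners[1:] + corners[:1]))
    poly_edges = list(zip(poly, poly[1:] + poly[:1]))
    # A polygon vertex strictly inside the cell means the border passes through.
    if any(lon < x < lon + size and lat < y < lat + size for x, y in poly):
        return "crossing"
    if any(segments_cross(a, b, c, d)
           for a, b in cell_edges for c, d in poly_edges):
        return "crossing"
    # Border does not enter the cell, so the centre decides for the whole cell.
    centre_inside = point_in_polygon(lon + size / 2, lat + size / 2, poly)
    return "inside" if centre_inside else "outside"
```

Cells classified as "inside" could carry the polygon's tags directly; only "crossing" cells would need the border geometry.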
 2nd problem: Incomplete data
 The boundary data has a similar structure to the coastline data. The 
 coastline processing is working now with mkgmap but the failure rate is 
 quite high. Only a single OSM data failure can cause the complete 
 workflow to fail.

Yeah - this is indeed an issue. But any map is only as good as its data.
If the data is wrong it must be fixed. I would propose a suite of
sanity-checker programs that should be periodically run over the planet
file looking for broken polygons. It should be easier to maintain than
the coastline data because the political (or administrative??) bounding
polygons should obey certain rules that could be checked automatically.

 3rd problem: Amount of data
 A solution for pitfall 1 (and 2) could be to provide quality checked 
 extra data containing boundary information only. This is already 
 available for the generate-sea processing. You can provide the coastline 
 data in a separate file. But the amount of data will be VERY high. I 
 don't think that it is a good thing to have minimum memory requirements 
 of some GB.
 So in the end we would need to throw away the tile concept and implement 
 a database interface for mkgmap.
 Maybe that's the solution?

The outer boundary for a land-locked country will consist of a LOT of
data, agreed. For island nations it will be vastly less because you can
draw a crude polygon off-shore to encompass all the land area. And you'd
have to do that, because you want to include all the minor off-shore
islands. You want an extreme example, look at Greece! But the outer
polygon for Greece might not be all that detailed yet still do the job
correctly - except for the northern land-border of course which will be
crazy.

Steve



Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Henning Scholland
Hi

On 13.02.2011 23:52, Steve Ratcliffe wrote:
 There is a big problem with address search with poor and inconsistent map 
 data.  With my map of England I get the choice of four different variations 
 on the name of the country to choose from. One must be chosen and doing so 
 means that you can only find streets that have that particular country name.

 Now there is no way for me to know since after all everything is really in 
 the same country.

 So to make address search really useful the country names have to be cleaned 
 up, (other countries may be in a better state).

 Does anyone have any good ideas about this?

 ..Steve

Do you think it's an OSM data problem? Then it would be very helpful 
to explain which tags are involved and cause the faults. Is it is_in, 
addr:country, or which tags does mkgmap use for this?

If there are faults in the data, they should be fixed.



Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Lambertus
On 2011-02-13 23:52, Steve Ratcliffe wrote:
 There is a big problem with address search with poor and inconsistent map 
 data.  With my map of England I get the choice of four different variations 
 on the name of the country to choose from. One must be chosen and doing so 
 means that you can only find streets that have that particular country name.

 Now there is no way for me to know since after all everything is really in 
 the same country.

 So to make address search really useful the country names have to be cleaned 
 up, (other countries may be in a better state).

 Does anyone have any good ideas about this?

 ..Steve

Don't know if it's a *good idea*, but, anyway..

Define a list of countries for Mkgmap and ignore all other countries. 
That will stimulate users to fix the data because they can't find a place.


Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Du Plessis, Bennie
What about enabling the template.args file or options to manually
override a tile's country-name? 
I get another problem that is slightly related: In a tile that is 90 %
country A, but contains the capital of country B, the country name of
the tile becomes Country B in the address search. If I can specify the
country name of a tile I can manually control that. I tried
--country-name=Country A in the options, but it doesn't seem to have an
effect. Is it not supposed to override the country name for the map, or
does it fall away now with address search?

BTW congratulations on the address search breakthrough. 
I am so excited I cannot sleep!

-Original Message-
From: Steve Ratcliffe [mailto:st...@parabola.me.uk] 
Sent: 14 February 2011 12:22
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] Address search and index.


 Do you think, it's an osm-data-problem? Then it would be very helpful,
 to explain, which tags are involved and causes the faults. Is it is_in,
 addr:country or which tags did mkgmap use therefor?

 If there are faults in the data, they should be fixed.

Well I get England, Great Britain, Great Britian, and United Kingdom.

One is spelling mistake and so, fair enough, should be fixed and there
is not going to be any argument from anyone about that.

But the others are not wrong and might be fine in other situations.

A few examples from England:

k='is_in' v='Nantwich, Cheshire, England, United Kingdom'
k='is_in' v='UK, England, County Durham, Teesdale'
k='is_in' v='England, Essex'

So there are different tagging styles and conventions, I don't think
we can change how the mappers map apart from fixing clear errors.

There is code in mkgmap, written by Bernhard Heibler, that attempts to
make sense of these differences and I guess the first thing we need to
do is get that configured as well as possible.

Looking at LocatorConfig.xml it appears that I can specify all the
variants of a country name and it will change them to the main name.
I'm just about to try this out.
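
To illustrate the idea of folding variants into one main name, here is a minimal sketch. It does not use the LocatorConfig.xml format or any mkgmap code; the ALIASES table, its contents (including the choice to fold England into United Kingdom) and the function name are assumptions made up for this example.

```python
# Hypothetical variant -> main name table; contents are illustrative only.
ALIASES = {
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "great britain": "United Kingdom",
    "great britian": "United Kingdom",   # the misspelling seen in the data
    "england": "United Kingdom",
}


def country_from_is_in(value):
    """Return the main country name found in an is_in value, or None.

    The components are scanned in order because mappers do not agree on
    whether the country comes first or last.
    """
    for part in value.split(","):
        main_name = ALIASES.get(part.strip().lower())
        if main_name:
            return main_name
    return None
```

With the three is_in examples above, every value resolves to the same main name regardless of the order of its components.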

..Steve








Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Steve Hosgood
On 14/02/11 10:21, Steve Ratcliffe wrote:

 If there are faults in the data, they should be fixed.
 Well I get England, Great Britain, Great Britian, and United Kingdom.

 One is spelling mistake and so, fair enough, should be fixed and there
 is not going to be any argument from anyone about that.

 But the others are not wrong and might be fine in other situations.

 A few examples from England:

   k='is_in' v='Nantwich, Cheshire, England, United Kingdom'
   k='is_in' v='UK, England, County Durham, Teesdale'
   k='is_in' v='England, Essex'


It's really an issue to be debated at OSM level, not mkgmap level, but I
have considered for quite a while now that the is_in tag should be
entirely deprecated in favour of a concept of boundary polygons. Is_in
is fraught with problems, some illustrated above.

The country of England should have a polygon around its border with the
tag country=England and therefore could also be tagged with
country:cy=Lloegr (and other language specifics). Any item within that
polygon would automatically be considered is_in=England (and Lloegr to
allow for searching in Welsh). Within the England polygon would be a set
of non-intersecting polygons each tagged county=wherever and again
some of them might have foreign language variants - which would be fine.
Any item within those polygons would automatically be considered
is_in=Cumbria or wherever (plus foreign-language variants of county
names where they exist). The nesting would continue right down to hamlets.

Notice how quickly you could fix a mistake - no need to trawl through
millions of is_in tags looking for inconsistencies, spelling mistakes
etc. Just fix the correct enclosing polygon.
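
A small sketch of how the enclosing polygons could replace per-item is_in tags. The coordinates are toy values, the area records are invented for the example, and the function names are hypothetical; a real implementation would need an index rather than a scan over every polygon.

```python
def contains(outline, x, y):
    """Ray-casting point-in-polygon test; outline is a list of (x, y)."""
    inside = False
    n = len(outline)
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def derived_is_in(x, y, areas):
    """Collect every name (all language variants) of each enclosing polygon."""
    names = set()
    for area in areas:
        if contains(area["outline"], x, y):
            names.update(area["tags"].values())
    return names


# Toy data: an "England" square with a "Cumbria" square nested inside it.
AREAS = [
    {"tags": {"country": "England", "country:cy": "Lloegr"},
     "outline": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"tags": {"county": "Cumbria"},
     "outline": [(1, 6), (4, 6), (4, 9), (1, 9)]},
]
```

Correcting a spelling then means editing one entry in the polygon's tags; every item inside picks up the fix on the next lookup.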

There is a war memorial in England (not far from London) which is
officially in the USA! I think this is called an enclave... but even
an enclave situation could be handled by polygons. Why should there not
be a polygon around that site claiming country=United States??

The interesting question from a polygon-parsing point of view is whether
you'd need to establish a fixed hierarchy of tags each with levels
assumed so that if you encountered country= within a county=
polygon, that the county= would be forgotten about within the inner
country= polygon. This would make sense - if that war memorial was
in the USA, it can hardly also be in Middlesex (or wherever) which
is a county of England. But see below for an alternative system...

Back to England again. If England is a country, what is the UK
deemed to be? In some ways the country should be UK, and England,
Wales, Scotland and Northern Ireland are states. But that's not
how any resident of the UK would see it. It's just down to semantics.

You could get rid of this "what's it called" argument by requiring the
outer polygon to be tagged political:level=1
political:designation=country name=United Kingdom alt-name=UK.
Further in you'd get a polygon tagged political:level=5
political:designation=county name=Kent and eventually (inside that)
maybe a polygon tagged political:level=6 political:designation=town
name=Canterbury. This removes the need for a fixed known hierarchy of
polygons (mentioned above when discussing the handling of enclaves) - if
you encountered a polygon with a given political:level= it would,
within its bounds,  cancel any supposed enclosing polygon with an equal
or higher numerical political:level= tag.

It also allows an easy way to handle the cases where some cities in the
UK are considered to be counties in their own right (Swansea for
instance - it is inside a polygon of political:level=5
political:designation=county name=South Glamorganshire but the city
itself could be polygonned as political:level=5
political:designation=city and county name=Swansea name:cy=Abertawe).
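
The cancelling rule for political:level can be sketched as a simple stack, assuming the enclosing polygons of a point are already known and listed outermost first. The function name and the (level, name) pair representation are assumptions for this example; the political:level tag itself is only the proposal made in this message, not an existing OSM tag.

```python
def effective_hierarchy(enclosing):
    """Apply the proposed rule to (level, name) pairs, outermost first.

    Each polygon cancels, within its bounds, any enclosing polygon whose
    political:level is numerically equal or higher.
    """
    stack = []
    for level, name in enclosing:
        # Drop enclosing entries that this polygon cancels.
        stack = [(lv, nm) for lv, nm in stack if lv < level]
        stack.append((level, name))
    return [nm for _, nm in stack]
```

For the war memorial, an inner level=1 polygon cancels both the county and the outer country. For Swansea, the inner level=5 polygon replaces the enclosing county but keeps the country.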

In order to implement anything like this, we'd need a way to work out
quickly for any point on a map what the enclosing hierarchy of polygons
would be. That in turn probably means having to implement knowledge of
it in the splitter so that a small part of the planetary map still has
intact (but truncated) nested polygons.

PS: I guess Great Britain would be handled by a polygon around just
the right-hand island of the British Isles (and its sub-islands)
claiming geographical:level=1 name=Great Britain name:de=Groß
Britannien name:fr=Grande Bretagne etc.

Mkgmap would probably not be interested in geographical:* tags.




Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Minko
Good to hear that the mystery about the address search is revealed. :-)

I made a test map with mkgmap-index-r1850.jar of 9 tiles, mainly in NL
and one tile partly in Germany (area of Emmerich am Rhein).
Sent it successfully to the GPS (Dakota and Nuvi).

Here are a few test results:

Used country-name=Nederland country-abbr=NL. 

Address search on the GPS works!

Address search shows two options: Spell country and Nederland

Spell country shows: Deutschland and Nederland

Searching for a place name with E in Deutschland shows a few place names, 
starting with E, like Ellecom, DEU but this place is not in Germany at all.
http://www.openstreetmap.org/browse/node/44948760
If I search for Ellecom in the Netherlands: not found.

Searching for Emmerich am Rhein in Deutschland: not found.
Searching for Emmerich am Rhein in Nederland: found

Searching for a big place like Amersfoort, which is clearly on the map: not found

Searching for a streetname: found, but often located in the wrong place.

Cheers,
Minko


Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Steve Ratcliffe

Hello


 Searching for a streetname: found, but often located in the wrong place.

It may or may not be the problem, but the fix that allowed tiles to be
uploaded to the device will have also in some cases had the effect of
making some streets appear in the wrong places.

In fact the fix was previously present and I removed it just for this
reason.  I will have to get both working together.

Once I have fixed this problem, if there are still things that cannot be 
found we will have to find a way of investigating whether they made it
into the index and, if not, why they don't work.

But at the moment it is not 100% correct, even by what I know.

..Steve


Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread Minko
Hi Steve, 
The bug where some street names were placed under the wrong place names
already existed with the option road-name-pois (I didn't use this option
for my last test). Sometimes it even happens when the node with the correct
place name is located closer than the wrong place name that is connected to
this street. Other streets close by can be connected to the correct place name.


See also
http://www.mail-archive.com/mkgmap-dev@lists.mkgmap.org.uk/msg06206.html


- Original message -
Steve wrote:

 Searching for a streetname: found, but often located in the wrong place.

It may or may not be the problem, but the fix that allowed tiles to be
uploaded to the device will have also in some cases had the effect of
making some streets appear in the wrong places.


Re: [mkgmap-dev] Address search and index.

2011-02-14 Thread WanMil
On 14.02.2011 13:20, Steve Hosgood wrote:
 On 14/02/11 10:21, Steve Ratcliffe wrote:

 If there are faults in the data, they should be fixed.
 Well I get England, Great Britain, Great Britian, and United Kingdom.

 One is spelling mistake and so, fair enough, should be fixed and there
 is not going to be any argument from anyone about that.

 But the others are not wrong and might be fine in other situations.

 A few examples from England:

  k='is_in' v='Nantwich, Cheshire, England, United Kingdom'
  k='is_in' v='UK, England, County Durham, Teesdale'
  k='is_in' v='England, Essex'


 It's really an issue to be debated at OSM level, not mkgmap level, but I
 have considered for quite a while now that the is_in tag should be
 entirely deprecated in favour of a concept of boundary polygons. Is_in
 is fraught with problems, some illustrated above.

 The country of England should have a polygon around its border with the
 tag country=England and therefore could also be tagged with
 country:cy=Lloegr (and other language specifics). Any item within that
 polygon would automatically be considered is_in=England (and Lloegr to
 allow for searching in Welsh). Within the England polygon would be a set
 of non-intersecting polygons each tagged county=wherever and again
 some of them might have foreign language variants - which would be fine.
 Any item within those polygons would automatically be considered
 is_in=Cumbria or wherever (plus foreign-language variants of county
 names where they exist). The nesting would continue right down to hamlets.

 Notice how quickly you could fix a mistake - no need to trawl through
 millions of is_in tags looking for inconsistencies, spelling mistakes
 etc. Just fix the correct enclosing polygon.

 There is a war memorial in England (not far from London) which is
 officially in the USA! I think this is called an enclave... but even
 an enclave situation could be handled by polygons. Why should there not
 be a polygon around that site claiming country=United States??

 The interesting question from a polygon-parsing point of view is whether
 you'd need to establish a fixed hierarchy of tags each with levels
 assumed so that if you encountered country= within a county=
 polygon, that the county= would be forgotten about within the inner
 country= polygon. This would make sense - if that war memorial was
 in the USA, it can hardly also be in Middlesex (or wherever) which
 is a county of England. But see below for an alternative system...

 Back to England again. If England is a country, what is the UK
 deemed to be? In some ways the country should be UK, and England,
 Wales, Scotland and Northern Ireland are states. But that's not
 how any resident of the UK would see it. It's just down to semantics.

 You could get rid of this "what's it called" argument by requiring the
 outer polygon to be tagged political:level=1
 political:designation=country name=United Kingdom alt-name=UK.
 Further in you'd get a polygon tagged political:level=5
 political:designation=county name=Kent and eventually (inside that)
 maybe a polygon tagged political:level=6 political:designation=town
 name=Canterbury. This removes the need for a fixed known hierarchy of
 polygons (mentioned above when discussing the handling of enclaves) - if
 you encountered a polygon with a given political:level= it would,
 within its bounds,  cancel any supposed enclosing polygon with an equal
 or higher numerical political:level= tag.

 It also allows an easy way to handle the cases where some cities in the
 UK are considered to be counties in their own right (Swansea for
 instance - it is inside a polygon of political:level=5
 political:designation=county name=South Glamorganshire but the city
 itself could be polygonned as political:level=5
 political:designation=city and county name=Swansea name:cy=Abertawe).

 In order to implement anything like this, we'd need a way to work out
 quickly for any point on a map what the enclosing hierarchy of polygons
 would be. That in turn probably means having to implement knowledge of
 it in the splitter so that a small part of the planetary map still has
 intact (but truncated) nested polygons.

 PS: I guess Great Britain would be handled by a polygon around just
 the right-hand island of the British Isles (and its sub-islands)
 claiming geographical:level=1 name=Great Britain name:de=Groß
 Britannien name:fr=Grande Bretagne etc.

 Mkgmap would probably not be interested in geographical:* tags.


Steve, that was my first idea and I think that's the only solution to 
the problem. But it's not that easy to realize.

I propose two solutions.
1. Quick fix
Use the Locator.xml to merge different notations of the same country / 
region name. This won't be perfect but will probably fix the most 
obvious problems for a first release of the index branch.

For this purpose we need a good source to initially create a full 
release of the Locator.xml. For this we could use osmosis filtering a 

Re: [mkgmap-dev] Address search and index.

2011-02-13 Thread Clinton Gladstone
On Feb 13, 2011, at 23:52, Steve Ratcliffe wrote:
 
 So to make address search really useful the country names have to be cleaned 
 up, (other countries may be in a better state).
 
 Does anyone have any good ideas about this?

I don't have any good ideas yet, but I can report that Germany is in a similar 
state. I get the following countries listed:

Berlin
Bundesrepubblik Deutschland
Bundesrepublic Deutschland
Deutschland

It's going to be hard keeping this under control.


