Hi Patrick,

The TIGER data is full of wrong intervals. Sometimes even numbers are combined 
with addr:interpolation=odd, and sometimes two intervals produce duplicate numbers,
e.g. when one goes from 1500..1599 and another from 1500..1740.
Those wrong intervals caused a lot of loops and also the error. With r4260 the 
performance should be better, but I have not yet worked on better detection of 
wrong data.
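
To make the two kinds of wrong interval concrete, here is a minimal Python sketch. It is not the check mkgmap performs. The function names and the (start, end, interpolation) tuple layout are assumptions for illustration, and it assumes all intervals belong to the same street.

```python
from itertools import combinations


def check_parity(start, end, interpolation):
    """Return True if both end numbers fit the declared odd/even scheme."""
    if interpolation == "odd":
        return start % 2 == 1 and end % 2 == 1
    if interpolation == "even":
        return start % 2 == 0 and end % 2 == 0
    return True  # "all" and other values: no parity constraint


def expand(start, end, interpolation):
    """Return the set of house numbers an interval would generate."""
    step = 2 if interpolation in ("odd", "even") else 1
    lo, hi = sorted((start, end))
    return set(range(lo, hi + 1, step))


def duplicate_numbers(intervals):
    """Yield (i, j, shared) for each pair of intervals with common numbers."""
    for (i, a), (j, b) in combinations(enumerate(intervals), 2):
        shared = expand(*a) & expand(*b)
        if shared:
            yield i, j, shared


# Hypothetical intervals modelled on the ranges mentioned above.
intervals = [(1500, 1599, "all"), (1500, 1740, "all")]
for i, j, shared in duplicate_numbers(intervals):
    print(f"intervals {i} and {j} share {len(shared)} numbers")
print(check_parity(1500, 1599, "odd"))
```

Expanding every interval into a set is only practical for small inputs. For a data set the size of TIGER, comparing the interval bounds directly would be the more sensible approach.
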
If you want to get an impression, I suggest enabling logging with
uk.me.parabola.mkgmap.osmstyle.housenumber.level=INFO
for a single input file like the one in this thread.
See https://wiki.openstreetmap.org/wiki/Mkgmap/dev#Enabling_Debugging for 
details.

You will find messages like these in the log:
INFO: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator  
f:\dwnload\temp\test.osm: conflict caused by addr:interpolation way 107th 
Street West http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 
and address element 40298(13) at 34.614524,-118.319053
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator  
f:\dwnload\temp\test.osm: addr:interpolation way 107th Street West 
http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 is ignored, 
it produces 1 duplicate number(s) too far from existing nodes
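
If the log gets long, a small script can list the interpolation ways that were ignored. This is a rough Python sketch. The regular expression is derived only from the two sample messages above, so the exact message layout is an assumption, and the names PATTERN, SAMPLE and ignored_ways are made up for this example.

```python
import re

# Assumed layout, taken from the sample WARN message above:
# "addr:interpolation way <street> <url> <start>..<end>, step=<n> is ignored"
PATTERN = re.compile(
    r"addr:interpolation way (?P<street>[^:]+?) "
    r"(?P<url>https?://\S+/way/-?\d+) "
    r"(?P<start>\d+)\.\.(?P<end>\d+), step=(?P<step>\d+) is ignored"
)

SAMPLE = r"""
INFO: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator
f:\dwnload\temp\test.osm: conflict caused by addr:interpolation way 107th
Street West http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2
and address element 40298(13) at 34.614524,-118.319053
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator
f:\dwnload\temp\test.osm: addr:interpolation way 107th Street West
http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 is ignored,
it produces 1 duplicate number(s) too far from existing nodes
"""


def ignored_ways(log_text):
    """Return one dict per 'is ignored' message found in the log text."""
    # Collapse all whitespace so messages wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in PATTERN.finditer(flat)]


for way in ignored_ways(SAMPLE):
    print(way["street"], way["start"], way["end"], way["url"])
```

The street group excludes colons so that a match cannot run across the URL of an earlier message. A street name that itself contains a colon would be missed.
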


Gerd

________________________________________
From: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> on behalf of Patrick 
Simmons <linuxrocks...@netscape.net>
Sent: Friday, 28 December 2018 07:49
To: mkgmap-dev@lists.mkgmap.org.uk
Subject: Re: [mkgmap-dev] mkgmap crashing with non-OSM data


Gerd,

Thanks for getting back to me.  And, btw, I'm very sorry about the 
quadruple-post.  I wrote and use my own email client and it crashed upon 
sending my message to the list, and for some reason I'm not getting copies of 
my own posts, so I thought it hadn't gone through.  Then I checked the archive 
and ... oops.

Re 1: TIGER is a product of the US federal government, so it is public domain: 
no license is needed to use it in any way for any purpose.

Re 2: I'd be interested to know what happens in places where the TIGER data 
conflicts with OSM.  I agree it would suck to erase the contributions of OSM 
mappers and would like to avoid that if at all possible.  Ideally in the case 
of a conflict you'd just get two items in the search results very close to each 
other and could pick the better one.  We can check what's happening pretty 
easily if you know of a place in the US where OSM has a street address that 
mkgmap-created maps normally index and that differs from what's in TIGER: just 
send me the address, and I'll search for it on my device loaded with my shiny 
new maps and see what I get.

Re performance issue: for the whole US, it was taking about 48 hours using 3 
threads on an i5-4460S, and about 3.33GB of RAM per thread.  I had to limit the 
number of threads used to three instead of four so that it wouldn't overflow 
the Java heap with -Xmx10000M, which was all the memory I had.  The first time 
I tried to make the maps (about 2 weeks ago now), I did some rudimentary 
profiling to make sure it wasn't infinite looping, and I seem to recall the 
place where it was taking a long time was in ExtNumbers.java in the for loop on 
lines 1135-1146.

My guess would be the problem would more likely be due to the added volume of 
data than the mixture of the data.  My script should be generating XML for 
parallel street address ways that is similar to how street numbers might exist 
in normal OSM data, but it is generating 50GB uncompressed of them.  You can 
download http://moongate.ydns.eu/tiger_versus_python/tiger_all.osc.xz if you'd 
like to take a look at it, but please wait about 3 hours after I send this 
email since my computer is currently generating and uploading an updated 
version of that file.
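
For readers who have not seen address interpolation ways, here is a minimal Python sketch of how one could be written as OSM XML. It is not the script that produced tiger_all.osc.xz. The function name, its parameters, and the choice to tag only the two end nodes are assumptions, and negative ids are used as the usual convention for data that does not come from the OSM database.

```python
import xml.etree.ElementTree as ET


def interpolation_way(way_id, first_node_id, points, numbers, street, scheme):
    """Build the node and way elements for one addr:interpolation way.

    points  -- list of (lat, lon) tuples along the way, at least two
    numbers -- (first, last) house numbers for the two end nodes
    scheme  -- value for addr:interpolation, e.g. "odd", "even" or "all"
    """
    nodes = []
    for offset, (lat, lon) in enumerate(points):
        # Count downwards so every node gets a distinct negative id.
        node = ET.Element(
            "node",
            id=str(first_node_id - offset),
            lat=f"{lat:.6f}",
            lon=f"{lon:.6f}",
            version="1",
        )
        nodes.append(node)

    # Only the end nodes carry an address in this sketch.
    for node, number in ((nodes[0], numbers[0]), (nodes[-1], numbers[1])):
        ET.SubElement(node, "tag", k="addr:housenumber", v=str(number))
        ET.SubElement(node, "tag", k="addr:street", v=street)

    way = ET.Element("way", id=str(way_id), version="1")
    for node in nodes:
        ET.SubElement(way, "nd", ref=node.get("id"))
    ET.SubElement(way, "tag", k="addr:interpolation", v=scheme)
    return nodes + [way]


# Hypothetical coordinates and ids, for illustration only.
elements = interpolation_way(
    way_id=-1,
    first_node_id=-100,
    points=[(34.614500, -118.319000), (34.615000, -118.319000)],
    numbers=(40274, 40382),
    street="107th Street West",
    scheme="even",
)
for element in elements:
    print(ET.tostring(element, encoding="unicode"))
```

A real file would wrap these elements in an osm or osmChange root element. That part is left out here.
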

--Patrick

On Thu, 27 Dec 2018 22:26:19 -0700 (MST), Gerd Petermann 
<gpetermann_muenc...@hotmail.com> wrote:
> Hi Patrick,
>
> thanks for reporting, I can reproduce the problem and I'll try to fix it.
> Two remarks:
> 1) Please make sure that the TIGER licence allows this mixing of data
> 2) Please note that TIGER data is not really a good source for addresses, and
> the mixture of OSM data and TIGER data is likely to decrease the quality in
> those places where they differ
>
> The data shows a performance problem in mkgmap (probably caused by this
> mixture), it takes very long to calculate the address data.
>
> Gerd
>
>
>
> --
> Sent from: http://gis.19327.n8.nabble.com/Mkgmap-Development-f5324443.html
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev@lists.mkgmap.org.uk
> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
--
MailTask: The Email Manager
https://github.com/linuxrocks123/MailTask
GPLv3 software, beta maturity
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev