Hi Patrick,

the TIGER data is full of wrong intervals. Sometimes even numbers are combined with addr:interpolation=odd, and sometimes two intervals produce duplicate numbers, e.g. when one goes from 1500..1599 and another from 1500..1740. Those wrong intervals caused a lot of loops and also the error. With r4260 the performance should be better, but I have not yet worked on better detection of wrong data. If you want to get an impression, I suggest enabling logging with uk.me.parabola.mkgmap.osmstyle.housenumber.level=INFO for a single input file like the one in this thread. See https://wiki.openstreetmap.org/wiki/Mkgmap/dev#Enabling_Debugging for details.
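To make the two kinds of wrong intervals mentioned above concrete, here is a minimal Python sketch that flags them before the data reaches mkgmap. It is not how mkgmap itself detects them. The `Interval` class, its fields, and the helper names are hypothetical, and the interpolation value "all" for the 1500..1599 and 1500..1740 example is an assumption, since the mail only gives the number ranges.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One addr:interpolation way on a street (hypothetical structure)."""
    way_id: int
    start: int
    end: int
    interpolation: str  # "odd", "even" or "all"

    def numbers(self) -> range:
        """House numbers the way would generate, stepping from the lower end."""
        lo, hi = sorted((self.start, self.end))
        step = 1 if self.interpolation == "all" else 2
        return range(lo, hi + 1, step)


def parity_mismatch(iv: Interval) -> bool:
    """True if an end point contradicts the declared odd/even scheme."""
    if iv.interpolation == "all":
        return False
    wanted = 1 if iv.interpolation == "odd" else 0
    return iv.start % 2 != wanted or iv.end % 2 != wanted


def duplicates(a: Interval, b: Interval) -> list[int]:
    """House numbers that both intervals would generate."""
    return sorted(set(a.numbers()) & set(b.numbers()))


if __name__ == "__main__":
    # Ranges taken from the mail; "all" is an assumed interpolation value.
    a = Interval(way_id=1, start=1500, end=1599, interpolation="all")
    b = Interval(way_id=2, start=1500, end=1740, interpolation="all")
    print(len(duplicates(a, b)), "duplicate numbers")

    # Made-up example of even end points tagged as odd.
    print(parity_mismatch(Interval(3, 100, 198, "odd")))
```

A check like this could run in the script that converts TIGER data to OSM XML, so that suspicious ways are dropped or logged before mkgmap has to deal with them.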
You will find messages like these in the log:

INFO: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator f:\dwnload\temp\test.osm: conflict caused by addr:interpolation way 107th Street West http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 and address element 40298(13) at 34.614524,-118.319053
WARN: uk.me.parabola.mkgmap.osmstyle.housenumber.HousenumberGenerator f:\dwnload\temp\test.osm: addr:interpolation way 107th Street West http://www.openstreetmap.org/way/-1960472598 40274..40382, step=2 is ignored, it produces 1 duplicate number(s) too far from existing nodes

Gerd

________________________________________
From: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> on behalf of Patrick Simmons <linuxrocks...@netscape.net>
Sent: Friday, 28 December 2018 07:49
To: mkgmap-dev@lists.mkgmap.org.uk
Subject: Re: [mkgmap-dev] mkgmap crashing with non-OSM data

Gerd,

Thanks for getting back to me. And, btw, I'm very sorry about the quadruple-post. I wrote and use my own email client, and it crashed upon sending my message to the list. For some reason I'm not getting copies of my own posts, so I thought it hadn't gone through. Then I checked the archive and ... oops.

Re 1: TIGER is a product of the US federal government, so it is public domain: no license is needed to use it in any way for any purpose.

Re 2: I'd be interested to know what happens in places where the TIGER data conflicts with OSM. I agree it would suck to erase the contributions of OSM mappers and would like to avoid that if at all possible. Ideally, in the case of a conflict, you'd just get two items in the search results very close to each other and could pick the better one. We can check what's happening pretty easily if you know of a place in the US where OSM has a street address that mkgmap-created maps normally index and that differs from what's in TIGER: just send me the address, and I'll search for it on my device loaded with my shiny new maps and see what I get.

Re performance issue: for the whole US, it was taking about 48 hours using 3 threads on an i5-4460S, and about 3.33GB of RAM per thread. I had to limit the number of threads to three instead of four so that it wouldn't overflow the Java heap with -Xmx10000M, which was all the memory I had. The first time I tried to make the maps (about 2 weeks ago now), I did some rudimentary profiling to make sure it wasn't looping infinitely, and I seem to recall that the place where it was taking a long time was in ExtNumbers.java, in the for loop on lines 1135-1146.

My guess is that the problem is more likely due to the added volume of data than to the mixture of the data. My script should be generating XML for parallel street address ways that is similar to how street numbers might exist in normal OSM data, but it is generating 50GB of them, uncompressed. You can download http://moongate.ydns.eu/tiger_versus_python/tiger_all.osc.xz if you'd like to take a look at it, but please wait about 3 hours after I send this email, since my computer is currently generating and uploading an updated version of that file.

--Patrick

On Thu, 27 Dec 2018 22:26:19 -0700 (MST), Gerd Petermann <gpetermann_muenc...@hotmail.com> wrote:
> Hi Patrick,
>
> thanks for reporting, I can reproduce the problem and I'll try to fix it.
> Two remarks:
> 1) Please make sure that the TIGER licence allows to do this mixing of data
> 2) Please note that TIGER data is not really a good source for addresses and
> the mixture of OSM data and TIGER data are likely to decrease the quality in
> those places where they differ
>
> The data shows a performance problem in mkgmap (probably caused by this
> mixture), it takes very long to calculate the address data.
>
> Gerd
>
> --
> Sent from: http://gis.19327.n8.nabble.com/Mkgmap-Development-f5324443.html
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev@lists.mkgmap.org.uk
> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

--
MailTask: The Email Manager
https://github.com/linuxrocks123/MailTask
GPLv3 software, beta maturity