On 12-Oct-12, at 3:50 PM, Iván Sánchez Ortega wrote:


Also.

ogr3osm.

I would love to have the time and resources (or paid time, nudgenudgewinkwink) to redo ogr2osm, adding a backtracking-like algorithm to minimise the number of geometries' shared nodes (and their bounding boxes) held in memory, in order to be able to convert datasets with gazillions of geometries into .osm format. Backtracking-like in the sense that the data processing would be done in a tree-like fashion: walking through overlapping geometries, processing only geometries which have all their nodes already in the list of generated node IDs, and writing to file and destroying from memory the nodes that won't appear again because all the overlapping geometries have been processed.

Yes, it sounds like a mouthful. I have a bunch of napkin notes with the
algorithm written down, though :-)

I hate to steal your consulting work but I already redid the node merging in ogr2osm :)

I have it check if there is an existing node on its list before creating one - this turns out to be way quicker than checking after the fact for nodes in common and removing one of them.
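
A minimal sketch of the check-before-create idea, not the actual ogr2osm code: the NodeStore class, its get_or_create method and the rounding precision are all hypothetical names and choices made up for illustration. Each coordinate is looked up in a dictionary first, and a new node ID is only handed out when the coordinate has not been seen before.

```python
class NodeStore:
    """Hypothetical helper: hands out one node ID per distinct coordinate."""

    def __init__(self, precision=7):
        # Assumption: coordinates are rounded so that near-identical
        # floating point values share one dictionary key.
        self.precision = precision
        self._ids = {}       # (rounded lon, rounded lat) -> node ID
        self._next_id = -1   # negative IDs conventionally mark new OSM objects

    def get_or_create(self, lon, lat):
        key = (round(lon, self.precision), round(lat, self.precision))
        node_id = self._ids.get(key)
        if node_id is None:
            # Not seen before: create the node now, so there are never
            # duplicates to find and remove afterwards.
            node_id = self._next_id
            self._next_id -= 1
            self._ids[key] = node_id
        return node_id

    def __len__(self):
        return len(self._ids)


store = NodeStore()
# Two ways that meet at (1.0, 0.0) end up sharing a single node ID.
way_a = [store.get_or_create(lon, lat) for lon, lat in [(0.0, 0.0), (1.0, 0.0)]]
way_b = [store.get_or_create(lon, lat) for lon, lat in [(1.0, 0.0), (1.0, 1.0)]]
print(way_a, way_b, len(store))
```

Each lookup is a single hash probe, which is one plausible reason this is quicker than comparing nodes against each other after the fact.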

I haven't profiled it in a while, but the slowest step is now writing the XML out with SimpleXMLWriter. I intend to evaluate switching to a different library to get more performance. It didn't appear to be disk bound; instead it was spending all of its time in SimpleXMLWriter.
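
One way to check where the time goes is Python's standard cProfile module. This is only a sketch: write_output and convert are hypothetical stand-ins for the XML writing step and a whole conversion run, not functions from ogr2osm.

```python
import cProfile
import io
import pstats


def write_output(n):
    # Hypothetical stand-in for the XML writing step.
    return "".join('<node id="%d" />\n' % -i for i in range(1, n + 1))


def convert(n):
    # Hypothetical stand-in for a whole conversion run.
    return write_output(n)


profiler = cProfile.Profile()
profiler.enable()
result = convert(10000)
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

If the writing step dominates the cumulative column while the machine shows little disk wait, that would be consistent with a CPU-bound writer rather than a disk-bound one.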

I'm giving a talk tomorrow on ogr2osm and might expand what I'm saying about the node merging.

I ran some statistics on my in-progress NHD translation. This is a fairly complex translation with involved logic, but it does drop a few smaller layers. For a 600 MB .mdb (400 MB .shp) resulting in a 540 MB .osm, it takes 12 minutes on my home server. It's CPU bound and single-threaded. I think it uses about 6-7 GB of RAM for that. I may be able to get that down substantially; I haven't really attacked the RAM usage yet.



_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev
