On 12-Oct-12, at 3:50 PM, Iván Sánchez Ortega wrote:
> Also.
>
> ogr3osm.
>
> I would love to have the time and resources (or paid time,
> nudgenudgewinkwink) to redo ogr2osm; adding a backtracking-like
> algorithm to minimise the amount of geometries' shared nodes (and
> their bounding boxes) in memory, in order to be able to convert
> datasets with gazillions of geometries into .osm format.
>
> Backtracking-like in the sense that the data processing would be done
> in a tree-like fashion, walking through overlapping geometries,
> processing only geometries which have all their nodes already into the
> list of generated node IDs, writing to file and destroying from memory
> nodes that won't appear again because all the overlapping geometries
> have been processed.
>
> Yes, it sounds like a mouthful. I have a bunch of napkin notes with
> the algorithm written down, though :-)

I hate to steal your consulting work but I already redid the node
merging in ogr2osm :)
I have it check if there is an existing node on its list before
creating one - this turns out to be way quicker than checking after
the fact for nodes in common and removing one of them.
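
To illustrate the check-before-create idea, here is a minimal sketch. It is not the actual ogr2osm code: the NodeIndex class, its get_or_create method, and the 7-decimal rounding are all assumptions made for this example.

```python
class NodeIndex:
    """Hypothetical helper: hands out one node ID per distinct coordinate."""

    def __init__(self):
        self._ids = {}        # (lon, lat) -> node ID
        self._next_id = -1    # negative IDs, as commonly used for new OSM objects

    def get_or_create(self, lon, lat):
        # Assumption: coordinates that agree to 7 decimal places are the same node.
        key = (round(lon, 7), round(lat, 7))
        node_id = self._ids.get(key)
        if node_id is None:
            # First time this coordinate is seen: allocate a new ID.
            node_id = self._next_id
            self._next_id -= 1
            self._ids[key] = node_id
        return node_id


index = NodeIndex()
# Two ways sharing the endpoint (-123.1, 49.25) end up sharing one node ID.
way_a = [index.get_or_create(lon, lat) for lon, lat in [(-123.2, 49.2), (-123.1, 49.25)]]
way_b = [index.get_or_create(lon, lat) for lon, lat in [(-123.1, 49.25), (-123.0, 49.3)]]
```

Each lookup is a single dictionary access, so the cost is paid once per vertex at creation time instead of in a separate pass that compares nodes afterwards.
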
I haven't profiled it in a while, but the slowest step is now writing
the XML out with SimpleXMLWriter. I intend to evaluate switching to a
different library to get more performance. It didn't appear to be
disk bound; it was spending all of its time in SimpleXMLWriter.
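
As one example of what an alternative writer could look like, the sketch below uses XMLGenerator from the Python standard library to stream node elements. This is only a candidate to benchmark, with no claim that it is faster than SimpleXMLWriter; the write_osm_nodes function, the "sketch" generator string, and the sample nodes are assumptions for illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_osm_nodes(out, nodes):
    """Stream (id, lat, lon) tuples to `out` as OSM-style <node/> elements."""
    gen = XMLGenerator(out, encoding="utf-8", short_empty_elements=True)
    gen.startDocument()
    gen.startElement("osm", {"version": "0.6", "generator": "sketch"})
    for node_id, lat, lon in nodes:
        # Attribute values must be strings; XMLGenerator handles the escaping.
        attrs = {"id": str(node_id), "lat": repr(lat), "lon": repr(lon)}
        gen.startElement("node", attrs)
        gen.endElement("node")
    gen.endElement("osm")
    gen.endDocument()


buf = io.StringIO()
write_osm_nodes(buf, [(-1, 49.25, -123.1), (-2, 49.3, -123.0)])
xml_text = buf.getvalue()
```

Timing a loop like this against the current writer on the same node list would show whether the switch is worth making.
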
I'm giving a talk tomorrow on ogr2osm and might expand what I'm
saying about the node merging.
I ran some statistics on my in-progress NHD translation. This is a
fairly complex translation with involved logic, but it does drop a
few smaller layers. For a 600 MB .mdb (400 MB .shp) resulting in a 540
MB .osm, it takes 12 minutes on my home server. It's CPU bound and
single-threaded. I think it uses about 6-7 gigs of RAM for that. I
may be able to get that down substantially; I haven't really attacked
the RAM usage yet.