On Fri, Jan 22, 2010 at 5:17 AM, WanMil wmgc...@web.de wrote:
Hi,
Apollinaris Schoell wrote:
Osmosis supports two options to keep ways and relations intact,
completeWays and completeRelations, but Geofabrik extracts don't use
them as far as I know.
That is correct: These options exist, and are not used, because if we'd
use them, the nightly build would take something like three days instead
of six hours. Osmosis works very well in streaming mode and these
options make it impossible to stream - Osmosis needs to create a
temporary copy of the full dataset and is rather inefficient at it.
We typically do something like
osmosis --rx file.osm --tee 20 --bp file=country1.poly --wx country1.osm
--bp file=country2.poly --wx country2.osm...
which would cause Osmosis to make 20 temporary full copies of the data
and write them to disk in completeWays mode.
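The streaming constraint can be illustrated with a minimal Python sketch. This is not Osmosis code: the tuple layout and the names stream_clip and is_inside are made up for illustration, and it only assumes the usual OSM file order of nodes before ways.

```python
def stream_clip(entities, is_inside):
    """Single pass over (kind, id, payload) tuples; nodes arrive before ways."""
    kept_nodes = set()  # only ids are remembered, never coordinates
    for kind, ident, payload in entities:
        if kind == "node":
            if is_inside(payload):
                kept_nodes.add(ident)
                yield kind, ident, payload
            # a node outside the area is dropped here and cannot be recovered later
        elif kind == "way":
            refs = [ref for ref in payload if ref in kept_nodes]
            if refs:
                yield kind, ident, refs


def inside_unit_square(xy):
    return 0 <= xy[0] <= 1 and 0 <= xy[1] <= 1


data = [
    ("node", 1, (0.5, 0.5)),
    ("node", 2, (1.5, 0.5)),  # outside the unit square
    ("way", 10, [1, 2]),
]
out = list(stream_clip(data, inside_unit_square))
print(out)
```

To keep way 10 complete, node 2 would have to be written after all, but by the time the way is seen the node has already gone past. Hence the need for a temporary copy of the data.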
We use clipIncompleteEntities which means that Osmosis removes
references to those nodes outside the polygon from any ways or relations.
It would be good if Osmosis could somehow flag the clipped entities so
that processing software could at least know that there is something
wrong, or incomplete, with them.
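As a rough illustration of what such a flag could look like, here is a minimal Python sketch. It is not Osmosis code: the Way class, the clip_way helper and the clipped attribute are all hypothetical, and Osmosis currently has no such flag.

```python
from dataclasses import dataclass


@dataclass
class Way:
    way_id: int
    node_refs: list
    clipped: bool = False  # hypothetical flag, not part of Osmosis


def clip_way(way, nodes_inside):
    """Drop references to nodes outside the extract; flag the way if any were dropped."""
    kept = [ref for ref in way.node_refs if ref in nodes_inside]
    return Way(way.way_id, kept, clipped=len(kept) != len(way.node_refs))


nodes_inside = {3, 4, 5}
area = Way(1, [1, 2, 3, 4, 5, 1])  # closed way, nodes 1 and 2 lie outside
result = clip_way(area, nodes_inside)
print(result.node_refs, result.clipped)
```

Processing software could then treat ways with clipped set differently, for example by not filling them as areas.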
Adding the actual polygon used for clipping could of course be done but
it will not automatically enable proper filling. Assume this:
  |
+-|+-+
| |  |
| |  |--- filled area
+-|+-+
  |
  |-- clipping boundary
After clipping with clipIncompleteEntities, this will lead to
|
| +-+
| | |
| | |
| +-+
|
|
Even if you know where the clipping boundary is, you cannot extend the
object towards that boundary properly because you are missing the nodes
beyond the boundary.
Bye
Frederik
Ok, we cannot reconstruct the 100% exact original shape if the first
node outside the boundary is missing. But a reasonable workaround is to
connect directly to the boundary in case there are nodes missing in a
polygon and the last point is not too far away from the boundary. I
think in most cases this will be ok.
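A minimal Python sketch of this workaround, under some assumptions: coordinates are treated as planar, the boundary is a closed ring given as a list of (x, y) vertices, and the function names and the max_dist threshold are made up for illustration.

```python
import math


def nearest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to p (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return a
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment
    return (ax + t * dx, ay + t * dy)


def snap_to_boundary(p, boundary, max_dist):
    """Return the boundary point closest to p, or None if it is farther than max_dist."""
    best, best_dist = None, float("inf")
    for a, b in zip(boundary, boundary[1:] + boundary[:1]):
        q = nearest_point_on_segment(p, a, b)
        d = math.hypot(p[0] - q[0], p[1] - q[1])
        if d < best_dist:
            best, best_dist = q, d
    return best if best_dist <= max_dist else None


boundary = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(snap_to_boundary((1, 5), boundary, max_dist=2))
print(snap_to_boundary((5, 5), boundary, max_dist=2))
```

The first and last remaining nodes of a clipped polygon could each be extended to the returned point. When the function returns None the node is too far from the boundary and the polygon is better left alone.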
This is not a 100% solution, and it would be better if ways and
relations were included completely, or if at least the first point
beyond the boundary were known. But as long as that is too
time-consuming, the proposed solution of adding the boundary to the OSM
dump is an easy-to-implement and very helpful improvement. Flagging
clipped ways is another good and helpful improvement.
Hi WanMil,
Frederik has already given you most of the useful info, but I'll add some
comments to the discussion in case you or others find them useful.
ADDING GEOMETRY DETAILS TO OUTPUT FILE
It is not simple to add additional geometry information to an output OSM
file because of the way Osmosis works internally. The task that extracts
data within a boundary (e.g. --bounding-polygon) is independent of the
--write-xml task producing the output file, which means that the geometry
information cannot be written to the output file. This is an intentional
design decision in Osmosis: each task can be implemented independently of
all other tasks.
The advantage of this approach is that tasks can be combined in various
combinations according to what the end user requires. The disadvantage is
that there is no simple way of passing additional information between tasks
other than standard node/way/relation data. Osmosis is a generic tool
supporting many use cases, which means it can't always provide the ideal
solution.
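The trade-off can be sketched in a few lines of Python. This is a simplified stand-in, not the actual Osmosis Java API: the class names, the process method and the dictionary-based entities are assumptions made for illustration.

```python
class CollectingWriter:
    """Stand-in for a writer task: it only ever receives entities."""

    def __init__(self):
        self.written = []

    def process(self, entity):
        self.written.append(entity)


class BoundingBoxFilter:
    """Stand-in for a bounding task: the geometry lives only inside this task."""

    def __init__(self, left, bottom, right, top, sink):
        self.box = (left, bottom, right, top)
        self.sink = sink

    def process(self, entity):
        if entity["type"] == "node":
            left, bottom, right, top = self.box
            if not (left <= entity["lon"] <= right and bottom <= entity["lat"] <= top):
                return  # node outside the box is not passed on
        self.sink.process(entity)


writer = CollectingWriter()
task = BoundingBoxFilter(0.0, 0.0, 1.0, 1.0, writer)
for node in (
    {"type": "node", "id": 1, "lon": 0.5, "lat": 0.5},
    {"type": "node", "id": 2, "lon": 2.0, "lat": 0.5},
):
    task.process(node)
print([e["id"] for e in writer.written])
```

The writer can be chained behind any task, but it has no channel through which the filter could hand over its clipping geometry.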
It is not impossible, but requires a fair bit of rework to implement.
Simple bounding box information can be propagated through the pipeline, so I
believe the --bounding-box task will cause a Bound element to be added to the
output file. But more complex geometries such as those used by
--bounding-polygon are not supported. This would need to be enhanced if we
wanted to pass bounding polygon information.
I don't have a lot of time for Osmosis so I can't tackle this one.
CLIPPED WAYS
Flagging clipped ways would be quite useful. It requires enhancing the
Entity class within Osmosis to allow additional information to be attached.
It also requires all existing tasks to be enhanced where necessary to
manipulate this additional data. Again, I'm not likely to do this myself,
but would support anybody trying to implement it themselves.
COMPLETE GEOMETRIES
As you've discovered, you can end up with missing data for ways or relations
that extend outside the bounding box. For example, nodes that sit outside the
area may be part of a way inside the area but will be left out.
The --bounding-box and --bounding-polygon tasks are unlikely to ever be fixed
in this regard. As Frederik points out, there is just no way to do this
efficiently in a streaming fashion, because it requires random access to the
entire dataset, which isn't something Osmosis is good at. By the time you
process ways, the nodes are
not available in memory any more so you can't tell where each way is
located. You can't go back and add extra nodes to the