Karl Newman wrote:
On Tue, Jun 23, 2009 at 6:11 PM, Brett Henderson <br...@bretth.com> wrote:
Karl Newman wrote:
What's happening there is that the node-key-value and
way-key-value are ANDed together (which would leave you with
only ways which match your tags and are composed of nodes
tagged place=city), and you want an OR instead. You were sort
of on the right track with the pipes, but what you need to do
is use the "tee" function and apply the node-key-value filter
to one leg of the tee, and apply the way-key-value filter to
the other leg of the tee, then use the "merge" function to
join the results. It would look something like this:
./osmosis-0.31/bin/osmosis --read-xml file="planet.bz2" \
  --tee outputCount=2 outPipe.0="nodes" outPipe.1="ways" \
  --node-key-value keyValueList="place.city" inPipe.0="nodes" \
  --way-key-value keyValueList="highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link" inPipe.0="ways" \
  --merge --write-xml file="basemap.osm"
I'm not sure if that will work exactly as written. You may
need to add outPipe arguments to the node-key-value and
way-key-value filters and then reference them as inPipe.0 and
inPipe.1 arguments to the merge task.
You may trigger a deadlock in this situation ... I've been waiting
for somebody to try this out for a long time :-)
While it's possible to construct a pipeline that tees a single
dataset into multiple streams before merging them back together
again, it is problematic from a thread synchronisation point of view
because the same input thread is feeding two inputs of another
thread. Using a --buffer task within both paths of the branch may
help because it de-couples the threads somewhat with a buffer.
The --read-xml task creates a thread which passes data into the
--tee task. The --tee task doesn't create a thread, it just uses
the existing thread to pass incoming data to all consumers. The
--node-key-value and --way-key-value also use the existing thread
to write to their destination which in both cases is the --merge
task. The --merge task creates a new thread which reads the
incoming data from both of its inputs, but both inputs are coming
from a single thread (i.e. the original --read-xml thread). The
--merge thread may read from one input, then start waiting for a
specific value on the other input and never receive it.
But if it works let me know. I'm curious :-)
Brett
Hmm... I didn't look at the code too closely. I thought the tee
created separate threads. I'm trying to see what might cause a
deadlock--it looks like it would happen in DataPostBox if
anywhere--but it's not obvious what might trigger it. I guess what
could happen is if the merge task is trying to get entities from both
pipelines to compare them, and there isn't anything available in one
of the pipelines, that might deadlock it. I guess someone will have to
try it and see!
Yep, DataPostBox is the point at which threads will deadlock. I think
it is the only class in the whole of osmosis that performs any thread
coordination ...
Thinking further, it is almost guaranteed to deadlock in this scenario.
The merge task requires data from both inputs to perform a comparison.
The --node-key-value task will produce data while nodes are
available on input, and the --way-key-value task will produce data while
ways are available on input. So once the node data has filled the
DataPostbox for one input of the merge task, there will still be no way
data available on the second input of the merge task and therefore the
merge task will block. The original --read-xml task thread will then
block because it is waiting for the full buffer of the first merge task
input DataPostbox to clear.
Brett
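The blocking scenario described above can be illustrated with a small simulation. This is only a sketch in Python, not Osmosis code: two bounded queues stand in for the DataPostbox on each merge input, one reader thread feeds both (nodes first, then ways), and a merge thread needs the head of both inputs before it can emit anything. The names run_pipeline, capacity and timeout are hypothetical, and the timeouts exist only so that the stall is reported instead of hanging forever.

```python
import queue
import threading

END = object()      # end-of-stream marker
ABORTED = object()  # returned by take() once the reader has given up


def run_pipeline(num_nodes, num_ways, capacity=4, timeout=0.3):
    """Return (completed, merged). completed is False if the reader stalled."""
    node_q = queue.Queue(maxsize=capacity)  # stand-in for merge input 0
    way_q = queue.Queue(maxsize=capacity)   # stand-in for merge input 1
    stalled = threading.Event()
    merged = []

    def reader():
        # One thread feeds BOTH merge inputs: all nodes first, then all ways.
        try:
            for i in range(num_nodes):
                node_q.put((0, i), timeout=timeout)  # type 0 sorts before type 1
            node_q.put(END, timeout=timeout)
            for i in range(num_ways):
                way_q.put((1, i), timeout=timeout)
            way_q.put(END, timeout=timeout)
        except queue.Full:
            # A real pipeline would block here forever; we just record it.
            stalled.set()

    def take(q):
        # Wait for the next item, giving up only once the reader has stalled.
        while True:
            try:
                return q.get(timeout=0.05)
            except queue.Empty:
                if stalled.is_set():
                    return ABORTED

    def merger():
        queues = [node_q, way_q]
        # The merge needs one item from EACH input before it can compare.
        heads = [take(node_q), take(way_q)]
        while True:
            if any(h is ABORTED for h in heads):
                return
            live = [i for i in (0, 1) if heads[i] is not END]
            if not live:
                return
            i = min(live, key=lambda k: heads[k])
            merged.append(heads[i])
            heads[i] = take(queues[i])

    threads = [threading.Thread(target=reader), threading.Thread(target=merger)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return (not stalled.is_set(), merged)
```

With these assumptions, run_pipeline(2, 3) completes because the few nodes fit in the bounded queue, while run_pipeline(50, 3) stalls: the node queue fills up before any way is written, so the merge never has both heads. Raising capacity above the node count (for example run_pipeline(10, 3, capacity=16)) lets the run finish, which suggests a buffer would only help if it could hold every filtered node.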
_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev