Karl Newman wrote:
On Tue, Jun 23, 2009 at 6:11 PM, Brett Henderson <br...@bretth.com> wrote:

    Karl Newman wrote:


        What's happening there is that the node-key-value and
        way-key-value are ANDed together (which would leave you with
        only ways which match your tags and are composed of nodes
        tagged place=city), and you want an OR instead. You were sort
        of on the right track with the pipes, but what you need to do
        is use the "tee" function and apply the node-key-value filter
        to one leg of the tee, and apply the way-key-value filter to
        the other leg of the tee, then use the "merge" function to
        join the results. It would look something like this:

        ./osmosis-0.31/bin/osmosis --read-xml file="planet.bz2" \
          --tee outputCount=2 outPipe.0="nodes" outPipe.1="ways" \
          --node-key-value keyValueList="place.city" inPipe.0="nodes" \
          --way-key-value keyValueList="highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link" inPipe.0="ways" \
          --merge --write-xml file="basemap.osm"

        I'm not sure if that will work exactly as written. You may
        need to add outPipe arguments to the node-key-value and
        way-key-value filters and then reference them as inPipe.0 and
        inPipe.1 arguments to the merge task.

    You may trigger a deadlock in this situation ... I've been waiting
    for somebody to try this out for a long time :-)

    While it's possible to construct a pipeline that tees a single
    dataset into multiple streams before merging them back together
    again, it is problematic from a thread synchronisation point of view
    because the same input thread is feeding two inputs of another
    thread.  Using a --buffer task within both paths of the branch may
    help because it de-couples the threads somewhat with a buffer.

    The --read-xml task creates a thread which passes data into the
    --tee task.  The --tee task doesn't create a thread, it just uses
    the existing thread to pass incoming data to all consumers.  The
    --node-key-value and --way-key-value also use the existing thread
    to write to their destination which in both cases is the --merge
    task.  The --merge task creates a new thread which reads the
    incoming data from both of its inputs, but both inputs are coming
    from a single thread (ie. the original --read-xml thread).  The
    --merge thread may read from one input, then start waiting for a
    specific value on the other input and never receive it.

    But if it works let me know.  I'm curious :-)

    Brett

Hmm... I didn't look at the code too closely. I thought the tee created separate threads. I'm trying to see what might cause a deadlock--it looks like it would happen in DataPostBox if anywhere--but it's not obvious what might trigger it. I guess what could happen is if the merge task is trying to get entities from both pipelines to compare them, and there isn't anything available in one of the pipelines, that might deadlock it. I guess someone will have to try it and see!

Yep, DataPostBox is the point at which threads will deadlock. I think it is the only class in the whole of osmosis that performs any thread coordination ...

Thinking further, it is almost guaranteed to deadlock in this scenario. The merge task requires data from both inputs to perform a comparison. The --node-key-value task will produce data while nodes are available on input, and the --way-key-value task will produce data while ways are available on input. So once the node data has filled the DataPostbox for one input of the merge task, there will still be no way data available on the second input of the merge task, and therefore the merge task will block. The original --read-xml task thread will then block because it is waiting for the full buffer of the first merge task input DataPostbox to clear.
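
The blocking sequence described above can be pictured with a toy model. This is only a sketch in Python, not osmosis code: a bounded queue.Queue stands in for each DataPostbox, one "reader" thread plays the combined --read-xml/--tee/filter path, and a second thread plays --merge. The function name run_pipeline, the capacity values and the timeout-based deadlock check are all assumptions made for illustration.

```python
import queue
import threading

TYPE_ORDER = {"node": 0, "way": 1}  # nodes sort before ways, as in an OSM file
END = None                          # end-of-stream marker (illustrative)


def sort_key(entity):
    kind, ident = entity
    return (TYPE_ORDER[kind], ident)


def run_pipeline(entities, capacity, timeout=1.0):
    """Toy tee -> two filters -> merge pipeline fed by a single thread.

    Returns (deadlocked, merged). 'deadlocked' is True when either thread
    is still blocked after 'timeout' seconds.
    """
    postbox_nodes = queue.Queue(maxsize=capacity)  # merge input 0
    postbox_ways = queue.Queue(maxsize=capacity)   # merge input 1
    merged = []

    def reader():
        # One thread feeds BOTH merge inputs, like the --read-xml thread.
        for entity in entities:
            if entity[0] == "node":
                postbox_nodes.put(entity)  # blocks while the postbox is full
            elif entity[0] == "way":
                postbox_ways.put(entity)
        postbox_nodes.put(END)
        postbox_ways.put(END)

    def merger():
        # A sorted merge needs one entity from each input before comparing.
        head_a = postbox_nodes.get()
        head_b = postbox_ways.get()
        while head_a is not END or head_b is not END:
            take_a = head_b is END or (
                head_a is not END and sort_key(head_a) <= sort_key(head_b)
            )
            if take_a:
                merged.append(head_a)
                head_a = postbox_nodes.get()
            else:
                merged.append(head_b)
                head_b = postbox_ways.get()

    threads = [
        threading.Thread(target=reader, daemon=True),
        threading.Thread(target=merger, daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    deadlocked = any(t.is_alive() for t in threads)
    return deadlocked, merged


if __name__ == "__main__":
    data = [("node", i) for i in range(10)] + [("way", i) for i in range(3)]
    print(run_pipeline(data, capacity=2)[0])   # small postbox
    print(run_pipeline(data, capacity=16)[0])  # postbox holds every node
```

In this model the small postbox stalls because the merger is waiting for a way while the reader is stuck trying to deliver another node. The larger one only completes because it can hold every matching node before the first way arrives, which suggests a --buffer in each branch would need to be at least that large to help.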

Brett

_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
