Re: [osmosis-dev] question about --merge

2010-05-19 Thread Andrew Byrd
Hello,

On 19 May 2010, at 13:42, Brett Henderson wrote:
> The --tag-filter task is about the most comprehensive task currently  
> available.  To add a full boolean expression language is certainly  
> possible, but not trivial.  If somebody wants to take on the  
> challenge of creating a complex yet generic tag analysis task then  
> I'll do my best to help them out.  I'll be honest though, it's not  
> likely to be something I'll get around to implementing myself.

Part of the challenge is that tags and values can contain any  
character, so expressions get cluttered with escape sequences, and  
they also have to pass through the shell.

Another issue is running a used-node filter that would 'know' not to  
delete the non-way nodes that you want. It's pretty complicated to  
unambiguously specify this kind of behaviour, unless the used-node  
filter is integrated into the tagfilter.

I'm still open to tagfilter syntax and functionality suggestions. You  
can do many different things with the current version, but you have to  
spell it all out explicitly on the command line, which I don't  
necessarily consider a bad thing.

> Thinking about your case a bit more, you could avoid the temp file  
> if you simply read the input file twice so you avoid the --tee.   
> Although that might be slower than creating a temp file and invoking  
> Osmosis twice.  If it was me I'd simply create a wrapper shell  
> script to tie several osmosis commands together and do some checking  
> of the Osmosis return code to ensure it hasn't failed at each step.   
> The --buffer approach might work, but the memory consumption is  
> likely to bite you at some point.

This is the solution I came up with too. I didn't notice any  
significant slowdown with this approach on a city-sized OSM file. I'm  
re-including the example command I gave before, because I messed up a  
parameter the last time:

./osmosis/bin/osmosis \
--rx input.osm \
--tf reject-relations \
--tf accept-nodes 'amenity=*' \
--tf reject-ways outPipe.0=POI \
\
--rx input.osm \
--tf reject-relations \
--tf accept-ways highway=motorway \
--used-node outPipe.0=motorway \
\
--merge inPipe.0=POI inPipe.1=motorway \
--wx test-merge.osm

The input file must be sorted for the merge to work correctly. It should  
already be sorted if it comes from planet.osm; if not, you can add a sort task.
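
For an input file that is not already sorted, one possible variant is
sketched below. It is the same pipeline as above with a sort task (--s, as
used elsewhere in this thread) placed directly after each reader. The names
OSMOSIS and merge_poi_motorway are invented for this example, and it assumes
that the default sort order is the one the merge task expects.

```shell
#!/bin/sh
# Sketch only: OSMOSIS and merge_poi_motorway are hypothetical names.
# OSMOSIS can be overridden to point at a different Osmosis install.
OSMOSIS="${OSMOSIS:-./osmosis/bin/osmosis}"

# $1 = input file, $2 = output file
merge_poi_motorway() {
    # Each branch reads the input itself and sorts it before filtering,
    # so both inputs of --merge arrive in sorted order.
    "$OSMOSIS" \
        --rx "$1" --s \
        --tf reject-relations \
        --tf accept-nodes 'amenity=*' \
        --tf reject-ways outPipe.0=POI \
        --rx "$1" --s \
        --tf reject-relations \
        --tf accept-ways highway=motorway \
        --used-node outPipe.0=motorway \
        --merge inPipe.0=POI inPipe.1=motorway \
        --wx "$2"
}
```

It would be called as `merge_poi_motorway input.osm test-merge.osm`. Sorting
after the filters instead would mean less data to sort, at the cost of moving
the outPipe arguments onto the sort tasks.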

-Andrew Byrd

___
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev


Re: [osmosis-dev] question about --merge

2010-05-19 Thread Brett Henderson
On Wed, May 19, 2010 at 4:29 AM, Christoph Wagner <
freemaps@googlemail.com> wrote:

> On 18.05.2010 15:18, Brett Henderson wrote:
>
> > To summarise, if you can draw a graph of the data flows in Osmosis,
> > ensure that you never have data being split then recombined at a later
> > point.
> >
> > In your case the only option is to split it into two steps using
> > temporary XML files as you already appear to be doing.
> >
> > It would be nice if Osmosis could detect this situation and throw an
> > error, but it would take a lot of effort and add a lot of complexity
> > that I don't really wish to attempt.
>
>
>
> Ok, thanks for your reply. I think now I understand the situation better.
>
> While reading your answer I got an idea to solve the problem without
> writing to disk.
>
> I use a very big buffer now:
>
> osmosis-0.35.1/bin/osmosis --rx input.osm --t outPipe.1=points \
> --wk keyList="addr:interpolation,addr:housenumber" --un --s --b 1 \
> --nk inPipe.0=points keyList="addr:housenumber" --s --b 1 \
> --m --wx addr.osm
>
> This works very well if you have enough RAM and set the bufferCapacity high
> enough.
>

Yes, that will also work if you're prepared to experiment a bit with buffer
sizes.


> Another possibility would be a better tagfilter that allows me to filter
> POIs and ways at once.
> The task would be for example: "Give me all ways that match this way-filter
> and all nodes that match this node-filter or are part of the way that
> matches the way-filter"
>

The --tag-filter task is about the most comprehensive task currently
available.  To add a full boolean expression language is certainly possible,
but not trivial.  If somebody wants to take on the challenge of creating a
complex yet generic tag analysis task then I'll do my best to help them
out.  I'll be honest though, it's not likely to be something I'll get around
to implementing myself.

Thinking about your case a bit more, you could avoid the temp file if you
simply read the input file twice so you avoid the --tee.  Although that
might be slower than creating a temp file and invoking Osmosis twice.  If it
was me I'd simply create a wrapper shell script to tie several osmosis
commands together and do some checking of the Osmosis return code to ensure
it hasn't failed at each step.  The --buffer approach might work, but the
memory consumption is likely to bite you at some point.
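
A minimal sketch of such a wrapper follows. It reuses the two-step commands
quoted earlier in this thread and assumes that Osmosis exits with a non-zero
status when a run fails. The names OSMOSIS, run_step and filter_addresses are
made up for this example.

```shell
#!/bin/sh
# Sketch only: OSMOSIS, run_step and filter_addresses are hypothetical names.
OSMOSIS="${OSMOSIS:-osmosis-0.35.1/bin/osmosis}"

# Run one command; report its exit code on stderr if it is non-zero.
run_step() {
    "$@"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "step failed (exit code $rc): $*" >&2
    fi
    return "$rc"
}

# $1 = input file, $2 = output file
filter_addresses() {
    tmp=$(mktemp -d) || return 1

    # Step 1: split the input and write each branch to its own temp file.
    if run_step "$OSMOSIS" --rx "$1" --t outPipe.1=points \
            --wk keyList="addr:interpolation,addr:housenumber" --un --s \
            --wx "$tmp/addr_lines.osm" \
            --nk inPipe.0=points keyList="addr:housenumber" --s \
            --wx "$tmp/addr_points.osm"
    then
        # Step 2: merge the two files in a separate Osmosis run.
        run_step "$OSMOSIS" --rx "$tmp/addr_lines.osm" \
            --rx "$tmp/addr_points.osm" --m --wx "$2"
        status=$?
    else
        status=1
    fi

    # Remove the temp files whether or not the steps succeeded.
    rm -rf "$tmp"
    return "$status"
}
```

A call such as `filter_addresses input.osm addr.osm` then only reaches the
merge step if the first run succeeded, and the temporary files are removed
either way.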

Brett


Re: [osmosis-dev] question about --merge

2010-05-18 Thread Christoph Wagner
On 18.05.2010 15:18, Brett Henderson wrote:

> To summarise, if you can draw a graph of the data flows in Osmosis, ensure
> that you never have data being split then recombined at a later point.
> 
> In your case the only option is to split it into two steps using temporary
> XML files as you already appear to be doing.
> 
> It would be nice if Osmosis could detect this situation and throw an error,
> but it would take a lot of effort and add a lot of complexity that I don't
> really wish to attempt.



Ok, thanks for your reply. I think now I understand the situation better.

While reading your answer I got an idea to solve the problem without writing to 
disk.

I use a very big buffer now:

osmosis-0.35.1/bin/osmosis --rx input.osm --t outPipe.1=points \
--wk keyList="addr:interpolation,addr:housenumber" --un --s --b 1 \
--nk inPipe.0=points keyList="addr:housenumber" --s --b 1 \
--m --wx addr.osm

This works very well if you have enough RAM and set the bufferCapacity high 
enough.


Another possibility would be a better tagfilter that allows me to filter POIs 
and ways at once.
The task would be for example: "Give me all ways that match this way-filter and 
all nodes that match this node-filter or are part of the way that matches the 
way-filter"

I don't know if this is difficult to implement, but I think it would be 
useful.

Thanks
Christoph





Re: [osmosis-dev] question about --merge

2010-05-18 Thread Brett Henderson
On Tue, May 18, 2010 at 1:42 AM, Christoph Wagner <
freemaps@googlemail.com> wrote:

> Hello list,
>
> I am new to this dev-list. My name is Christoph Wagner and I maintain the
> Garmin All in one Map:
> http://wiki.openstreetmap.org/wiki/All_in_one_Garmin_Map
>
> I want to reduce some server load (I omit details here) and tried to filter
> OSM-Data with osmosis.
>
> For example, I want to extract all kinds of addresses from an
> osm-file.
> My first attempt was this:
>
> osmosis-0.35.1/bin/osmosis --rx input.osm --t outPipe.1=points \
> --wk keyList="addr:interpolation,addr:housenumber" --un --s \
> --nk inPipe.0=points keyList="addr:housenumber" --s \
> --m --wx addr.osm
>
> The problem is that osmosis never finishes. I wonder why.
> No error, no writing, no CPU-usage - it just waits and does nothing.
> There must be something wrong with the --m I guess, but I don't know what.
>
> This version does what I want, but it has to write out XML files and
> read them back in:
>
> osmosis-0.35.1/bin/osmosis --rx input.osm --t outPipe.1=points \
> --wk keyList="addr:interpolation,addr:housenumber" --un --s \
> --wx addr_lines.osm \
> --nk inPipe.0=points keyList="addr:housenumber" --s --wx addr_points.osm
>
> osmosis-0.35.1/bin/osmosis --rx addr_lines.osm --rx addr_points.osm --m \
> --wx addr.osm
>
> Is this behaviour intended? Can somebody give me a hint on how to do it better?
> I have no idea what is wrong.
>

Hi Christoph, you are experiencing deadlock between threads in the Osmosis
pipeline.

You can't split data into two pipes, then combine it back together again in
a later task.  The tee task passes all input data to both output pipes.  If
one of those output pipes is blocked, the entire tee task will stop until
the pipe clears.  Several tasks downstream from the tee task is the merge
task which requires both input pipes to have reached the same entity in
order to perform comparisons.  It will block reading of the pipe with newer
data until the pipe with older data catches up.  In between the tee and
merge tasks are tasks that may be discarding data or waiting for special
events before releasing data to the next task along.  The merge task is
blocked waiting for data that will never be sent because the tee task is
also blocked waiting for the merge task to read.  I may not be making much
sense here, but the end result is that the merge task may stop reading one
input pipe waiting for data from another that never comes because the tee
task is blocked on writing to the pipe that the merge task has stopped
reading.
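
The circular wait described above can be reproduced outside Osmosis with
ordinary shell tools. The following is only an analogy, not Osmosis code:
two named pipes stand in for the Osmosis pipes, tee plays the tee task, and
a reader that drains one pipe to its end before touching the other plays the
merge task. It assumes a Linux-like system with GNU coreutils (mkfifo, tee,
seq, timeout). The function name tee_merge_demo is invented for this sketch.

```shell
#!/bin/sh
# Analogy only: named pipes and tee stand in for Osmosis pipes and --tee.
# tee_merge_demo is a hypothetical name. $1 = number of lines to send.
tee_merge_demo() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/a" "$dir/b"

    # timeout (GNU coreutils) ends the run after 2 seconds if it hangs.
    timeout 2 sh -c '
        # Writer: copy every line to both pipes, like the tee task.
        seq 1 "$1" | tee "$2/a" > "$2/b" &

        # Reader: open both pipes, then read b to its end before reading
        # a at all. Once pipe a is full, tee blocks and b never ends.
        exec 4< "$2/b" 3< "$2/a"
        cat <&4 > /dev/null
        cat <&3 > /dev/null
        wait
    ' sh "$1" "$dir"
    rc=$?

    rm -rf "$dir"
    return "$rc"
}
```

With a small input such as `tee_merge_demo 100`, everything fits in the pipe
buffer and the function should return 0. With a large input such as
`tee_merge_demo 1000000`, tee should block on the full pipe and timeout
should end the run with status 124. The threshold depends on the pipe buffer
size of the system, which is comparable to the way a large enough buffer can
hide the problem.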

To summarise, if you can draw a graph of the data flows in Osmosis, ensure
that you never have data being split then recombined at a later point.

In your case the only option is to split it into two steps using temporary
XML files as you already appear to be doing.

It would be nice if Osmosis could detect this situation and throw an error,
but it would take a lot of effort and add a lot of complexity that I don't
really wish to attempt.

Brett


[osmosis-dev] question about --merge

2010-05-17 Thread Christoph Wagner
Hello list,

I am new to this dev-list. My name is Christoph Wagner and I maintain the 
Garmin All in one Map:
http://wiki.openstreetmap.org/wiki/All_in_one_Garmin_Map

I want to reduce some server load (I omit details here) and tried to filter 
OSM-Data with osmosis.

For example, I want to extract all kinds of addresses from an 
osm-file.
My first attempt was this:

osmosis-0.35.1/bin/osmosis --rx input.osm --t outPipe.1=points \
--wk keyList="addr:interpolation,addr:housenumber" --un --s \
--nk inPipe.0=points keyList="addr:housenumber" --s \
--m --wx addr.osm

The problem is that osmosis never finishes. I wonder why.
No error, no writing, no CPU-usage - it just waits and does nothing.
There must be something wrong with the --m I guess, but I don't know what.

This version does what I want, but it has to write out XML files and read 
them back in:

osmosis-0.35.1/bin/osmosis --rx input.osm --t outPipe.1=points \
--wk keyList="addr:interpolation,addr:housenumber" --un --s \
--wx addr_lines.osm \
--nk inPipe.0=points keyList="addr:housenumber" --s --wx addr_points.osm

osmosis-0.35.1/bin/osmosis --rx addr_lines.osm --rx addr_points.osm --m \
--wx addr.osm

Is this behaviour intended? Can somebody give me a hint on how to do it better?
I have no idea what is wrong.

Thanks
Christoph


