[osmosis-dev] How to use EntityBuffer

2009-04-20 Thread marcus.wolschon

Hello,

I am trying to use the EntityBuffer class but seem to be doing something
wrong here.

I did:

XMReader task = new ...
EntityBuffer  buffer = new EntityBuffer(BUFFERCAPACITY);

task.setSink(buffer);
buffer.setSink(sink);

buffer.run();
task.run();


It seems that this hangs indefinitely waiting for entities to
come in from the buffer.
Do I need to call some special method or use a special
manager-class to make the buffer start its Executor or Thread?

Marcus

___
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev


Re: [osmosis-dev] How to use EntityBuffer

2009-04-20 Thread Brett Henderson
marcus.wolsc...@googlemail.com wrote:
> Hello,
>
> I am trying to use the EntityBuffer class but seem to be doing something
> wrong here.
>
> I did:
>
> XMReader task = new ...
> EntityBuffer  buffer = new EntityBuffer(BUFFERCAPACITY);
>
> task.setSink(buffer);
> buffer.setSink(sink);
>
> buffer.run();
> task.run();
>
> It seems that this hangs indefinitely waiting for entities to
> come in from the buffer.
> Do I need to call some special method or use a special
> manager-class to make the buffer start its Executor or Thread?

The key reason for using EntityBuffer is to separate processing into 
multiple threads.  It is the key class implementing the --buffer task, 
whose main reason for existing is to spread the pipeline execution 
across multiple threads and therefore CPU cores.

So, the reason you're having a problem is that the buffer.run() 
method must be run in a separate thread from the task feeding data into 
it.  The input thread calling the process method will block when the 
buffer becomes full.  Likewise, the buffer.run thread will block until 
data becomes available.  If you call both from the same thread, the first 
one will never complete.
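
The blocking behaviour described above can be illustrated without Osmosis at all. The following is a minimal sketch that uses java.util.concurrent.ArrayBlockingQueue as a stand-in for the buffer; the class name BufferThreadingSketch, the pump method and the END sentinel are hypothetical names made up for this example and are not part of Osmosis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Stand-in for a bounded buffer between two threads; NOT the Osmosis EntityBuffer. */
class BufferThreadingSketch {
    // Hypothetical end-of-stream marker; real data in this demo is never negative.
    private static final Integer END = Integer.MIN_VALUE;

    /** Pushes count items through a buffer of the given capacity and returns what came out. */
    static List<Integer> pump(int count, int capacity) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> received = new ArrayList<>();

        // Output side (the role buffer.run() plays): take() blocks until data is available.
        Thread output = new Thread(() -> {
            try {
                while (true) {
                    Integer item = queue.take();
                    if (item.equals(END)) {
                        return;
                    }
                    received.add(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "buffer-output");
        output.start();

        // Input side (the role task.run() plays): put() blocks while the queue is full,
        // so this loop only makes progress because another thread is draining the queue.
        for (int i = 0; i < count; i++) {
            queue.put(i);
        }
        queue.put(END);

        output.join(); // after join() it is safe to read 'received' from this thread
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(pump(1000, 16).size()); // prints 1000
    }
}
```

If the output loop were instead called on the same thread before the input loop, take() would wait forever on an empty queue, which matches the hang in the original snippet. Applied to that snippet, the fix would presumably be to start buffer.run() on its own thread (for example new Thread(buffer::run).start()) before calling task.run().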

Internally there are actually three buffers.  The first is 
unsynchronised and allows the input thread to accumulate a number of 
entities before gaining the main lock.  The second is synchronised and 
is shared between the two threads.  The third is unsynchronised and 
allows the output thread to retrieve a number of records from the second 
buffer reducing the number of times the lock has to be obtained.
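
As a rough illustration of that three-buffer layout, here is a simplified sketch. It is not the Osmosis implementation; the class name ThreeStageBufferSketch, its methods, and the capacity and chunk sizes are all assumptions chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Illustrative three-stage buffer; names and sizes are assumptions, not Osmosis code. */
class ThreeStageBufferSketch<T> {
    private final int capacity;
    private final int chunkSize;
    private final List<T> inputChunk = new ArrayList<>(); // stage 1: input thread only, no lock
    private final Queue<T> shared = new ArrayDeque<>();   // stage 2: guarded by 'this'
    private boolean complete = false;                     // guarded by 'this'

    ThreeStageBufferSketch(int capacity, int chunkSize) {
        if (chunkSize < 1 || chunkSize > capacity) {
            throw new IllegalArgumentException("need 1 <= chunkSize <= capacity");
        }
        this.capacity = capacity;
        this.chunkSize = chunkSize;
    }

    /** Input thread: accumulate locally and take the lock only once per chunk. */
    void add(T item) throws InterruptedException {
        inputChunk.add(item);
        if (inputChunk.size() >= chunkSize) {
            flush();
        }
    }

    /** Input thread: push any remaining items and signal end of stream. */
    void complete() throws InterruptedException {
        flush();
        synchronized (this) {
            complete = true;
            notifyAll();
        }
    }

    private void flush() throws InterruptedException {
        synchronized (this) {
            while (shared.size() + inputChunk.size() > capacity) {
                wait(); // shared buffer full: block until the output thread drains it
            }
            shared.addAll(inputChunk);
            notifyAll();
        }
        inputChunk.clear();
    }

    /** Output thread: stage 3 is the returned list, filled under a single lock acquisition.
     *  An empty list means the stream has ended. */
    List<T> takeChunk() throws InterruptedException {
        List<T> outputChunk = new ArrayList<>();
        synchronized (this) {
            while (shared.isEmpty() && !complete) {
                wait(); // nothing to read yet: block until the input thread flushes
            }
            while (!shared.isEmpty() && outputChunk.size() < chunkSize) {
                outputChunk.add(shared.remove());
            }
            notifyAll();
        }
        return outputChunk;
    }

    /** Demo: one input thread feeds n integers; the calling thread drains and counts them. */
    static int demo(int n) throws InterruptedException {
        ThreeStageBufferSketch<Integer> buffer = new ThreeStageBufferSketch<>(64, 8);
        Thread input = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    buffer.add(i);
                }
                buffer.complete();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "input");
        input.start();
        int count = 0;
        for (List<Integer> c = buffer.takeChunk(); !c.isEmpty(); c = buffer.takeChunk()) {
            count += c.size();
        }
        input.join();
        return count;
    }
}
```

With a chunk size of 8, the lock is taken roughly once per eight items on each side rather than once per item, which is the saving the unsynchronised outer buffers are there to provide.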

If all you want is a way of buffering data within a single thread then 
EntityBuffer isn't what you're looking for.  That said, I've never found 
a case where buffering without an additional thread improves 
performance.  I've always had more luck using file buffers, database 
transactions, multi-row inserts, etc. to achieve that.

There is a class called TaskRunner, a Thread class that allows you to 
detect when errors occur within the thread.  It might be useful for 
you.  If you want an example of its use, check out the 
ActiveTaskManager, which is used to run all tasks requiring their own 
thread (i.e. all reading tasks and some others), and the ChangeDownloader 
task, which is used for concurrently reading and merging many files into 
a single change stream.
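
The general idea of a thread that records its own failure can be sketched as follows. This is not the TaskRunner source; ErrorCapturingThread and its methods are hypothetical names used only to show the pattern.

```java
/** Hypothetical thread wrapper that remembers whether its task failed; not Osmosis code. */
class ErrorCapturingThread extends Thread {
    private final Runnable task;
    private volatile Throwable failure; // set only if the task throws

    ErrorCapturingThread(Runnable task, String name) {
        super(name);
        this.task = task;
    }

    @Override
    public void run() {
        try {
            task.run();
        } catch (Throwable t) {
            failure = t; // keep the error so the owning thread can inspect it after join()
        }
    }

    boolean isSuccessful() {
        return failure == null;
    }

    Throwable getFailure() {
        return failure;
    }

    /** Runs the task on its own thread, waits for it, and reports whether it succeeded. */
    static boolean runAndCheck(Runnable task) throws InterruptedException {
        ErrorCapturingThread thread = new ErrorCapturingThread(task, "worker");
        thread.start();
        thread.join();
        return thread.isSuccessful();
    }
}
```

A caller that starts several such threads can join each one and then check isSuccessful() to decide whether the whole pipeline should be reported as failed.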

Brett

