Re: Thread Safety of POSTaggerME

Jörn Kottmann Sat, 05 Mar 2011 05:25:08 -0800

On 3/5/11 1:49 PM, Grant Ingersoll wrote:

On Feb 22, 2011, at 11:25 AM, Jörn Kottmann wrote:

On 2/22/11 4:55 PM, Grant Ingersoll wrote:

Hi,

I'm using 1.4.3, but it looks like trunk has the same issue.  That is, the
POSTaggerME class doesn't appear to be thread safe, but perhaps I am
misreading it.  I ask because the bestSequence instance is captured in a
member variable, and the tag and probs methods both access that member.  The
reason I ask is that I want to use this inside of Solr, which is multithreaded
and could be serving a lot of requests, and I certainly can't afford to load
the model for each request.  The fix for this particular class seems
relatively straightforward, at the cost of breaking back compatibility of the
API (which is a whole other topic).

I haven't looked deeper, but are there any other classes that I should be aware 
of w/ thread safety that people can think of?

The components are not thread safe. They must only be called from one thread.
So how do you run OpenNLP in multiple threads?

The models are thread safe because they are effectively immutable, and can be
shared between multiple instances of the same component. They are not strictly
immutable, so make sure to publish them safely. Just create one component
instance per thread and share the model instance. In your case I guess you can
just use ThreadLocal, combined with lazy initialization, to maintain one
instance per thread.
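
A minimal sketch of that per-thread pattern, under stated assumptions: the
classes SharedModel, StatefulTagger and PerThreadTaggers are hypothetical
stand-ins written for illustration, not OpenNLP classes, and the tagging rule
is a placeholder. Only ThreadLocal.withInitial (Java 8+) is a real library call.

```java
// Hypothetical stand-in for an immutable model that is loaded once and shared.
final class SharedModel {
    private final String defaultTag;

    SharedModel(String defaultTag) {
        this.defaultTag = defaultTag;
    }

    // Placeholder rule, not a real tagging model.
    String tagFor(String token) {
        boolean capitalized = !token.isEmpty() && Character.isUpperCase(token.charAt(0));
        return capitalized ? "NNP" : defaultTag;
    }
}

// Hypothetical stand-in for a component that keeps per-call state in a field
// and therefore must only be used from one thread.
final class StatefulTagger {
    private final SharedModel model;
    private String[] lastTags = new String[0]; // mutable state, not thread safe

    StatefulTagger(SharedModel model) {
        this.model = model;
    }

    String[] tag(String[] tokens) {
        String[] tags = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            tags[i] = model.tagFor(tokens[i]);
        }
        lastTags = tags;
        return tags.clone();
    }

    String[] lastTags() {
        return lastTags.clone();
    }
}

// One shared model, one lazily created tagger per thread.
final class PerThreadTaggers {
    private final ThreadLocal<StatefulTagger> taggers;

    PerThreadTaggers(SharedModel model) {
        // The supplier runs the first time each thread calls get().
        this.taggers = ThreadLocal.withInitial(() -> new StatefulTagger(model));
    }

    StatefulTagger get() {
        return taggers.get();
    }
}
```

Each request thread would call get() and use the returned tagger only within
that thread. In a long-lived thread pool the per-thread instances stay alive as
long as their threads do.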

This way we are lock free and avoid multi-threading code that is difficult to
understand and test. Making sure that our models are immutable is easy, and
even if we make a mistake there, it is unlikely that a user changes the model
in an application like yours. In the end I believe that we found a really
simple, solid and nice solution for this problem.


In this particular case, we could still be lock free and thread safe.  All we
would need to do is to return the best sequence instead of storing it in
the object.
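
One way to read this suggestion, as a sketch only: the tagging call returns an
immutable result that carries both the tags and their probabilities, so nothing
is left behind in the tagger between calls. TagResult and StatelessTagger are
hypothetical names, this is not the POSTaggerME API, and the scoring rule is a
placeholder.

```java
// Hypothetical immutable result holding the outcome of a single tag call.
final class TagResult {
    private final String[] tags;
    private final double[] probs;

    TagResult(String[] tags, double[] probs) {
        // Defensive copies keep the result immutable.
        this.tags = tags.clone();
        this.probs = probs.clone();
    }

    String[] tags() {
        return tags.clone();
    }

    double[] probs() {
        return probs.clone();
    }
}

// Hypothetical tagger with no mutable fields: concurrent calls cannot
// overwrite each other's results because every call returns its own.
final class StatelessTagger {
    TagResult tag(String[] tokens) {
        String[] tags = new String[tokens.length];
        double[] probs = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            // Placeholder scoring, not a real model.
            boolean capitalized = !tokens[i].isEmpty()
                    && Character.isUpperCase(tokens[i].charAt(0));
            tags[i] = capitalized ? "NNP" : "NN";
            probs[i] = capitalized ? 0.9 : 0.6;
        }
        return new TagResult(tags, probs);
    }
}
```

A caller would read result.tags() and result.probs() from the same returned
object, instead of calling a separate probs method on the tagger afterwards.
That change in call shape is where the API compatibility break would come from.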

ThreadLocals are not a great way of handling this stuff, IMO.  I also wonder
how lightweight it is to create the objects that wrap the models.

Actually I do not understand how that would be, can you please elaborate a little here?


Jörn
