On 3/5/11 1:49 PM, Grant Ingersoll wrote:
On Feb 22, 2011, at 11:25 AM, Jörn Kottmann wrote:
On 2/22/11 4:55 PM, Grant Ingersoll wrote:
Hi,
I'm using 1.4.3, but it looks like trunk has the same issue. That is, the
POSTaggerME class doesn't appear to be thread safe, but perhaps I am
misreading it. I ask because the bestSequence instance is captured in a
member variable, and the tag and probs methods both access that variable.
The reason I ask is that I want to use this inside of Solr, which is
multithreaded and could be serving up a lot of requests, and I certainly
can't afford to load the model for each request. The fix for this particular
class seems relatively straightforward, at the cost of breaking backward
compatibility of the API (which is a whole other topic).
I haven't looked deeper, but are there any other classes with thread-safety
issues that people can think of?
The components are not thread safe. They must only be called from one thread.
How do you run OpenNLP in multiple threads, then?
The models are thread safe (because they are immutable) and can be shared
between multiple instances of the same component. They are not strictly
immutable, though, so make sure to publish them correctly. Just create one
component instance per thread and share the model instance. In your case I
guess you can just use a ThreadLocal to maintain one instance per thread,
combined with lazy initialization.
This way we are lock free and avoid multi-threading code that is difficult to
understand and test. Making sure that our models are immutable is easy, and
even if we make a mistake there, it is unlikely that a user changes the model
in an application like yours. In the end I believe that we found a really
simple, solid and nice solution for this problem.
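
The one-instance-per-thread approach described above might look roughly like
the following. This is only a sketch: SharedModel, StatefulTagger and
PerThreadTaggers are hypothetical stand-ins invented for illustration, not
OpenNLP classes, and the "model" is just a lookup table. It assumes Java 8 or
later for ThreadLocal.withInitial.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an immutable model; not an OpenNLP class.
final class SharedModel {
    private final Map<String, String> tagByToken;

    SharedModel(Map<String, String> tagByToken) {
        // Defensive copy, wrapped so the map cannot change after construction.
        this.tagByToken = Collections.unmodifiableMap(new HashMap<>(tagByToken));
    }

    String lookup(String token) {
        String tag = tagByToken.get(token);
        return tag != null ? tag : "UNK";
    }
}

// Hypothetical stand-in for a component that is NOT thread safe:
// it keeps the result of the last call in a field.
final class StatefulTagger {
    private final SharedModel model;
    private String[] lastTags = new String[0];

    StatefulTagger(SharedModel model) {
        this.model = model;
    }

    String[] tag(String[] tokens) {
        String[] tags = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            tags[i] = model.lookup(tokens[i]);
        }
        lastTags = tags; // per-instance state, unsafe to share across threads
        return tags.clone();
    }

    String[] lastTags() {
        return lastTags.clone();
    }
}

// One tagger per thread, all of them sharing the same model instance.
final class PerThreadTaggers {
    private final ThreadLocal<StatefulTagger> taggers;

    PerThreadTaggers(SharedModel model) {
        // The supplier runs lazily, on the first get() in each thread.
        this.taggers = ThreadLocal.withInitial(() -> new StatefulTagger(model));
    }

    StatefulTagger get() {
        return taggers.get();
    }
}

final class PerThreadTaggersDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<String, String> lexicon = new HashMap<>();
        lexicon.put("dogs", "NNS");
        lexicon.put("bark", "VBP");
        PerThreadTaggers taggers = new PerThreadTaggers(new SharedModel(lexicon));

        Runnable work = () -> {
            String[] tags = taggers.get().tag(new String[] {"dogs", "bark"});
            System.out.println(Thread.currentThread().getName() + " " + Arrays.toString(tags));
        };
        Thread a = new Thread(work, "worker-a");
        Thread b = new Thread(work, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

The model is loaded once and handed to every tagger through a final field, and
the worker threads are started only after it is fully constructed, which is one
way to publish it correctly. With a thread pool, each worker thread keeps its
own tagger for as long as the thread lives.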
In this particular case, we could still be lock free and thread safe. All we
would need to do is return the best sequence instead of storing it in the
object.
ThreadLocals are not a great way of handling this stuff, IMO. I also wonder
how lightweight it is to create the objects that wrap the models.
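
The suggestion of returning the best sequence rather than storing it could be
sketched as below. TaggedSequence and StatelessTagger are hypothetical names
made up for this illustration, and a toy capitalization rule stands in for the
real model. The sketch only covers the single field mentioned above; whether
the real tagger holds other mutable state is not something it addresses.

```java
import java.util.Arrays;

// Hypothetical immutable result standing in for the "best sequence":
// tags and their probabilities travel together in the return value.
final class TaggedSequence {
    private final String[] tags;
    private final double[] probs;

    TaggedSequence(String[] tags, double[] probs) {
        this.tags = tags.clone();
        this.probs = probs.clone();
    }

    String[] tags() {
        return tags.clone();
    }

    double[] probs() {
        return probs.clone();
    }
}

// Hypothetical tagger with no mutable fields: every call works only on
// its arguments and local variables, so one instance can be shared.
final class StatelessTagger {
    TaggedSequence tag(String[] tokens) {
        String[] tags = new String[tokens.length];
        double[] probs = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            // Toy rule in place of a real model: capitalized tokens are proper nouns.
            boolean capitalized = !tokens[i].isEmpty()
                    && Character.isUpperCase(tokens[i].charAt(0));
            tags[i] = capitalized ? "NNP" : "NN";
            probs[i] = capitalized ? 0.9 : 0.6;
        }
        return new TaggedSequence(tags, probs);
    }
}

final class StatelessTaggerDemo {
    public static void main(String[] args) {
        StatelessTagger tagger = new StatelessTagger(); // could be shared by all threads
        TaggedSequence result = tagger.tag(new String[] {"Solr", "server"});
        System.out.println(Arrays.toString(result.tags()) + " " + Arrays.toString(result.probs()));
    }
}
```

The API cost is visible here: a caller that used to call tag and then probs
on the tagger now has to read both from the returned object, which is the
backward-compatibility break mentioned earlier in the thread.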
Actually, I do not understand how that would work. Can you please elaborate
a little here?
Jörn