Re: Thread-safe versions of some of the tools

Thilo Goetz Thu, 12 Jan 2017 00:49:46 -0800

On 11/01/2017 22:51, Joern Kottmann wrote:

On Wed, 2017-01-11 at 11:05 +0100, Thilo Goetz wrote:

in a recent project, I was using SentenceDetectorME, TokenizerME and
POSTaggerME. It turns out that none of those is thread safe. This is
because the classification probabilities for the last tag() call (for
example) are stored in a member variable and can be retrieved by a
separate API call.

The POSTagger already has the Sequence object to return the result
with probabilities. If we introduced a new method, we could probably
just deprecate the method that retrieves the probs.


Should be a minor change to have an interface that can be thread safe.

[...]

I don't want to muddy the waters, but I had another idea: we could also
add a getThreadLocal() method to the tools we want. You would create a
POSTaggerME (for example) like always, and if you needed a per-thread
version, you could then call getThreadLocal(), which would give you
another POSTaggerME with the same model, per thread. The advantage as I
see it is that the API extension would be conservative (just one method
added), and getting the probabilities would continue to work as before
because you have one instance per thread.

Does that make sense? I'm not sure I'm explaining this in the best
possible manner...
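
To make the idea concrete, here is a minimal sketch in plain Java. It
is only an illustration under some assumptions: SketchTagger is a
hypothetical stand-in and not the real POSTaggerME, the model is reduced
to a string, the tags and probabilities are made up, and
getThreadLocal() is the proposed method, which the tools do not have
today. The per-thread instance is created lazily with a
java.lang.ThreadLocal.

```java
import java.util.Arrays;

// Hypothetical stand-in for a tool such as POSTaggerME (not the OpenNLP class).
final class SketchTagger {
    // Stand-in for the immutable model that all instances may share.
    private final String model;
    // Mutable per-call state: the reason one shared instance is not thread safe.
    private double[] lastProbs = new double[0];
    // Lazily creates one extra tagger per calling thread, built on the same model.
    private final ThreadLocal<SketchTagger> perThread;

    SketchTagger(String model) {
        this.model = model;
        this.perThread = ThreadLocal.withInitial(() -> new SketchTagger(model));
    }

    // Proposed addition (assumption): returns the calling thread's own instance.
    SketchTagger getThreadLocal() {
        return perThread.get();
    }

    // Dummy tagging: every token gets the tag "X" and a made-up probability.
    String[] tag(String[] tokens) {
        String[] tags = new String[tokens.length];
        double[] probs = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            tags[i] = "X";
            probs[i] = 1.0 / (1 + tokens[i].length());
        }
        lastProbs = probs; // remembered for a later probs() call
        return tags;
    }

    // Probabilities of the last tag() call made on this instance.
    double[] probs() {
        return lastProbs.clone();
    }

    String model() {
        return model;
    }
}

final class ThreadLocalTaggerDemo {
    public static void main(String[] args) throws InterruptedException {
        SketchTagger shared = new SketchTagger("en-pos-model");
        Runnable work = () -> {
            // Each thread works on its own instance, so probs() is not clobbered.
            SketchTagger mine = shared.getThreadLocal();
            mine.tag(new String[] {"Hello", "world"});
            System.out.println(Thread.currentThread().getName()
                + " " + Arrays.toString(mine.probs()));
        };
        Thread t1 = new Thread(work, "t1");
        Thread t2 = new Thread(work, "t2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

Existing code that calls tag() and then asks for the probabilities
would stay unchanged; only multi-threaded callers would add the
getThreadLocal() call. One thing a real implementation would have to
decide is how the per-thread instances get released, for example in
thread pools with long-lived threads.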


--Thilo
