[
https://issues.apache.org/jira/browse/LUCENE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227055#comment-13227055
]
Robert Muir commented on LUCENE-3841:
-------------------------------------
{quote}
Separately: maybe SnowballAnalyzer is too heavy...? Does it have some static
data that ought to be loaded once and shared across analyzers... but isn't
today?
{quote}
I think the analyzers are going to be heavy.
If we start going down the path of trying to speed up their instantiation time,
then I vote to remove reusable tokenstreams completely.
That is: I don't think we should suffer the 'worst of both worlds'. Either we
go through all the effort to make things reusable, or we don't
and instead worry about instantiation time, etc.
> CloseableThreadLocal does not work well with Tomcat thread pooling
> ------------------------------------------------------------------
>
> Key: LUCENE-3841
> URL: https://issues.apache.org/jira/browse/LUCENE-3841
> Project: Lucene - Java
> Issue Type: Bug
> Components: core/other
> Affects Versions: 3.5
> Environment: Lucene/Tika/Snowball running in a Tomcat web application
> Reporter: Matthew Bellew
> Assignee: Michael McCandless
> Fix For: 3.6, 4.0
>
>
> We tracked down a large memory leak (effectively a leak anyway) caused
> by how Analyzer uses CloseableThreadLocal.
> CloseableThreadLocal.hardRefs holds references to Thread objects as
> keys. The problem is that it only frees these references in the set()
> method, and SnowballAnalyzer will only call set() when it is used by a
> NEW thread.
> The problem scenario is as follows:
> The server experiences a spike in usage (say by robots or whatever)
> and many threads are created and referenced by
> CloseableThreadLocal.hardRefs. The server quiesces and lets many of
> these threads expire normally. Now we have a smaller, but adequate
> thread pool. So CloseableThreadLocal.set() may not be called by
> SnowballAnalyzer (via Analyzer) for a _long_ time. The purge code is
> never called, and these threads along with their thread local storage
> (lucene related or not) are never cleaned up.
> I think calling the purge code in both get() and set() would have
> avoided this problem, but is potentially expensive. Perhaps using
> WeakHashMap instead of HashMap may also have helped. WeakHashMap
> purges on get() and put(). So this might be an efficient way to
> clean up threads in get(), while set() might do the more expensive
> Map.keySet() iteration.
> Our current workaround is to not share SnowballAnalyzer instances
> among HTTP searcher threads. We open and close one on every request.
> Thanks,
> Matt
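The scenario in the report can be reproduced with a small model. The sketch below is not Lucene's CloseableThreadLocal; PerThreadHolder, PurgeDemo and the purgeOnGet flag are hypothetical names made up for illustration. It assumes only what the description states: values are held in a map keyed by Thread, and dead threads are purged from that map in set(). The flag adds the purge to get() as well, which is one of the fixes suggested above (the WeakHashMap variant is not shown).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

/** Hypothetical, simplified stand-in for a per-thread value holder. */
class PerThreadHolder<T> {
    // Hard references keyed by Thread, as described for hardRefs in the report.
    private final Map<Thread, T> hardRefs = new HashMap<>();
    private final boolean purgeOnGet;

    PerThreadHolder(boolean purgeOnGet) {
        this.purgeOnGet = purgeOnGet;
    }

    synchronized T get() {
        if (purgeOnGet) {
            purge(); // the suggested fix: also clean up on reads
        }
        return hardRefs.get(Thread.currentThread());
    }

    synchronized void set(T value) {
        hardRefs.put(Thread.currentThread(), value);
        purge(); // reported behaviour: cleanup happens only here
    }

    synchronized int size() {
        return hardRefs.size();
    }

    // Drop entries whose owning thread has terminated.
    private void purge() {
        hardRefs.keySet().removeIf(t -> !t.isAlive());
    }
}

class PurgeDemo {
    /** Simulates a usage spike of 8 threads, then a quiet period with reads only. */
    static int entriesAfterSpike(boolean purgeOnGet) {
        PerThreadHolder<String> holder = new PerThreadHolder<>(purgeOnGet);
        holder.set("main");

        int workers = 8;
        CountDownLatch allSet = new CountDownLatch(workers);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] spike = new Thread[workers];
        try {
            for (int i = 0; i < workers; i++) {
                spike[i] = new Thread(() -> {
                    holder.set("worker");
                    allSet.countDown();
                    try {
                        release.await(); // stay alive until every worker has called set()
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                spike[i].start();
            }
            allSet.await();
            release.countDown();
            for (Thread t : spike) {
                t.join(); // the pool shrinks: all spike threads are now dead
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        holder.get(); // the surviving thread only ever reads from now on
        return holder.size();
    }

    public static void main(String[] args) {
        System.out.println("purge in set() only:    " + entriesAfterSpike(false));
        System.out.println("purge in get() and set(): " + entriesAfterSpike(true));
    }
}
```

With the purge in set() only, the map still holds all 9 entries (the 8 dead spike threads plus the surviving one) after the spike; with the purge also in get(), a single read brings it back to 1. The cost is a scan of the key set on every read, which is the expense the description warns about.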