[
https://issues.apache.org/jira/browse/SOLR-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267272#comment-13267272
]
Uwe Schindler commented on SOLR-3424:
-------------------------------------
Hi David,
I don't really trust that thread so I would really go with the current
solution. In Lucene 4.0, reuse of TokenStreams is *enforced* so there is no
cost at all by the reflection and the createInstance. I also improved the
reflective method call (which has the side effect of being able to print a
useful message when the encoder does not support setMaxLen, the old code
throwed unspecified and unhelpful Exception).
In general, the commons encoders might be thread safe, but we use reflection
and use any class the user provides and other classes implementing the abstract
Encoder interface may not be thread safe.
As it costs us nothing (Class.newInstance() is cheap and only called once a new
thread is used and the SolrAnalyzer class has to create new
TokenStreamComponents), we should stay with the current code. The thread-safety
is then ensured by the Analyzer class (it uses thread-locals to reuse
TokenStreams).
What do you think about my latest patch?
> PhoneticFilterFactory threadsafety bug
> --------------------------------------
>
> Key: SOLR-3424
> URL: https://issues.apache.org/jira/browse/SOLR-3424
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Affects Versions: 3.6, 4.0
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Minor
> Fix For: 4.0
>
> Attachments: SOLR-3424_PhoneticFilterFactory_threadsafety_bug.patch,
> SOLR-3424_PhoneticFilterFactory_threadsafety_bug.patch,
> SOLR-3424_PhoneticFilterFactory_threadsafety_bug.patch,
> SOLR-3424_PhoneticFilterFactory_threadsafety_bug.patch,
> SOLR-3424_PhoneticFilterFactory_threadsafety_bug.patch,
> SOLR-3424_PhoneticFilterFactory_threadsafety_bug.patch
>
>
> PhoneticFilterFactory has a static HashMap registry mapping an encoder name
> to an implementation. There is a ReentrantLock used when the map is modified
> (when the encoder config specifies a class name). However, this map, which
> can be accessed by multiple indexing threads, isn't guarded on any of the
> reads, which isn't just the common path but also the error messages which
> dump the registry into the error message.
> I realize the likelihood of a problem is extremely slim, but a bug's a bug.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]