[
https://issues.apache.org/jira/browse/LUCENE-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2372:
----------------------------------
Attachment: LUCENE-2372.patch
Here is a first patch for the core TokenStreams. Tests have not been changed yet.
The following things were additionally fixed:
- StandardAnalyzer was made final (a backwards break; we forgot to make it final
in the 3.0 TS finalization issue). This enabled me to subclass
StopwordAnalyzerBase and remove heavy code duplication. The original code also
contained a bug in the tokenStream method (no setReplaceInvalidAcronym call),
which was handled correctly in reusableTokenStream. Now both are correct.
I will post further patches for core.
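
To illustrate why moving to a shared base class removes this kind of bug, here is a minimal, self-contained sketch. It does not use the Lucene API: SketchTokenizer, SketchAnalyzerBase and SketchStandardAnalyzer are hypothetical stand-ins, and only the setReplaceInvalidAcronym name is taken from the description above. The point is that both entry points build their tokenizer through one method, so a setting cannot be applied on one path and forgotten on the other.

```java
// Hypothetical stand-in for a configurable tokenizer; not a Lucene class.
final class SketchTokenizer {
    private boolean replaceInvalidAcronym;
    private String input = "";

    void setReplaceInvalidAcronym(boolean value) { this.replaceInvalidAcronym = value; }
    boolean isReplaceInvalidAcronym() { return replaceInvalidAcronym; }

    // Points the tokenizer at new input so an instance can be reused.
    void reset(String newInput) { this.input = newInput; }
    String input() { return input; }
}

// Hypothetical base class: tokenStream() and reusableTokenStream() both
// obtain their tokenizer from createComponents(), the single place where
// configuration is applied.
abstract class SketchAnalyzerBase {
    private SketchTokenizer cached;

    protected abstract SketchTokenizer createComponents();

    // Always builds a fresh tokenizer.
    final SketchTokenizer tokenStream(String input) {
        SketchTokenizer tokenizer = createComponents();
        tokenizer.reset(input);
        return tokenizer;
    }

    // Builds the tokenizer once, then resets and returns the cached instance.
    final SketchTokenizer reusableTokenStream(String input) {
        if (cached == null) {
            cached = createComponents();
        }
        cached.reset(input);
        return cached;
    }
}

// Hypothetical analyzer: the acronym setting is applied in exactly one method.
final class SketchStandardAnalyzer extends SketchAnalyzerBase {
    private final boolean replaceInvalidAcronym;

    SketchStandardAnalyzer(boolean replaceInvalidAcronym) {
        this.replaceInvalidAcronym = replaceInvalidAcronym;
    }

    @Override
    protected SketchTokenizer createComponents() {
        SketchTokenizer tokenizer = new SketchTokenizer();
        tokenizer.setReplaceInvalidAcronym(replaceInvalidAcronym);
        return tokenizer;
    }
}
```

With this shape, a subclass only describes how the components are built; whether a caller asks for a fresh or a reused stream no longer affects how the tokenizer is configured.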
> Replace deprecated TermAttribute by new CharTermAttribute
> ---------------------------------------------------------
>
> Key: LUCENE-2372
> URL: https://issues.apache.org/jira/browse/LUCENE-2372
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 3.1
> Reporter: Uwe Schindler
> Fix For: 3.1
>
> Attachments: LUCENE-2372.patch
>
>
> After LUCENE-2302 is merged to trunk with flex, we need to carry over all
> tokenizers and consumers of the TokenStreams to the new CharTermAttribute.
> We should also think about adding an AttributeFactory that creates a subclass
> of CharTermAttributeImpl that returns collation keys from its toBytesRef()
> accessor. CollationKeyFilter would then be obsolete: you could simply switch
> any TokenStream to indexing only CollationKeys by changing the attribute
> implementation.
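
The attribute-swapping idea in the issue description could look roughly like the sketch below. It is an illustration only and does not use the Lucene classes: SketchTermAttribute, SketchCollatedTermAttribute and SketchAttributeFactory are hypothetical stand-ins, and toBytes() returning a byte[] stands in for the toBytesRef() accessor named above. Only java.text.Collator is a real API here. The token stream keeps writing characters as before; the factory decides which bytes end up in the index.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;

// Hypothetical stand-in for a term attribute; not the Lucene class.
class SketchTermAttribute {
    protected final StringBuilder term = new StringBuilder();

    // Replaces the current term text.
    SketchTermAttribute setTerm(CharSequence text) {
        term.setLength(0);
        term.append(text);
        return this;
    }

    // Default behavior: the indexed bytes are the UTF-8 form of the term.
    byte[] toBytes() {
        return term.toString().getBytes(StandardCharsets.UTF_8);
    }
}

// Hypothetical subclass: same character API, but the indexed bytes are the
// collation key of the term instead of its UTF-8 encoding.
final class SketchCollatedTermAttribute extends SketchTermAttribute {
    private final Collator collator;

    SketchCollatedTermAttribute(Collator collator) {
        this.collator = collator;
    }

    @Override
    byte[] toBytes() {
        return collator.getCollationKey(term.toString()).toByteArray();
    }
}

// Hypothetical factory: choosing the implementation here changes what every
// consumer of the attribute indexes, without adding a filter to the chain.
interface SketchAttributeFactory {
    SketchTermAttribute createTermAttribute();
}
```

A caller would pass something like `() -> new SketchCollatedTermAttribute(collator)` as the factory to get collation keys, or `SketchTermAttribute::new` to keep plain terms.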