[ https://issues.apache.org/jira/browse/LUCENE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796027#action_12796027 ]
Uwe Schindler commented on LUCENE-2183:
---------------------------------------
{quote}
There is only one exception where reflection is used... that is during ctor to
determine if:
- you subclass a tokenizer that implements int-based methods
- you have only implemented char-based methods
- you request VERSION >= 3.1
{quote}
With LUCENE-2188, this is easy and not a performance problem. Just define two
static final fields, one for each char-based method, and check in the ctor
whether this.getClass() overrides either of them. If it does, throw an
exception (UOE, or IAE as in the code below). The result is cached per class,
so further instantiations of the same class will not use reflection anymore:
{code}
// One cached lookup per char-based method; reflection runs once per subclass.
private static final OverrideableMethod<CharTokenizer> isTokenCharMethod =
  new OverrideableMethod<CharTokenizer>(CharTokenizer.class, "isTokenChar", char.class);
private static final OverrideableMethod<CharTokenizer> normalizeMethod =
  new OverrideableMethod<CharTokenizer>(CharTokenizer.class, "normalize", char.class);
...
public CharTokenizer(...) {
  super(...);
  // A distance > 0 means some subclass of CharTokenizer overrides the method.
  if (matchVersion.onOrAfter(Version.LUCENE_31) && (
      isTokenCharMethod.getOverrideDistance(this.getClass()) > 0 ||
      normalizeMethod.getOverrideDistance(this.getClass()) > 0)) {
    throw new IllegalArgumentException(
      "For matchVersion >= LUCENE_31, CharTokenizer subclasses must " +
      "not override isTokenChar(char) or normalize(char).");
  }
}
{code}
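For readers without the LUCENE-2188 patch at hand, here is a minimal, self-contained sketch of how such a cached override check could work. OverrideProbe, BaseTokenizer, LegacyTokenizer and CodePointTokenizer are hypothetical names made up for this illustration; they are not the classes from the patch, and the real OverrideableMethod may differ in its details.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the helper proposed in LUCENE-2188. */
final class OverrideProbe<C> {
  private final Class<C> baseClass;
  private final String name;
  private final Class<?>[] params;
  // Per-class cache: reflection runs only the first time a class is seen.
  private final Map<Class<?>, Integer> cache = new ConcurrentHashMap<>();

  OverrideProbe(Class<C> baseClass, String name, Class<?>... params) {
    this.baseClass = baseClass;
    this.name = name;
    this.params = params;
  }

  /** 0 if the base implementation is inherited, > 0 if a subclass overrides it. */
  int getOverrideDistance(Class<? extends C> subclass) {
    return cache.computeIfAbsent(subclass, this::reflectDistance);
  }

  private int reflectDistance(Class<?> subclass) {
    boolean overridden = false;
    int distance = 0;
    // Walk from the concrete class up to (but excluding) the base class.
    for (Class<?> c = subclass; c != null && c != baseClass; c = c.getSuperclass()) {
      if (!overridden) {
        try {
          Method m = c.getDeclaredMethod(name, params);
          overridden = (m != null);
        } catch (NoSuchMethodException e) {
          // Not declared at this level; keep walking up.
        }
      }
      if (overridden) distance++;
    }
    return distance;
  }
}

/** Hypothetical base class mirroring the ctor check described above. */
abstract class BaseTokenizer {
  static final OverrideProbe<BaseTokenizer> IS_TOKEN_CHAR =
      new OverrideProbe<>(BaseTokenizer.class, "isTokenChar", char.class);

  protected BaseTokenizer(boolean codePointMode) {
    if (codePointMode && IS_TOKEN_CHAR.getOverrideDistance(getClass()) > 0) {
      throw new IllegalArgumentException(
          "In code point mode, subclasses must not override isTokenChar(char).");
    }
  }

  protected boolean isTokenChar(char c) { throw new UnsupportedOperationException(); }
  protected boolean isTokenChar(int codePoint) { return Character.isLetter(codePoint); }
}

/** Overrides only the char-based method (old style). */
class LegacyTokenizer extends BaseTokenizer {
  LegacyTokenizer(boolean codePointMode) { super(codePointMode); }
  @Override protected boolean isTokenChar(char c) { return Character.isLetter(c); }
}

/** Overrides only the int-based method (new style). */
class CodePointTokenizer extends BaseTokenizer {
  CodePointTokenizer(boolean codePointMode) { super(codePointMode); }
  @Override protected boolean isTokenChar(int cp) { return Character.isLetter(cp); }
}

class OverrideProbeDemo {
  public static void main(String[] args) {
    new CodePointTokenizer(true);  // accepted: no char-based override
    new LegacyTokenizer(false);    // accepted: old behaviour requested
    try {
      new LegacyTokenizer(true);   // rejected by the ctor check
    } catch (IllegalArgumentException expected) {
      System.out.println("rejected: " + expected.getMessage());
    }
  }
}
```

In this sketch the boolean codePointMode plays the role of matchVersion.onOrAfter(Version.LUCENE_31). The cache is keyed by the concrete class, so the getDeclaredMethod lookups happen once per subclass rather than once per instance.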
> Supplementary Character Handling in CharTokenizer
> -------------------------------------------------
>
> Key: LUCENE-2183
> URL: https://issues.apache.org/jira/browse/LUCENE-2183
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Reporter: Simon Willnauer
> Fix For: 3.1
>
> Attachments: LUCENE-2183.patch
>
>
> CharTokenizer is an abstract base class for all Tokenizers operating on a
> character level. Yet, those tokenizers still use char primitives instead of
> int codepoints. CharTokenizer should operate on codepoints and preserve bw
> compatibility.
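
As a small illustration of the problem described in the issue, the following sketch compares char-based and code point-based letter checks on a string containing a supplementary character. The class and method names are made up for this example and do not come from the patch.

```java
final class SupplementaryDemo {
  // "a" followed by U+10400 (DESERET CAPITAL LETTER LONG I), which needs
  // two chars (a surrogate pair) in a Java String.
  static final String TEXT = "a" + new String(Character.toChars(0x10400));

  /** Counts letters by looking at each char in isolation. */
  static int countLettersByChar(String s) {
    int n = 0;
    for (int i = 0; i < s.length(); i++) {
      if (Character.isLetter(s.charAt(i))) n++;
    }
    return n;
  }

  /** Counts letters by decoding full code points. */
  static int countLettersByCodePoint(String s) {
    int n = 0;
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      if (Character.isLetter(cp)) n++;
      i += Character.charCount(cp); // advance by 1 or 2 chars
    }
    return n;
  }

  public static void main(String[] args) {
    System.out.println("by char:       " + countLettersByChar(TEXT));
    System.out.println("by code point: " + countLettersByCodePoint(TEXT));
  }
}
```

The char-based loop sees the two surrogates separately, and neither of them is a letter on its own, so a tokenizer built on isTokenChar(char) would drop or split such characters.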