[ https://issues.apache.org/jira/browse/LUCENE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794964#action_12794964 ]
Robert Muir commented on LUCENE-2183:
-------------------------------------

I thought about this some, but I am worried about one thing.

Consider LetterTokenizer, which is a non-final subclass of CharTokenizer. Let's say you write a LetterAndNumberTokenizer that extends LetterTokenizer, but you do not implement the int-based method:

{code}
public boolean isTokenChar(char c) {
  return super.isTokenChar(c) || Character.isDigit(c);
}
{code}

We have fixed LetterTokenizer so that it has isTokenChar(int). That means if someone uses this LetterAndNumberTokenizer with Version.LUCENE_31, it will not work: instead of throwing UOE, it will call the int-based method inherited from LetterTokenizer and silently discard numbers.

Of course it will work correctly with Version.LUCENE_30, so it is not a back-compat problem. But with LUCENE_31 it will not throw UOE, and it will silently behave incorrectly until the 'int' method is implemented.

So I think this is a problem in this design, and I do not know how to fix it without reflection.

> Supplementary Character Handling in CharTokenizer
> -------------------------------------------------
>
> Key: LUCENE-2183
> URL: https://issues.apache.org/jira/browse/LUCENE-2183
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Reporter: Simon Willnauer
> Fix For: 3.1
>
> Attachments: LUCENE-2183.patch
>
>
> CharTokenizer is an abstract base class for all Tokenizers operating on a
> character level. Yet, those tokenizers still use char primitives instead of
> int codepoints. CharTokenizer should operate on codepoints and preserve bw
> compatibility.
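
To make the concern in the comment concrete, here is a minimal, self-contained sketch of the silent fall-through. It does not use the real Lucene classes: SketchCharTokenizer, SketchLetterTokenizer, SketchLetterAndNumberTokenizer, and the boolean useCodePoints flag (standing in for Version.LUCENE_31 vs. Version.LUCENE_30) are all hypothetical names, and the dispatch logic is only an assumption about how the patch chooses between the char-based and int-based methods. Character.isDigit stands in for the number check.

```java
import java.util.ArrayList;
import java.util.List;

public class UoeFallthroughSketch {

    // Hypothetical stand-in for CharTokenizer; NOT the real Lucene class.
    static abstract class SketchCharTokenizer {
        // Assumption: true models Version.LUCENE_31, false models Version.LUCENE_30.
        private final boolean useCodePoints;

        SketchCharTokenizer(boolean useCodePoints) {
            this.useCodePoints = useCodePoints;
        }

        // Old char-based hook.
        protected boolean isTokenChar(char c) {
            throw new UnsupportedOperationException("char-based isTokenChar not implemented");
        }

        // New int-based hook; the default is meant to alert subclasses that forgot it.
        protected boolean isTokenChar(int codePoint) {
            throw new UnsupportedOperationException("int-based isTokenChar not implemented");
        }

        List<String> tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            int i = 0;
            while (i < text.length()) {
                int cp = text.codePointAt(i);
                char ch = text.charAt(i);
                // Overload resolution picks the int or the char method by argument type.
                boolean keep = useCodePoints ? isTokenChar(cp) : isTokenChar(ch);
                if (keep) {
                    if (useCodePoints) {
                        current.appendCodePoint(cp);
                    } else {
                        current.append(ch);
                    }
                } else if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                i += useCodePoints ? Character.charCount(cp) : 1;
            }
            if (current.length() > 0) {
                tokens.add(current.toString());
            }
            return tokens;
        }
    }

    // Already "fixed": implements both the char-based and the int-based method.
    static class SketchLetterTokenizer extends SketchCharTokenizer {
        SketchLetterTokenizer(boolean useCodePoints) {
            super(useCodePoints);
        }

        @Override
        protected boolean isTokenChar(char c) {
            return Character.isLetter(c);
        }

        @Override
        protected boolean isTokenChar(int codePoint) {
            return Character.isLetter(codePoint);
        }
    }

    // User subclass that only overrides the char-based method.
    static class SketchLetterAndNumberTokenizer extends SketchLetterTokenizer {
        SketchLetterAndNumberTokenizer(boolean useCodePoints) {
            super(useCodePoints);
        }

        @Override
        protected boolean isTokenChar(char c) {
            return super.isTokenChar(c) || Character.isDigit(c);
        }
    }

    public static void main(String[] args) {
        // Char-based path: digits are kept.
        System.out.println(new SketchLetterAndNumberTokenizer(false).tokenize("abc123 def"));
        // prints [abc123, def]

        // Int-based path: no UOE, because the parent's int method is inherited,
        // so digits are silently dropped.
        System.out.println(new SketchLetterAndNumberTokenizer(true).tokenize("abc123 def"));
        // prints [abc, def]
    }
}
```

The second call is the problematic case: the UOE in the base class never fires, because the intermediate class already supplies an int-based implementation. Detecting that the most-derived class overrides only the char variant would need something like a reflective check on which class declares each method, which is the reflection dependency the comment hopes to avoid.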