[ https://issues.apache.org/jira/browse/LUCENE-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16620617#comment-16620617 ]
Alan Woodward commented on LUCENE-8498:
---------------------------------------

Precommit has caught an interesting wrinkle here, in that CharTokenizer also allows you to combine tokenization and normalization. As currently written, a CharTokenizer combined with a normalizer will not have its normalization applied when Analyzer.normalize() is called. Should we also remove the normalization functions from CharTokenizer?

cc [~thetaphi]

> Deprecate/Remove LowerCaseTokenizer
> -----------------------------------
>
>                 Key: LUCENE-8498
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8498
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8498.patch
>
>
> LowerCaseTokenizer combines tokenization and filtering in a way that prevents
> us improving the normalization API. We should deprecate and remove it, as it
> can be replaced simply with a LetterTokenizer and LowerCaseFilter.
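
To make the wrinkle concrete, here is a minimal plain-Java sketch of the mismatch. It is a simplified model and does not use Lucene's classes: NormalizationSketch, tokenize, analyze and normalize are hypothetical names, and the model assumes (as the comment above implies) that the normalization path runs only the filter chain and never the tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Simplified model, NOT Lucene's API: every name here is hypothetical.
final class NormalizationSketch {

    // Splits on non-letters. When lowerCaseInTokenizer is true, case folding
    // happens while splitting, like a tokenizer with a built-in normalizer.
    static List<String> tokenize(String text, boolean lowerCaseInTokenizer) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(lowerCaseInTokenizer ? Character.toLowerCase(cp) : cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Index-time path: the tokenizer runs first, then the filter on each token.
    static List<String> analyze(String text, boolean lowerCaseInTokenizer,
                                UnaryOperator<String> filter) {
        List<String> out = new ArrayList<>();
        for (String token : tokenize(text, lowerCaseInTokenizer)) {
            out.add(filter.apply(token));
        }
        return out;
    }

    // Normalization path (assumption): the tokenizer is bypassed entirely,
    // so only the filter is applied to the term.
    static String normalize(String term, UnaryOperator<String> filter) {
        return filter.apply(term);
    }

    public static void main(String[] args) {
        UnaryOperator<String> lower = s -> s.toLowerCase(Locale.ROOT);

        // Combined tokenizer + normalizer: indexed terms are folded,
        // but a normalized query term keeps its original case.
        System.out.println(analyze("Foo Bar", true, UnaryOperator.identity())); // [foo, bar]
        System.out.println(normalize("Foo", UnaryOperator.identity()));         // Foo

        // Plain letter tokenizer + separate lower-case filter: both paths agree.
        System.out.println(analyze("Foo Bar", false, lower)); // [foo, bar]
        System.out.println(normalize("Foo", lower));          // foo
    }
}
```

In this model the combined variant indexes "foo" but normalizes a query term "Foo" to "Foo", so the two would not match; moving the case folding into a separate filter keeps both paths consistent, which is the same split the issue proposes with LetterTokenizer and LowerCaseFilter.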