[ https://issues.apache.org/jira/browse/LUCENE-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16452274#comment-16452274 ]
Alan Woodward commented on LUCENE-8273:
---------------------------------------

Here's an updated patch:
* now works with wrapped filters that emit more than one token (thanks David!)
* renamed to ConditionalTokenFilter and the logic reversed (thanks Robert!)
* cleaned up all the logic around reset(), close() and end()
* integrated into testRandomChains.

This last one is a bit clunky, as this TokenFilter won't work with filters that consume more than one token at a time, e.g. ShingleFilter or SynonymGraphFilter. At the moment I have a blacklist, but there may be a better way of isolating that, preferably one that throws errors when you build the TokenStream. Speak up if you have any suggestions.

I do like the idea of integrating things into CustomAnalyzer, will look at that next.

> Add a BypassingTokenFilter
> --------------------------
>
>                 Key: LUCENE-8273
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8273
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8273.patch, LUCENE-8273.patch
>
>
> Spinoff of LUCENE-8265. It would be useful to be able to wrap a TokenFilter
> in such a way that it could optionally be bypassed based on the current state
> of the TokenStream. This could be used to, for example, only apply
> WordDelimiterFilter to terms that contain hyphens.
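
For readers following the thread, the idea of applying a wrapped filter only to some tokens can be sketched in plain Java. This is an illustration of the concept only, not the patch and not Lucene's TokenStream API: the class `ConditionalFilterSketch`, the methods `conditionally` and `splitOnHyphens`, and the predicate name `shouldFilter` are all hypothetical names chosen for this example. The wrapped step here sees exactly one token at a time and may emit several, which mirrors the point above that multi-token output works while filters that consume more than one token at a time do not fit this model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Plain-Java illustration only: these names are hypothetical and are NOT Lucene's API.
final class ConditionalFilterSketch {

    /**
     * Applies the wrapped step to tokens for which shouldFilter is true;
     * all other tokens pass through unchanged.
     */
    static List<String> conditionally(List<String> tokens,
                                      Predicate<String> shouldFilter,
                                      Function<String, List<String>> wrapped) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (shouldFilter.test(token)) {
                // The wrapped step may emit more than one token per input token.
                out.addAll(wrapped.apply(token));
            } else {
                out.add(token);
            }
        }
        return out;
    }

    /** Stand-in for a word-delimiter style filter: splits a token on hyphens. */
    static List<String> splitOnHyphens(String token) {
        List<String> parts = new ArrayList<>();
        for (String part : token.split("-")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("wi-fi", "router", "plug-and-play");
        List<String> output = conditionally(
                input,
                token -> token.contains("-"),          // only hyphenated terms are filtered
                ConditionalFilterSketch::splitOnHyphens);
        System.out.println(output); // prints [wi, fi, router, plug, and, play]
    }
}
```

A real TokenFilter additionally has to manage attribute state and the reset(), end() and close() lifecycle, which this sketch deliberately leaves out.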