[ https://issues.apache.org/jira/browse/LUCENE-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451265#comment-16451265 ]
Robert Muir commented on LUCENE-8273:
-------------------------------------

{quote}
I added this to core rather than to the analysis module as it seems to me to be a utility class like FilteringTokenFilter, which is also in core. But I'm perfectly happy to move it to analysis-common if that makes more sense to others.
{quote}

The idea is cool, but I would like to see it more fleshed out (e.g. marked experimental somewhere) before it goes into core/:

* Improved testing: I'd like to see some edge cases tested, such as both the "true" and "false" cases on the final token for end(), etc. What happens there is a little sneaky; I think it should be hooked into TestRandomChains (this should probably be explicitly added to that test, wrapping with a check of random.nextBoolean() or something simple that will test all cases). This may uncover some integration difficulties. In particular, it is not clear to me how some stuff such as end() works correctly in the general case with this filter right now.
* Integration with CustomAnalyzer: as this would add a generic "if" to allow branching in analysis chains (there is an issue somewhere for this), which would be very powerful, it would be good to plumb it into CustomAnalyzer to make sure it can work well with the factory model. This seems doable with the functional interface but needs to be proven out.

> Add a BypassingTokenFilter
> --------------------------
>
>                 Key: LUCENE-8273
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8273
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8273.patch
>
>
> Spinoff of LUCENE-8265. It would be useful to be able to wrap a TokenFilter
> in such a way that it could optionally be bypassed based on the current state
> of the TokenStream. This could be used to, for example, only apply
> WordDelimiterFilter to terms that contain hyphens.
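
For readers unfamiliar with the proposal, the hyphen example in the issue description can be sketched in plain Java without any Lucene dependency. Everything here is hypothetical and for illustration only: the class `ConditionalBypassSketch`, the `filter` helper and the `splitOnHyphen` stand-in for WordDelimiterFilter are not part of the patch, and tokens are modelled as plain strings rather than TokenStream state. The sketch does not model end() or any other stateful behaviour, which is exactly the part the comment above asks to have tested; it only shows the branch itself, with the final token taking the "true" branch in one run and the "false" branch in the other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class ConditionalBypassSketch {

    /**
     * Applies `wrapped` only to tokens for which `shouldApply` returns true;
     * every other token bypasses the wrapped step and passes through unchanged.
     */
    static List<String> filter(List<String> tokens,
                               Predicate<String> shouldApply,
                               Function<String, List<String>> wrapped) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (shouldApply.test(token)) {
                out.addAll(wrapped.apply(token));
            } else {
                out.add(token);
            }
        }
        return out;
    }

    /** Hypothetical stand-in for WordDelimiterFilter: splits on hyphens only. */
    static List<String> splitOnHyphen(String token) {
        List<String> parts = new ArrayList<>();
        for (String part : token.split("-")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        Predicate<String> hasHyphen = t -> t.contains("-");

        // Final token takes the "true" branch (the wrapped step runs last).
        System.out.println(filter(List.of("fast", "wi-fi"), hasHyphen,
                ConditionalBypassSketch::splitOnHyphen)); // prints [fast, wi, fi]

        // Final token takes the "false" branch (the wrapped step is bypassed last).
        System.out.println(filter(List.of("wi-fi", "fast"), hasHyphen,
                ConditionalBypassSketch::splitOnHyphen)); // prints [wi, fi, fast]
    }
}
```

A real TokenFilter carries attribute state across incrementToken() calls and must also propagate end() and reset() correctly on both branches, so a test along the lines suggested above would need to exercise the actual filter rather than a stateless model like this one.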