Gus Heck created LUCENE-9696:
--------------------------------

    Summary: RegExp with group references
        Key: LUCENE-9696
        URL: https://issues.apache.org/jira/browse/LUCENE-9696
    Project: Lucene - Core
 Issue Type: Wish
   Reporter: Gus Heck

PatternTypingFilter presently relies on java.util.regex, but LUCENE-7465 found performance benefits from using our own RegExp class instead. Unfortunately, RegExp does not currently report matching subgroups, which is key to PatternTypingFilter's use (and probably useful in other endeavors as well).

What's needed is reporting of subgroups such that

new RegExp("foo(.+)") --> converted to a run automaton etc. --> match found for "foobar" --> somehow reports getGroup(1) as "bar"

and getGroup() can be called on some object reasonably accessible to the code using RegExp in the first place.

Clearly there's a lot to be worked out there, since the normal usage pattern converts things to a DFA / run automaton etc., and subgroups are not a natural concept for those classes. But if this could be achieved without losing the performance benefits, that would be interesting :).

Opening this Wish ticket as encouraged by [~mikemccand] in LUCENE-9575. I won't be able to work on it any time soon, so I encourage anyone else interested to pick it up or to drop links or ideas in here.
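
For reference, the group-reporting behavior being asked for can be sketched with java.util.regex, which is what PatternTypingFilter relies on today. This is only an illustration of the desired result, not a proposal for the RegExp API: the class name GroupDemo and the helper firstGroup are hypothetical, and the sketch assumes whole-input matching, since that is how a run automaton would normally accept a string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it uses only java.util.regex, not Lucene's RegExp.
class GroupDemo {

    // Returns the text captured by group 1 when the whole input matches
    // the regex, or null when it does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The example from this ticket: "foobar" against foo(.+)
        System.out.println(firstGroup("foo(.+)", "foobar")); // prints "bar"
        // "foo" alone does not match, because .+ needs at least one character
        System.out.println(firstGroup("foo(.+)", "foo"));    // prints "null"
    }
}
```

The open question in this ticket is how to get the same answer from an automaton-based matcher, where the group boundaries are not tracked by default.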