[ 
https://issues.apache.org/jira/browse/LUCENE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17196793#comment-17196793
 ] 

Mark Harwood commented on LUCENE-9445:
--------------------------------------

We will likely have to break existing parser behaviour to add this feature.

The existing parser implementation allows regex clauses and other clauses to 
appear next to each other without a space, e.g.

{{  /foo/bar}}

is interpreted as a regex query "foo" OR term query "bar".

If we want to introduce /Foo/i as syntax for a case insensitive regex then we 
will need to insist on a space between regexes and other search terms to 
cleanly separate any regex flags from other search terms. This will require 
throwing an error if we encounter any text other than "i" after the closing /. 
It will still be legal to have operators like ) for boolean logic or ^ for 
boosts immediately after the regex closing slash but any other tokens will 
cause an error. I have discussed with Jim Ferenczi the idea of a flag to 
control what happens in an error situation but we can't easily revert to 
parsing the offending string in a BWC way and it's not ideal to silently drop 
it either. Always throwing an error seems like the only viable option.
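
To make the proposed rule concrete, here is a minimal, self-contained sketch of the check described above. It is not the actual QueryParser grammar: the class name StrictRegexClause, the scan method, and the use of IllegalArgumentException (rather than the parser's own exception type) are all hypothetical, and escaped slashes inside the pattern are ignored for brevity.

```java
/**
 * Hypothetical sketch of the stricter rule: after a regex's closing '/',
 * only an optional 'i' flag may follow, and the next character must be
 * whitespace, ')' or '^' (or the end of the input). Not Lucene code.
 */
final class StrictRegexClause {

    /** One scanned regex clause. */
    static final class Clause {
        final String pattern;          // text between the two slashes
        final boolean caseInsensitive; // true if an 'i' flag followed the closing slash
        final int end;                 // index of the first character after the clause

        Clause(String pattern, boolean caseInsensitive, int end) {
            this.pattern = pattern;
            this.caseInsensitive = caseInsensitive;
            this.end = end;
        }
    }

    /** Scans a regex clause whose opening '/' is at index {@code start}. */
    static Clause scan(String query, int start) {
        if (start >= query.length() || query.charAt(start) != '/') {
            throw new IllegalArgumentException("Expected '/' at position " + start);
        }
        // Simplification: the next '/' closes the regex (no escape handling).
        int close = query.indexOf('/', start + 1);
        if (close < 0) {
            throw new IllegalArgumentException("Unterminated regex starting at " + start);
        }
        String pattern = query.substring(start + 1, close);

        int pos = close + 1;
        boolean caseInsensitive = false;
        if (pos < query.length() && query.charAt(pos) == 'i') {
            caseInsensitive = true; // the only flag proposed so far
            pos++;
        }

        // Anything other than whitespace, ')' or '^' here is an error,
        // so "/foo/bar" is rejected instead of being split into two clauses.
        if (pos < query.length()) {
            char next = query.charAt(pos);
            boolean allowed = Character.isWhitespace(next) || next == ')' || next == '^';
            if (!allowed) {
                throw new IllegalArgumentException(
                    "Unexpected '" + next + "' after regex at position " + pos);
            }
        }
        return new Clause(pattern, caseInsensitive, pos);
    }
}
```

Under these assumptions, scan("/Foo/i", 0) yields a case insensitive clause for "Foo", scan("/foo/ bar", 0) and scan("/foo/^2", 0) are accepted, and scan("/foo/bar", 0) throws rather than falling back to the old two-clause interpretation.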

The repercussion of stricter parsing is that sloppy searches, like a 
cut-and-pasted URL or file path, would now throw an error. These would fail 
loudly, for example:

{{  http://foo.com/bar}}
{{  /mydrive/myfolder/myfile}}

The question is: does this new regex syntax warrant the breaking change it 
introduces? More broadly, what is the BWC policy for making changes to the 
query parser?

> Expose new case insensitive RegExpQuery support in QueryParser
> --------------------------------------------------------------
>
>                 Key: LUCENE-9445
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9445
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/queryparser
>            Reporter: Mark Harwood
>            Assignee: Mark Harwood
>            Priority: Minor
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> LUCENE-9386 added a case insensitive matching option to RegExpQuery.
> This proposal is to extend the QueryParser syntax to allow for an optional 
> `i` (case Insensitive) flag to appear on the end of regular expressions e.g. 
> /Foo/i
>  
> This is regex syntax supported by a number of programming languages.
>  


