That's a much better idea, I like it. It's basically what Java's regex parser in
the Pattern class also does.
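
For illustration, a minimal sketch of the behaviour in java.util.regex.Pattern that this refers to: the embedded flag (?i) at the start of the expression turns on case-insensitive matching, much like passing Pattern.CASE_INSENSITIVE to compile(). The class and method names (InlineFlagDemo, fullMatch) are made up for this example.

```java
import java.util.regex.Pattern;

public class InlineFlagDemo {
    // Returns true if the whole input matches the given regex.
    public static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Without a flag, matching is case-sensitive.
        System.out.println(fullMatch("foo", "FOO"));      // false
        // The embedded flag (?i) enables case-insensitive matching.
        System.out.println(fullMatch("(?i)foo", "FOO"));  // true
        // Same effect with the flag passed explicitly to compile().
        System.out.println(
            Pattern.compile("foo", Pattern.CASE_INSENSITIVE).matcher("FOO").matches()); // true
    }
}
```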

If we do this we won't even need a syntax change.
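
A rough sketch of why no syntax change would be needed, assuming the flag is carried inside the slashes: the token is still just /.../, and only the regex body has to be inspected for a leading (?i). Everything here (RegexTokenSketch, ParsedRegex, parse) is hypothetical and not taken from the Lucene query parser; how the flag would actually be handed to the regex implementation is left open.

```java
public class RegexTokenSketch {
    /** Hypothetical holder for a parsed regex token; not a Lucene class. */
    public static final class ParsedRegex {
        public final String pattern;
        public final boolean caseInsensitive;

        ParsedRegex(String pattern, boolean caseInsensitive) {
            this.pattern = pattern;
            this.caseInsensitive = caseInsensitive;
        }
    }

    private static final String FLAG = "(?i)";

    /** Takes a token such as "/(?i)foo/" and splits off a leading (?i). */
    public static ParsedRegex parse(String token) {
        if (token.length() < 2
                || token.charAt(0) != '/'
                || token.charAt(token.length() - 1) != '/') {
            throw new IllegalArgumentException("not a /regex/ token: " + token);
        }
        // Strip the delimiting slashes; nothing after the closing slash is interpreted.
        String body = token.substring(1, token.length() - 1);
        if (body.startsWith(FLAG)) {
            return new ParsedRegex(body.substring(FLAG.length()), true);
        }
        return new ParsedRegex(body, false);
    }

    public static void main(String[] args) {
        ParsedRegex p = parse("/(?i)foo/");
        System.out.println(p.pattern + " " + p.caseInsensitive); // foo true
    }
}
```

Since the flag sits inside the delimiters, a query like /foo/iphone would be tokenized exactly as it is today.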

Uwe

On September 17, 2020 7:09:18 PM UTC, Steve Rowe <[email protected]> wrote:
>You could avoid (some of?) these problems by supporting /(?i)foo/
>instead of /foo/i
>
>--
>Steve
>
>> On Sep 17, 2020, at 1:55 PM, Gus Heck <[email protected]> wrote:
>> 
>> And as I understand it, current behavior is the silent
>misinterpretation. To me, the failure to require a space after the
>regex (and either not become a regex in that case or complain about
>invalid regex) might be considered a bug...
>> 
>> On Thu, Sep 17, 2020 at 9:30 AM Mark Harwood <[email protected]
><mailto:[email protected]>> wrote:
>> I think the decision comes down to choosing between silent
>(mis)interpretations of ambiguous queries or noisy failures.
>> 
>> On Thu, Sep 17, 2020 at 1:55 PM Uwe Schindler <[email protected]
><mailto:[email protected]>> wrote:
>> Hi,
>> 
>>  
>> 
>> My idea would have been not to be too strict and instead only detect
>it as a regex if it's separated. So /foo/bar and /foo/iphone would both
>go through, ignoring the regex; only ‘/foo/ bar’ or ‘/foo/i phone’
>would interpret the first token as regex.
>> 
>>  
>> 
>> That’s just my idea, not sure if it makes sense to have this relaxed
>parsing. I was always very skeptical of adding the regexes, as it
>breaks many queries. Now it breaks even more.
>> 
>>  
>> 
>> Uwe
>> 
>>  
>> 
>> -----
>> 
>> Uwe Schindler
>> 
>> Achterdiek 19, D-28357 Bremen
>> 
>> https://www.thetaphi.de <https://www.thetaphi.de/>
>> eMail: [email protected] <mailto:[email protected]>
>>  
>> 
>> From: Mark Harwood <[email protected]
><mailto:[email protected]>> 
>> Sent: Wednesday, September 16, 2020 6:45 PM
>> To: [email protected] <mailto:[email protected]>
>> Subject: Re: QueryParser - proposed change may break existing
>queries.
>> 
>>  
>> 
>> The strictness I was thinking of adding was to make all of the
>following error:
>> 
>>  /foo/bar
>> 
>>  /foo//bar/
>> 
>>  /foo/iphone 
>> 
>>  /foo/AND x
>> 
>>  
>> 
>> These would be allowed:
>> 
>>  /foo/i bar
>> 
>>  (/foo/ OR /bar/)
>> 
>>  (/foo/ OR /bar/i)
>> 
>>  /foo/^2
>> 
>>  /foo/i^2
>> 
>>  
>> 
>>  
>> 
>> 
>> 
>> 
>> On 16 Sep 2020, at 12:00, Uwe Schindler <[email protected]
><mailto:[email protected]>> wrote:
>> 
>> 
>> 
>> In my opinion, the proposed syntax change should require
>whitespace or any other separator char after the regex “i” parameter.
>> 
>>  
>> 
>> Uwe
>> 
>>  
>> 
>> -----
>> 
>> Uwe Schindler
>> 
>> Achterdiek 19, D-28357 Bremen
>> 
>> https://www.thetaphi.de <https://www.thetaphi.de/>
>> eMail: [email protected] <mailto:[email protected]>
>>  
>> 
>> From: Mark Harwood <[email protected]
><mailto:[email protected]>> 
>> Sent: Wednesday, September 16, 2020 11:04 AM
>> To: [email protected] <mailto:[email protected]>
>> Subject: QueryParser - proposed change may break existing queries.
>> 
>>  
>> 
>> In LUCENE-9445 we'd like to add a case-insensitive option to regex
>queries in the query parser of the form:
>> 
>>    /Foo/i
>> 
>>  
>> 
>> However, today people can search for:
>> 
>>  
>> 
>>    /foo.com/index.html
>>  
>> 
>> and not get an error. The searcher may think this is a query for a
>URL but it's actually parsed as a regex "foo.com"
>ORed with a term query.
>> 
>>  
>> 
>> I'd like to draw attention to this proposed change in behaviour
>because I think it could affect many existing systems. Arguably it may
>be a positive in drawing attention to a number of existing silent
>failures (unescaped searches for urls or file paths) but equally could
>be seen as a negative breaking change by some.
>> 
>>  
>> 
>> What is our BWC policy for changes to query parser?
>> 
>> Do the benefits of the proposed new regex feature outweigh the costs
>of the breakages in your view?
>> 
>>  
>> 
>>
>https://issues.apache.org/jira/browse/LUCENE-9445?focusedCommentId=17196793&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17196793
>>  
>> 
>>  
>> 
>> 
>> 
>> -- 
>> http://www.needhamsoftware.com <http://www.needhamsoftware.com/>
>(work)
>> http://www.the111shift.com <http://www.the111shift.com/> (play)

--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de
