Re: QueryParser - proposed change may break existing queries.

Dawid Weiss Thu, 17 Sep 2020 12:59:55 -0700

I like this idea. The only downside is that folks will tend to think
it's a full Java Pattern and try other options. :)
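
To illustrate the concern: java.util.regex.Pattern accepts several inline
flags besides (?i), so someone who sees /(?i)foo/ may reasonably try the
others. Below is a minimal JDK-only sketch. The class name InlineFlagDemo is
made up for illustration. As far as I know, regex queries are built on
Lucene's own RegExp class rather than java.util.regex, so these flags would
not necessarily carry over.

```java
import java.util.regex.Pattern;

public class InlineFlagDemo {
    // True if the whole input matches the given java.util.regex pattern.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // (?i) turns on case-insensitive matching inside the pattern itself.
        System.out.println(matches("(?i)foo", "FOO"));  // true
        // Without the flag, matching is case-sensitive.
        System.out.println(matches("foo", "FOO"));      // false
        // (?s) (DOTALL) is another inline flag a user might expect to work.
        System.out.println(matches("(?s)a.b", "a\nb")); // true
    }
}
```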


On Thu, Sep 17, 2020 at 9:09 PM Steve Rowe <[email protected]> wrote:
>
> You could avoid (some of?) these problems by supporting /(?i)foo/ instead of 
> /foo/i
>
> --
> Steve
>
> On Sep 17, 2020, at 1:55 PM, Gus Heck <[email protected]> wrote:
>
> And as I understand it, current behavior is the silent misinterpretation. To 
> me, the failure to require a space after the regex (and either not become a 
> regex in that case or complain about invalid regex) might be considered a 
> bug...
>
> On Thu, Sep 17, 2020 at 9:30 AM Mark Harwood <[email protected]> wrote:
>>
>> I think the decision comes down to choosing between silent
>> (mis)interpretations of ambiguous queries or noisy failures.
>>
>> On Thu, Sep 17, 2020 at 1:55 PM Uwe Schindler <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> My idea would have been not to be too strict and instead only detect it as
>>> a regex if it's separated. So /foo/bar and /foo/iphone would both go through,
>>> ignoring the regex; only ‘/foo/ bar’ or ‘/foo/i phone’ would interpret
>>> the first token as a regex.
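
The relaxed rule described above could be sketched roughly as follows. This
is only an illustration using java.util.regex, not the query parser's actual
grammar. The class and method names are hypothetical, and treating
backslash-escaped characters as part of the regex body is an assumption.

```java
import java.util.regex.Pattern;

public class RelaxedRegexDetect {
    // A leading /.../ with an optional "i" flag, counted as a regex only
    // when followed by whitespace or the end of the query.
    private static final Pattern SEPARATED_REGEX =
        Pattern.compile("/(?:[^/\\\\]|\\\\.)*/i?(?=\\s|$)");

    // True if the query's first token would be read as a regex;
    // false means it silently falls through as ordinary text.
    static boolean startsWithRegex(String query) {
        return SEPARATED_REGEX.matcher(query).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(startsWithRegex("/foo/ bar"));    // true
        System.out.println(startsWithRegex("/foo/i phone")); // true
        System.out.println(startsWithRegex("/foo/bar"));     // false
        System.out.println(startsWithRegex("/foo/iphone"));  // false
    }
}
```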
>>>
>>>
>>>
>>> That’s just my idea, not sure if it makes sense to have this relaxed 
>>> parsing. I was always very skeptical of adding the regexes, as it breaks 
>>> many queries. Now it’s even more.
>>>
>>>
>>>
>>> Uwe
>>>
>>>
>>>
>>> -----
>>>
>>> Uwe Schindler
>>>
>>> Achterdiek 19, D-28357 Bremen
>>>
>>> https://www.thetaphi.de
>>>
>>> eMail: [email protected]
>>>
>>>
>>>
>>> From: Mark Harwood <[email protected]>
>>> Sent: Wednesday, September 16, 2020 6:45 PM
>>> To: [email protected]
>>> Subject: Re: QueryParser - proposed change may break existing queries.
>>>
>>>
>>>
>>> The strictness I was thinking of adding was to make all of the following 
>>> error:
>>>
>>>  /foo/bar
>>>
>>>  /foo//bar/
>>>
>>>  /foo/iphone
>>>
>>>  /foo/AND x
>>>
>>>
>>>
>>> These would be allowed:
>>>
>>>  /foo/i bar
>>>
>>>  (/foo/ OR /bar/)
>>>
>>>  (/foo/ OR /bar/i)
>>>
>>>  /foo/^2
>>>
>>>  /foo/i^2
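
The two lists above suggest a rule along these lines: after the closing
slash and an optional "i", the next character must be whitespace, ")", "^",
or the end of the query. A rough sketch of that check with java.util.regex
follows. The class and method names are hypothetical, the rule is inferred
only from the examples listed, and this is not how the parser grammar is
actually written.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrictRegexCheck {
    // A /.../ literal; backslash-escaped characters are allowed inside.
    private static final Pattern REGEX_LITERAL =
        Pattern.compile("/(?:[^/\\\\]|\\\\.)*/");
    // What may follow the closing slash: an optional "i" flag, then
    // whitespace, ")", "^" (boost), or the end of the query.
    private static final Pattern VALID_TRAILER =
        Pattern.compile("i?(?:[\\s)^]|$)");

    // False if any regex literal in the query is followed by something
    // the strict rule would reject (a parser would raise an error there).
    static boolean isAccepted(String query) {
        Matcher literal = REGEX_LITERAL.matcher(query);
        while (literal.find()) {
            Matcher trailer = VALID_TRAILER.matcher(query);
            trailer.region(literal.end(), query.length());
            if (!trailer.lookingAt()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("/foo/iphone"));       // false
        System.out.println(isAccepted("/foo/AND x"));        // false
        System.out.println(isAccepted("(/foo/ OR /bar/i)")); // true
        System.out.println(isAccepted("/foo/i^2"));          // true
    }
}
```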
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 16 Sep 2020, at 12:00, Uwe Schindler <[email protected]> wrote:
>>>
>>> 
>>>
>>> In my opinion, the proposed syntax change should require whitespace
>>> or some other separator char after the regex “i” parameter.
>>>
>>>
>>>
>>> Uwe
>>>
>>>
>>>
>>> -----
>>>
>>> Uwe Schindler
>>>
>>> Achterdiek 19, D-28357 Bremen
>>>
>>> https://www.thetaphi.de
>>>
>>> eMail: [email protected]
>>>
>>>
>>>
>>> From: Mark Harwood <[email protected]>
>>> Sent: Wednesday, September 16, 2020 11:04 AM
>>> To: [email protected]
>>> Subject: QueryParser - proposed change may break existing queries.
>>>
>>>
>>>
>>> In LUCENE-9445 we'd like to add a case-insensitive option to regex queries
>>> in the query parser of the form:
>>>
>>>    /Foo/i
>>>
>>>
>>>
>>> However, today people can search for:
>>>
>>>
>>>
>>>    /foo.com/index.html
>>>
>>>
>>>
>>> and not get an error. The searcher may think this is a query for a URL but 
>>> it's actually parsed as a regex "foo.com" ORed with a term query.
>>>
>>>
>>>
>>> I'd like to draw attention to this proposed change in behaviour because I 
>>> think it could affect many existing systems. Arguably it may be a positive 
>>> in drawing attention to a number of existing silent failures (unescaped 
>>> searches for urls or file paths) but equally could be seen as a negative 
>>> breaking change by some.
>>>
>>>
>>>
>>> What is our BWC policy for changes to query parser?
>>>
>>> Do the benefits of the proposed new regex feature outweigh the costs of the 
>>> breakages in your view?
>>>
>>>
>>>
>>> https://issues.apache.org/jira/browse/LUCENE-9445?focusedCommentId=17196793&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17196793
>>>
>>>
>>>
>>>
>
>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>
>
