I like this idea. The only downside is that folks will tend to think it's a full Java Pattern and try other options. :)
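
To make the "full Java Pattern" worry concrete, here is a minimal sketch using only java.util.regex. It shows that (?i) is just one of several inline flags Pattern accepts, so someone who sees /(?i)foo/ work might reasonably try (?s) or (?x) next. It says nothing about what the query parser's own regex syntax supports; the class name InlineFlagDemo and the helper matches are made up for illustration.

```java
import java.util.regex.Pattern;

class InlineFlagDemo {
    // True if the whole input matches the given java.util.regex pattern.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // false: matching is case-sensitive by default
        System.out.println(matches("foo", "FOO"));
        // true: (?i) turns on case-insensitive matching
        System.out.println(matches("(?i)foo", "FOO"));
        // false: '.' does not match a line terminator by default
        System.out.println(matches("a.b", "a\nb"));
        // true: (?s) is the DOTALL flag, so '.' also matches '\n'
        System.out.println(matches("(?s)a.b", "a\nb"));
        // true: (?x) is the COMMENTS flag, so whitespace in the pattern is ignored
        System.out.println(matches("(?x) f o o ", "foo"));
    }
}
```

If only (?i) were honoured, the last two patterns are the sort of "other options" people would likely expect to work as well.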
On Thu, Sep 17, 2020 at 9:09 PM Steve Rowe <[email protected]> wrote:
>
> You could avoid (some of?) these problems by supporting /(?i)foo/ instead of
> /foo/i
>
> --
> Steve
>
> On Sep 17, 2020, at 1:55 PM, Gus Heck <[email protected]> wrote:
>
> And as I understand it, current behavior is the silent misinterpretation. To
> me, the failure to require a space after the regex (and either not become a
> regex in that case or complain about invalid regex) might be considered a
> bug...
>
> On Thu, Sep 17, 2020 at 9:30 AM Mark Harwood <[email protected]> wrote:
>>
>> I think the decision comes down to choosing between silent
>> (mis)interpretations of ambiguous queries or noisy failures..
>>
>> On Thu, Sep 17, 2020 at 1:55 PM Uwe Schindler <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> My idea would have been not to be too strict and instead only detect it as
>>> a regex if it's separated. So /foo/bar and /foo/iphone would both go through,
>>> ignoring the regex; only ‘/foo/ bar’ or ‘/foo/i phone’ would interpret
>>> the first token as regex.
>>>
>>> That’s just my idea, not sure if it makes sense to have this relaxed
>>> parsing. I was always very skeptical of adding the regexes, as it breaks
>>> many queries. Now it’s even more.
>>>
>>> Uwe
>>>
>>> -----
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de
>>> eMail: [email protected]
>>>
>>> From: Mark Harwood <[email protected]>
>>> Sent: Wednesday, September 16, 2020 6:45 PM
>>> To: [email protected]
>>> Subject: Re: QueryParser - proposed change may break existing queries.
>>>
>>> The strictness I was thinking of adding was to make all of the following
>>> error:
>>>
>>> /foo/bar
>>> /foo//bar/
>>> /foo/iphone
>>> /foo/AND x
>>>
>>> These would be allowed:
>>>
>>> /foo/i bar
>>> (/foo/ OR /bar/)
>>> (/foo/ OR /bar/i)
>>> /foo/^2
>>> /foo/i^2
>>>
>>> On 16 Sep 2020, at 12:00, Uwe Schindler <[email protected]> wrote:
>>>
>>> In my opinion, the proposed syntax change should require whitespace
>>> or any other separator char after the regex “i” parameter.
>>>
>>> Uwe
>>>
>>> -----
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de
>>> eMail: [email protected]
>>>
>>> From: Mark Harwood <[email protected]>
>>> Sent: Wednesday, September 16, 2020 11:04 AM
>>> To: [email protected]
>>> Subject: QueryParser - proposed change may break existing queries.
>>>
>>> In LUCENE-9445 we'd like to add a case insensitive option to regex queries
>>> in the query parser of the form:
>>>
>>> /Foo/i
>>>
>>> However, today people can search for:
>>>
>>> /foo.com/index.html
>>>
>>> and not get an error. The searcher may think this is a query for a URL but
>>> it's actually parsed as a regex "foo.com" ORed with a term query.
>>>
>>> I'd like to draw attention to this proposed change in behaviour because I
>>> think it could affect many existing systems. Arguably it may be a positive
>>> in drawing attention to a number of existing silent failures (unescaped
>>> searches for urls or file paths) but equally could be seen as a negative
>>> breaking change by some.
>>>
>>> What is our BWC policy for changes to query parser?
>>>
>>> Do the benefits of the proposed new regex feature outweigh the costs of the
>>> breakages in your view?
>>>
>>> https://issues.apache.org/jira/browse/LUCENE-9445?focusedCommentId=17196793&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17196793
>>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
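
As a rough illustration of the strictness Mark describes in the quoted thread (a regex token, with an optional trailing i, must be followed by a separator), the check could be sketched as below. This is not the query parser's code. The class RegexTokenCheck, its methods, the choice of whitespace, ')' and '^' as separators, and the rejection of unterminated regexes are all assumptions made to reproduce the examples listed in the thread.

```java
class RegexTokenCheck {
    // Assumption: a regex token may be followed by whitespace, ')' or '^' (boost).
    static boolean isSeparator(char c) {
        return Character.isWhitespace(c) || c == ')' || c == '^';
    }

    // Returns true if every /regex/ or /regex/i token in the query is followed
    // by a separator or by the end of the input.
    static boolean isAccepted(String query) {
        int pos = 0;
        while (pos < query.length()) {
            char c = query.charAt(pos);
            // A regex may only start at the beginning, after whitespace, or after '('.
            boolean atTokenStart = pos == 0
                || Character.isWhitespace(query.charAt(pos - 1))
                || query.charAt(pos - 1) == '(';
            if (c != '/' || !atTokenStart) {
                pos++;
                continue;
            }
            // Find the closing slash, skipping backslash-escaped characters.
            int end = pos + 1;
            while (end < query.length() && query.charAt(end) != '/') {
                end += (query.charAt(end) == '\\') ? 2 : 1;
            }
            if (end >= query.length()) {
                return false; // unterminated regex: rejected in this sketch
            }
            int next = end + 1;
            if (next < query.length() && query.charAt(next) == 'i') {
                next++; // optional case-insensitive flag
            }
            if (next < query.length() && !isSeparator(query.charAt(next))) {
                return false; // e.g. /foo/bar, /foo/iphone, /foo//bar/
            }
            pos = next;
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {
            "/foo/bar", "/foo//bar/", "/foo/iphone", "/foo/AND x",
            "/foo/i bar", "(/foo/ OR /bar/)", "(/foo/ OR /bar/i)", "/foo/^2", "/foo/i^2"
        };
        for (String q : samples) {
            System.out.println(q + " -> " + (isAccepted(q) ? "allowed" : "error"));
        }
    }
}
```

Under these assumptions /foo.com/index.html is also rejected, which is the noisy failure the thread weighs against today's silent misinterpretation.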
