[
https://issues.apache.org/jira/browse/LUCENE-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661087#comment-16661087
]
Michael Gibney commented on LUCENE-8531:
----------------------------------------
> I think we should keep the default behavior as is. You can still override
> QueryBuilder#analyzeGraphPhrase to apply a different logic on your side if
> you want.
Certainly agreed the default behavior should be left as-is. I'm content with
the flexibility to override, but my suggestion was based on a sense that the
desire to support {{inOrder=true}} could be a pretty common use case.
The API does specify "phrase", but with a lower-case "p"; does this necessarily
imply that exclusively {{PhraseQuery}} semantics _should_ be supported? It's
the de facto case that {{PhraseQuery}} semantics _have been_ supported, so it
definitely makes sense for that to continue to be the default – but I don't
think it'd be unreasonable to add configurable stock support for
{{inOrder=true}}. If such support were to be added, {{QueryBuilder}} would seem
like a logical place to do it, and since the logic necessary to implement it is
already here (in {{analyzeGraphPhrase}}), it should be a trivial addition.
I'm thinking something along the lines of splitting the {{SpanNearQuery}} part
of {{analyzeGraphPhrase}} (everything after the "{{if (phraseSlop > 0)}}"
short-circuit) into its own method. Even if it were only a protected method,
this would allow any override of {{analyzeGraphPhrase}} to more cleanly leverage
the existing logic for building {{SpanNearQuery}}.
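To make the shape of that split concrete, here is a rough sketch. It deliberately uses stand-in types rather than the real Lucene classes: queries are modelled as plain strings, the token stream as a list of terms, and the names {{QueryBuilderSketch}}, {{buildSpanNear}}, {{buildSloppyPhrase}} and {{InOrderQueryBuilderSketch}} are all hypothetical. It also assumes that the "{{if (phraseSlop > 0)}}" branch yields {{PhraseQuery}}-style semantics. It only illustrates the refactoring, not the actual patch.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-in for QueryBuilder (NOT the Lucene class); queries are modelled as strings. */
class QueryBuilderSketch {

    /** Public entry point, loosely analogous to createPhraseQuery. */
    public String createPhraseQuery(String field, List<String> terms, int phraseSlop) {
        return analyzeGraphPhrase(terms, field, phraseSlop);
    }

    /** Default behavior is unchanged: the sloppy case short-circuits. */
    protected String analyzeGraphPhrase(List<String> terms, String field, int phraseSlop) {
        if (phraseSlop > 0) {
            // Assumption: this branch yields PhraseQuery-style (reordering allowed) semantics.
            return buildSloppyPhrase(terms, field, phraseSlop);
        }
        return buildSpanNear(terms, field, phraseSlop, true);
    }

    /** Hypothetical extracted method: everything after the short-circuit. */
    protected String buildSpanNear(List<String> terms, String field, int slop, boolean inOrder) {
        return "spanNear(" + field + ":[" + String.join(" ", terms) + "], slop=" + slop
                + ", inOrder=" + inOrder + ")";
    }

    /** Hypothetical stand-in for the phrase-query branch. */
    protected String buildSloppyPhrase(List<String> terms, String field, int slop) {
        return "phrase(" + field + ":[" + String.join(" ", terms) + "], slop=" + slop + ")";
    }
}

/** An override that wants inOrder=true even for sloppy phrases simply reuses the extracted method. */
class InOrderQueryBuilderSketch extends QueryBuilderSketch {
    @Override
    protected String analyzeGraphPhrase(List<String> terms, String field, int phraseSlop) {
        return buildSpanNear(terms, field, phraseSlop, true);
    }
}

class GraphPhraseDemo {
    public static void main(String[] args) {
        List<String> terms = Arrays.asList("new", "york", "city");
        System.out.println(new QueryBuilderSketch().createPhraseQuery("body", terms, 2));
        System.out.println(new InOrderQueryBuilderSketch().createPhraseQuery("body", terms, 2));
    }
}
```

With the span-building logic behind its own method, the override is a one-liner instead of a copy of the graph-handling code; the stock class keeps its default for every existing caller.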
I'm just explaining my thinking here; I guess the decision ultimately depends
on how general a use case folks consider {{inOrder=true}} to be.
> QueryBuilder hard-codes inOrder=true for generated sloppy span near queries
> ---------------------------------------------------------------------------
>
> Key: LUCENE-8531
> URL: https://issues.apache.org/jira/browse/LUCENE-8531
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/queryparser
> Reporter: Steve Rowe
> Assignee: Steve Rowe
> Priority: Major
> Fix For: 7.6, master (8.0)
>
> Attachments: LUCENE-8531.patch
>
>
> QueryBuilder.analyzeGraphPhrase() generates SpanNearQuery-s with passed-in
> phraseSlop, but hard-codes inOrder ctor param as true.
> Before multi-term synonym support and graph token streams introduced the
> possibility of generating SpanNearQuery-s, QueryBuilder generated
> (Multi)PhraseQuery-s, which always interpret slop as allowing reordering
> edits. Solr's eDismax query parser generates phrase queries when its
> pf/pf2/pf3 params are specified, and when multi-term synonyms are used with a
> graph-aware synonym filter, SpanNearQuery-s are generated that require
> clauses to be in order; unlike with (Multi)PhraseQuery-s, reordering edits
> are not allowed, so this is a kind of regression. See SOLR-12243 for edismax
> pf/pf2/pf3 context. (Note that the patch on SOLR-12243 also addresses
> another problem that blocks eDismax from generating queries *at all* under
> the above-described circumstances.)
> I propose adding a new analyzeGraphPhrase() method that allows configuration
> of inOrder, which would allow eDismax to specify inOrder=false. The existing
> analyzeGraphPhrase() method would remain with its hard-coded inOrder=true, so
> existing client behavior would remain unchanged.