[
https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665837#comment-16665837
]
Steve Rowe commented on SOLR-12243:
-----------------------------------
{quote}
I think something is not right, but am not sure what.
[...]
Put breakpoints at ExtendedDismaxQParser.java in getQuery, and it looks like it
is getting a NullPointerException and falling out at ln.1449
{quote}
In addition to [~ehaubert]'s above-described manual test,
{{TestMultiWordSynonyms.testPf3WithoutReordering()}} in the patch was failing
with the same symptoms.
The problem: The LUCENE-8531 changes cause {{QueryBuilder}} to produce a new
kind of query structure for a phrase with multi-term synonyms and non-zero
slop: a {{BooleanQuery}} of {{PhraseQuery}}-s.
{{ExtendedDismaxQParser.getQuery()}} assumes that {{BooleanQuery}}-s always
consist of {{TermQuery}}-s, and so unconditionally sets the query's
minShouldMatch. However, the parser used to construct the {{pf3}} phrase
shingles never had its minShouldMatch spec set, so the spec remained null, and
calling {{trim()}} on it in {{SolrPluginUtils.setMinShouldMatch()}} threw the
NPE.
I've attached a modified version of Elizabeth's patch that includes an
{{ExtendedDismaxQParser.getQuery()}} fix: don't set a {{BooleanQuery}}'s
minShouldMatch when {{type==QType.PHRASE}}. The modified patch also uncomments
{{TestMultiWordSynonyms.testPf3WithReordering()}}. All the tests in
{{TestMultiWordSynonyms}} now pass with the patch. I haven't tried to run all
Solr tests yet.
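To make the failure mode and the guard concrete, here is a minimal standalone sketch. It is not the code from the patch: {{QType}} here is a two-value stand-in for the parser's enum, and {{MinShouldMatchSketch}}, {{unguarded()}} and {{guarded()}} are hypothetical names used only for illustration. It assumes nothing beyond what is described above: a null spec gets trimmed, and the fix skips that step for phrase-type queries.

```java
import java.util.Optional;

// Hypothetical stand-in for the parser's query-type enum; only PHRASE matters here.
enum QType { FIELD, PHRASE }

final class MinShouldMatchSketch {

    // Mirrors the failing pattern: the spec is trimmed without a null check.
    static String unguarded(String minShouldMatchSpec) {
        return minShouldMatchSpec.trim(); // throws NullPointerException when the spec is null
    }

    // Mirrors the fix: phrase-type queries skip the minShouldMatch step entirely.
    static Optional<String> guarded(QType type, String minShouldMatchSpec) {
        if (type == QType.PHRASE) {
            return Optional.empty(); // nothing is applied, so a null spec is never touched
        }
        return Optional.of(minShouldMatchSpec.trim());
    }

    public static void main(String[] args) {
        try {
            unguarded(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded: NPE on null spec");
        }
        System.out.println("guarded PHRASE: " + guarded(QType.PHRASE, null).isPresent());
        System.out.println("guarded FIELD: " + guarded(QType.FIELD, " 2 ").get());
    }
}
```

The sketch guards on the query type rather than null-checking the spec, which matches the approach described above: a {{BooleanQuery}} built for a phrase should not have minShouldMatch applied at all, whether or not a spec happens to be present.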
{quote}
Hi, the Lucene issue was committed. I think we can now test this. Nevertheless,
according to my understanding, as for slop!=0 it no longer creates span
queries, the bug is fixed anyways. For slop=0 it creates (faster) span queries,
so the fixes here should apply.
Nevertheless there should be a test for slop=0 and slop!=0 in Edismax tests.
{quote}
Next week I'll look at adding tests for whatever else in the quote above (^^) still needs coverage.
> Edismax missing phrase queries when phrases contain multiterm synonyms
> ----------------------------------------------------------------------
>
> Key: SOLR-12243
> URL: https://issues.apache.org/jira/browse/SOLR-12243
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 7.1
> Environment: RHEL, MacOS X
> Do not believe this is environment-specific.
> Reporter: Elizabeth Haubert
> Assignee: Uwe Schindler
> Priority: Major
> Attachments: SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch,
> SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, multiword-synonyms.txt,
> schema.xml, solrconfig.xml
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> synonyms.txt:
> {code}
> allergic, hypersensitive
> aspirin, acetylsalicylic acid
> dog, canine, canis familiris, k 9
> rat, rattus
> {code}
> request handler:
> {code:xml}
> <requestHandler name="/test_qparse_error" class="solr.SearchHandler">
> <lst name="defaults">
> <!-- Query settings -->
> <str name="defType">edismax</str>
> <str name="tie"> 0.4</str>
> <str name="qf">title^100</str>
> <str name="pf">title~20^5000</str>
> <str name="pf2">title~11</str>
> <str name="pf3">title~22^1000</str>
> <str name="df">text</str>
> <!-- mm: If three or fewer clauses exist, they all must match.
> If four to six clauses exist, one can be missing. If seven to nine clauses
> exist, all but three must match.
> If more than nine clauses exist, only 30% must match. -->
> <str name="mm">3&lt;-1 6&lt;-3 9&lt;30%</str>
> <str name="q.alt">*:*</str>
> <str name="rows">25</str>
> </lst>
> </requestHandler>
> {code}
> Phrase queries (pf, pf2, pf3) containing "dog" or "aspirin" against the
> above list will not be generated.
> "allergic reaction dog" will generate pf2: "allergic reaction", but not
> pf:"allergic reaction dog", pf2: "reaction dog", or pf3: "allergic reaction
> dog"
> "aspirin dose in rats" will generate pf3: "dose ? rats" but not pf2: "aspirin
> dose" or pf3:"aspirin dose ?"
>