[ https://issues.apache.org/jira/browse/SOLR-12243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668790#comment-16668790 ]
Steve Rowe commented on SOLR-12243:
-----------------------------------

{quote}I think the permutation problem is not new with the recent Lucene fixes. This problem should also have happened with Span expansions, right? Maybe we should add an option to limit the number of phrase expansions (as a safety feature). If those limits are reached, the phrase expansion should be stopped (maybe then only bigrams and no trigrams).
{quote}

I don't know how Span queries are rewritten, or how the search time complexity would work out, but AFAIK the Lucene fixes didn't change the permutation problem, just recast it as explicit clauses.

As far as safety is concerned, doesn't the Boolean clause limit already apply in this case, since the generated query is a BooleanQuery of PhraseQuery-s?

Elizabeth, I like the idea of adding something along the lines of the text you suggested to the ref guide. I made a few tweaks (described below) - please let me know if you think this is okay:

{quote}
h3. Synonyms expansion in phrase queries with slop

When a phrase query with slop (e.g. {{pf}} with {{ps}}) triggers synonym expansions, a separate clause will be generated for each combination of synonyms. For example, with configured synonyms {{dog,canine}} and {{cat,feline}}, the query {{"dog chased cat"}} will generate the following phrase query clauses:

* {{"dog chased cat"}}
* {{"canine chased cat"}}
* {{"dog chased feline"}}
* {{"canine chased feline"}}{quote}

My changes: this situation happens with all synonyms, not just multi-term synonyms; user-specified phrase queries (in {{q}} param) trigger this situation when {{qs}} is specified, so I generalized it a bit to refer to all phrase+slop contexts; and I think "combination" is better than "permutation" here.
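
To make the "one clause per combination" wording concrete, here is a rough sketch of the counting involved. It is plain Python, not Solr or Lucene code: the {{expand_phrase}} helper and the in-memory synonym groups are made up for illustration, and it only models single-term synonyms.

```python
from itertools import product


def expand_phrase(phrase, synonym_groups):
    """Return one phrase string per combination of synonyms.

    Hypothetical helper for illustration only; it does not reproduce how
    the query parser actually builds its clauses.
    """
    # Map every term to the full list of terms in its synonym group.
    lookup = {}
    for group in synonym_groups:
        for term in group:
            lookup[term] = list(group)

    # Each position in the phrase contributes its list of alternatives;
    # terms without synonyms contribute only themselves.
    alternatives = [lookup.get(token, [token]) for token in phrase.split()]

    # The Cartesian product yields every combination exactly once.
    return [" ".join(combo) for combo in product(*alternatives)]


# Synonym groups taken from the ref guide example text.
groups = [["dog", "canine"], ["cat", "feline"]]
clauses = expand_phrase("dog chased cat", groups)

for clause in clauses:
    print(f'"{clause}"')
print(len(clauses), "clauses")
```

The number of clauses is the product of the alternatives at each position (2 x 1 x 2 = 4 here), so each additional term with synonyms multiplies the clause count.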

> Edismax missing phrase queries when phrases contain multiterm synonyms
> ----------------------------------------------------------------------
>
>                 Key: SOLR-12243
>                 URL: https://issues.apache.org/jira/browse/SOLR-12243
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: query parsers
>    Affects Versions: 7.1
>         Environment: RHEL, MacOS X
> Do not believe this is environment-specific.
>            Reporter: Elizabeth Haubert
>            Assignee: Uwe Schindler
>            Priority: Major
>         Attachments: SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch,
> SOLR-12243.patch, SOLR-12243.patch, SOLR-12243.patch, multiword-synonyms.txt,
> schema.xml, solrconfig.xml
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> synonyms.txt:
> {code}
> allergic, hypersensitive
> aspirin, acetylsalicylic acid
> dog, canine, canis familiris, k 9
> rat, rattus
> {code}
> request handler:
> {code:xml}
> <requestHandler name="/test_qparse_error" class="solr.SearchHandler">
>   <lst name="defaults">
>     <!-- Query settings -->
>     <str name="defType">edismax</str>
>     <str name="tie"> 0.4</str>
>     <str name="qf">title^100</str>
>     <str name="pf">title~20^5000</str>
>     <str name="pf2">title~11</str>
>     <str name="pf3">title~22^1000</str>
>     <str name="df">text</str>
>     <!-- mm: If three or fewer clauses exist, they all must match.
>          If four to six clauses exist, one can be missing. If seven to nine
>          clauses exist, all but three must match.
>          If more than nine clauses exist, only require 30% to match. -->
>     <str name="mm">3&lt;-1 6&lt;-3 9&lt;30%</str>
>     <str name="q.alt">*:*</str>
>     <str name="rows">25</str>
>   </lst>
> </requestHandler>
> {code}
> Phrase queries (pf, pf2, pf3) containing "dog" or "aspirin" against the
> above list will not be generated.
> "allergic reaction dog" will generate pf2: "allergic reaction", but not
> pf: "allergic reaction dog", pf2: "reaction dog", or pf3: "allergic reaction
> dog"
> "aspirin dose in rats" will generate pf3: "dose ? rats" but not pf2: "aspirin
> dose" or pf3: "aspirin dose ?"

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)