RE: Can't get phrase field boosting to work using edismax

jimi.hullegard Tue, 05 Apr 2016 09:52:15 -0700

I have now stepped through the code with the Eclipse debugger to try to 
understand what is happening, and it seems that the ExtendedDismaxQParser 
simply ignores my pf parameter, since it doesn't interpret it as a phrase query.


https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.6.0/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java

On line 1180 I get a query object of type TermQuery (with the term 
"exactTitle:some words"). And in the if statements starting at line it is quite 
clear that if it is not a PhraseQuery or a MultiPhraseQuery, or if the 
minClauseSize > 1 (and it is set to 2 on line 550) the method simply returns 
null (ie ignoring my pf parameter). Why is this happening?
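
To make the behavior described above concrete, here is a minimal sketch in 
plain Java. It is a simplified model, not the Solr source: the class, enum and 
method names are hypothetical, and it assumes that a single analyzed token 
yields a term query while several tokens yield a phrase query, and that a 
phrase query is kept only when it has at least minClauseSize terms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the Lucene/Solr classes.
class PhraseGuardDemo {

    enum Kind { TERM, PHRASE }

    // Assumption: one analyzed token becomes a term query,
    // several tokens become a phrase query.
    static Kind kindFor(List<String> tokens) {
        return tokens.size() > 1 ? Kind.PHRASE : Kind.TERM;
    }

    // Models the guard as described in the message: a query that is not a
    // phrase query is dropped when minClauseSize > 1.
    // Assumption: a phrase query is kept only if it has at least minClauseSize terms.
    static boolean boostSurvives(List<String> tokens, int minClauseSize) {
        if (kindFor(tokens) == Kind.PHRASE) {
            return tokens.size() >= minClauseSize;
        }
        return minClauseSize <= 1;
    }

    public static void main(String[] args) {
        int minClauseSize = 2; // the value reported in the message

        // A keyword tokenizer emits the whole input as a single token.
        List<String> keywordTokens = Collections.singletonList("some words");
        // A word-splitting tokenizer emits one token per word.
        List<String> wordTokens = Arrays.asList("some", "words");

        System.out.println("keyword-tokenized pf kept: " + boostSurvives(keywordTokens, minClauseSize));
        System.out.println("word-tokenized pf kept: " + boostSurvives(wordTokens, minClauseSize));
    }
}
```

Under these assumptions, a pf field analyzed with a keyword tokenizer always 
produces a single token, so the clause would be dropped no matter what the 
user typed.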

I use Solr 4.6 by the way... I forgot to mention that in my original message.


-----Original Message-----
From: jimi.hulleg...@svensktnaringsliv.se 
[mailto:jimi.hulleg...@svensktnaringsliv.se] 
Sent: Tuesday, April 5, 2016 5:36 PM
To: solr-user@lucene.apache.org
Subject: RE: Can't get phrase field boosting to work using edismax

OK. Interesting. But... I added a solr.TrimFilterFactory at the end of my 
analyzer definition. Shouldn't that take care of the added space at the end? 
The admin analysis page indicates that it works as it should, but I still can't 
get edismax to boost.

-----Original Message-----
From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
Sent: Tuesday, April 5, 2016 4:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Can't get phrase field boosting to work using edismax

It looks like the code constructing the boost phrase for pf will always add a 
trailing blank, which is never a problem when a normal tokenizer is used that 
removes white space, but the keyword tokenizer will preserve that extra space, 
which prevents an exact match.

See line 531:
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/5.5.0/solr/core/src/java/org/apache/solr/search/ExtendedDismaxQParser.java

I'd say it's a bug, though more of a narrow use case that wasn't considered or 
tested.
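
The trailing-blank effect described above can be sketched in plain Java. This 
is only an illustration: the two analyze methods are hypothetical stand-ins 
for a keyword-style and a whitespace-style analysis chain, not Lucene classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical stand-ins for analysis chains; these are NOT Lucene classes.
class TrailingBlankDemo {

    // Keyword-style analysis: the whole input becomes one lowercased token,
    // so any trailing blank is preserved inside the token.
    static List<String> keywordAnalyze(String input) {
        return Collections.singletonList(input.toLowerCase(Locale.ROOT));
    }

    // Whitespace-style analysis: split on runs of whitespace and drop
    // empty pieces, so a trailing blank disappears.
    static List<String> whitespaceAnalyze(String input) {
        return Arrays.stream(input.toLowerCase(Locale.ROOT).split("\\s+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String indexed = "some words";
        String boostPhrase = "some words "; // note the trailing blank

        System.out.println("keyword match: "
                + keywordAnalyze(indexed).equals(keywordAnalyze(boostPhrase)));
        System.out.println("whitespace match: "
                + whitespaceAnalyze(indexed).equals(whitespaceAnalyze(boostPhrase)));
        System.out.println("keyword match after trim: "
                + keywordAnalyze(indexed).equals(keywordAnalyze(boostPhrase.trim())));
    }
}
```

In this model, only the keyword-style comparison is sensitive to the extra 
space, and trimming the input restores the match.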

-- Jack Krupansky

On Tue, Apr 5, 2016 at 7:50 AM, <jimi.hulleg...@svensktnaringsliv.se> wrote:

> Hi,
>
> I'm trying to boost documents using phrase field boosting (i.e. the pf 
> parameter for edismax), but I can't get it to work (i.e. boosting 
> documents where the pf field matches the query as a phrase).
>
> As far as I can tell, Solr, or more specifically the edismax handler, 
> does *something* when I add this parameter. I know this because the QTime 
> increases from around 5-10 ms to around 30-40 ms, and the score explain 
> structure is *slightly* modified (though with the same final score for 
> all documents). But nowhere in the explain structure can I see 
> anything about the pf, and I can't understand that. Shouldn't it be 
> included in the explain? If not, is there any way to force it to be 
> included somehow?
>
> The query looks something like this:
>
>
> ?q=some+words&rows=10&sort=score+desc&debugQuery=true&fl=objectid,exactTitle,score%2C%5Bexplain+style%3Dtext%5D&qf=title%5E2&qf=swedishText1%5E1&defType=edismax&pf=exactTitle%5E5&wt=xml&indent=true
>
>
> I have one document that has the title "some words", and when I do a 
> simple query filter with exactTitle:"some words" I get a match for 
> that document. So then I would expect that the query above would boost 
> this document, and include information about this in the explain. But 
> nothing like this happens, and I can't understand why.
>
> The field looks like this:
>
> <field name="exactTitle" type="keywordText" indexed="true" stored="true" required="false" multiValued="false" />
>
> And the fieldType looks like this:
>
> <fieldType name="keywordText" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <charFilter class="solr.HTMLStripCharFilterFactory" />
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>   </analyzer>
> </fieldType>
>
>
> I have also tried boosting this document using a boost query, ie 
> bq=exactTitle:"some words", and this works as expected. The document 
> score is boosted, and the explain states this very clearly, with this segment:
>
> [...]
> 9.870669 = (MATCH) weight(exactTitle:some words^5.0 in 12) [DefaultSimilarity], result of:
> [...]
>
> Why is this working, but q=some+words&pf=exactTitle^5 not? Shouldn't 
> edismax rewrite my "pf query" into something very similar to the "bq query"?
>
> Regards
> /Jimi
>
