Re: Question about field boost

Erick Erickson Tue, 23 Jul 2013 05:16:01 -0700

This isn't doing what you think.
title^10 content
is actually parsed as

text:title^10 text:content

where "text" is my default search field (assuming
title is a field). If you look a little farther up
the debug output you'll see that.

You probably want
title:content^100 or some such?

Erick
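
For reference, the request parameters discussed in this thread can be assembled so that the boost travels in qf rather than in the query text. This is only a sketch: Python is used purely for illustration, the variable names are made up here, and it assumes the dismax parser from the original question (defType=dismax) plus the debugQuery=true flag Jack suggests further down.

```python
from urllib.parse import urlencode, parse_qs

# Parameters taken from the thread; q holds only the search terms,
# while the per-field boosts live in the separate qf parameter.
params = {
    "defType": "dismax",        # from the original question
    "q": "china snowden",       # terms only, no field names or boosts
    "qf": "title^10 content",   # query fields with the title boost
    "debugQuery": "true",       # ask Solr for the explain output
}

# urlencode percent-encodes "^" and turns spaces into "+".
query_string = urlencode(params)

# Round-trip check: decoding gives back the original qf value.
decoded_qf = parse_qs(query_string)["qf"][0]
```

Encoding the parameters this way keeps "title^10 content" intact as a single qf value instead of leaving the space and caret to whatever the client or shell does with them.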

On Tue, Jul 23, 2013 at 1:43 AM, Jack Krupansky <j...@basetechnology.com> wrote:
> That means that, for that document, "china" occurs in the title, whereas
> "snowden" is found in the document but not in the title.
>
>
> -- Jack Krupansky
>
> -----Original Message----- From: Joe Zhang
> Sent: Tuesday, July 23, 2013 12:52 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Question about field boost
>
>
> Is my reading correct that the boost is only applied on "china" but not
> "snowden"? How can that be?
>
> My query is: q=china+snowden&qf=title^10 content
>
>
> On Mon, Jul 22, 2013 at 9:43 PM, Joe Zhang <smartag...@gmail.com> wrote:
>
>> Thanks for your hint, Jack. Here are the debug results, which I'm having a
>> hard time deciphering (the two terms are "china" and "snowden")...
>>
>> 0.26839527 = (MATCH) sum of:
>>   0.26839527 = (MATCH) sum of:
>>     0.26757246 = (MATCH) max of:
>>       7.9147343E-4 = (MATCH) weight(content:china in 249), product of:
>>         0.019873314 = queryWeight(content:china), product of:
>>           1.6649085 = idf(docFreq=46832, maxDocs=91058)
>>           0.01193658 = queryNorm
>>         0.039825942 = (MATCH) fieldWeight(content:china in 249), product of:
>>           4.8989797 = tf(termFreq(content:china)=24)
>>           1.6649085 = idf(docFreq=46832, maxDocs=91058)
>>           0.0048828125 = fieldNorm(field=content, doc=249)
>>       0.26757246 = (MATCH) weight(title:china^10.0 in 249), product of:
>>         0.5836803 = queryWeight(title:china^10.0), product of:
>>           10.0 = boost
>>           4.8898454 = idf(docFreq=1861, maxDocs=91058)
>>           0.01193658 = queryNorm
>>         0.45842302 = (MATCH) fieldWeight(title:china in 249), product of:
>>           1.0 = tf(termFreq(title:china)=1)
>>           4.8898454 = idf(docFreq=1861, maxDocs=91058)
>>           0.09375 = fieldNorm(field=title, doc=249)
>>     8.2282536E-4 = (MATCH) max of:
>>       8.2282536E-4 = (MATCH) weight(content:snowden in 249), product of:
>>         0.03407834 = queryWeight(content:snowden), product of:
>>           2.8549502 = idf(docFreq=14246, maxDocs=91058)
>>           0.01193658 = queryNorm
>>         0.024145111 = (MATCH) fieldWeight(content:snowden in 249), product of:
>>           1.7320508 = tf(termFreq(content:snowden)=3)
>>           2.8549502 = idf(docFreq=14246, maxDocs=91058)
>>           0.0048828125 = fieldNorm(field=content, doc=249)
>>
>>
>> On Mon, Jul 22, 2013 at 9:27 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>>
>>> Maybe you're not doing anything wrong - other than having an artificial
>>> expectation of what the true relevance of your data actually is. Many
>>> factors go into relevance scoring. You need to look at all aspects of
>>> your data.
>>>
>>> Maybe your terms don't occur in your titles the way you think they do.
>>>
>>> Maybe you need a boost of 500 or more...
>>>
>>> Lots of potential maybes.
>>>
>>> Relevancy tuning is an art and craft, hardly a science.
>>>
>>> Step one: Know your data, inside and out.
>>>
>>> Use the debugQuery=true parameter on your queries and see how much of the
>>> score is dominated by your query terms in the non-title fields.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Joe Zhang
>>> Sent: Monday, July 22, 2013 11:06 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Question about field boost
>>>
>>>
>>> Dear Solr experts:
>>>
>>> Here is my query:
>>>
>>> defType=dismax&q=term1+term2&qf=title^100 content
>>>
>>> Apparently (at least I thought) my intention is to boost the title field.
>>> While I'm getting some non-trivial results, I'm surprised that the
>>> documents with both term1 and term2 in the title (I know such docs do
>>> exist in my repository) were not returned (or maybe ranked very low).
>>> The situation does not change even when I use much larger boost factors.
>>>
>>> What am I doing wrong?
>>>
>>
>>
>
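
A note on reading the explain output quoted above: the top-level score can be re-derived by hand from the numbers it prints. The sketch below uses Python purely for illustration (the thread contains no code), and the function and variable names are made up here. It assumes only the structure the output itself shows: each clause is queryWeight times fieldWeight, each term takes the max over the fields it matched in, and the per-term results are summed.

```python
import math

# All numbers are copied from the explain output for doc 249.
QUERY_NORM = 0.01193658

def clause_weight(tf, idf, field_norm, boost=1.0):
    """Score of one term-in-field clause: queryWeight * fieldWeight."""
    query_weight = boost * idf * QUERY_NORM   # "queryWeight(...), product of"
    field_weight = tf * idf * field_norm      # "fieldWeight(...), product of"
    return query_weight * field_weight

# "china" matched in both fields; only the title clause carries the boost.
content_china = clause_weight(tf=4.8989797, idf=1.6649085,
                              field_norm=0.0048828125)
title_china = clause_weight(tf=1.0, idf=4.8898454,
                            field_norm=0.09375, boost=10.0)

# "snowden" matched in content only, so there is no title clause for it.
content_snowden = clause_weight(tf=1.7320508, idf=2.8549502,
                                field_norm=0.0048828125)

# "max of" per term, then "sum of" across terms.
china = max(content_china, title_china)
snowden = content_snowden
total = china + snowden

# Compare against the top-level score Solr reported (float32 rounding
# in the printed values means the match is approximate).
matches_reported = math.isclose(total, 0.26839527, abs_tol=1e-6)
```

For this document the title clause for "china" carries the 10.0 boost and dominates the total. "snowden" has only a content clause because, as Jack notes, it did not match in the title of doc 249, so there is no title clause for the boost to apply to.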
