Of course, IDF is a factor too, meaning a match on a single term that is rare
in the overall index may be worth more than a match on two different terms that
are common in the index.
As Ian suggests, a custom Similarity implementation can be used to tune this out.
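
To make the IDF point concrete, here is a rough sketch in plain Java (no Lucene
dependency). It assumes the DefaultSimilarity formulas idf = 1 + ln(numDocs /
(docFreq + 1)) and coord = overlap / maxOverlap, with every matched term
occurring once (tf = 1) and norms off. The index size and document frequencies
are made-up numbers, the class and method names are hypothetical, and queryNorm
is left out because it is the same for every document and does not affect the
ordering.

```java
final class IdfVersusCoordSketch {

    // idf as in DefaultSimilarity, or a constant 1.0 to mimic a custom
    // Similarity that tunes IDF out.
    static double idf(int docFreq, int numDocs, boolean flatIdf) {
        return flatIdf ? 1.0 : 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Unnormalized score: coord * sum(idf^2) over the matched terms (tf = 1).
    static double score(int[] matchedDocFreqs, int queryTermCount,
                        int numDocs, boolean flatIdf) {
        double sum = 0.0;
        for (int docFreq : matchedDocFreqs) {
            double weight = idf(docFreq, numDocs, flatIdf);
            sum += weight * weight;
        }
        double coord = (double) matchedDocFreqs.length / queryTermCount;
        return coord * sum;
    }

    public static void main(String[] args) {
        int numDocs = 1000;              // hypothetical index size
        int[] oneRareTerm = {1};         // doc matches one term with docFreq 1
        int[] twoCommonTerms = {500, 500}; // doc matches two terms with docFreq 500

        for (boolean flatIdf : new boolean[] {false, true}) {
            System.out.println("flatIdf=" + flatIdf
                + " oneRare=" + score(oneRareTerm, 3, numDocs, flatIdf)
                + " twoCommon=" + score(twoCommonTerms, 3, numDocs, flatIdf));
        }
    }
}
```

With the default idf the single rare match comes out ahead despite its lower
coord; with idf flattened to 1 the document matching two terms wins.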


----- Original Message ----
From: Ian Lea <ian....@gmail.com>
To: java-user@lucene.apache.org
Sent: Thu, 19 May, 2011 14:36:56
Subject: Re: Ranking docs with all terms higher

A little test shows that Mike is correct and Lucene does already do this.

With norms (default)
nacho foo bar, score=0.8660254
foo bar bar, score=0.46461558
nacho nacho nacho nacho, score=0.19245009

Without norms
nacho foo bar, score=1.7320508
foo bar bar, score=0.92923117
nacho nacho nacho nacho, score=0.38490018

Lucene 3.1.0, nothing else in the index.
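
For anyone wanting to check the "Without norms" numbers by hand, the sketch
below recomputes them in plain Java (no Lucene dependency). It assumes the
DefaultSimilarity formulas: tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq +
1)), queryNorm = 1 / sqrt(sum of idf^2), coord = overlap / maxOverlap, with all
boosts at 1 and only the three documents from the original question in the
index. The class, method and constant names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CoordScoreSketch {

    static final List<String> QUERY = Arrays.asList("nacho", "foo", "bar");
    static final List<String> DOC_A = Arrays.asList("nacho", "nacho", "nacho", "nacho");
    static final List<String> DOC_B = Arrays.asList("foo", "bar", "bar");
    static final List<String> DOC_C = Arrays.asList("nacho", "foo", "bar");
    static final List<List<String>> INDEX = Arrays.asList(DOC_A, DOC_B, DOC_C);

    // idf = 1 + ln(numDocs / (docFreq + 1)), counting docFreq over the tiny index.
    static double idf(String term, List<List<String>> index) {
        int docFreq = 0;
        for (List<String> doc : index) {
            if (doc.contains(term)) {
                docFreq++;
            }
        }
        return 1.0 + Math.log((double) index.size() / (docFreq + 1));
    }

    // coord * queryNorm * sum(tf * idf^2) over the query terms, norms off.
    static double score(List<String> query, List<String> doc, List<List<String>> index) {
        double sumOfSquaredWeights = 0.0;
        double sum = 0.0;
        int overlap = 0;
        for (String term : query) {
            double idf = idf(term, index);
            sumOfSquaredWeights += idf * idf;
            int freq = Collections.frequency(doc, term);
            if (freq > 0) {
                overlap++;
                sum += Math.sqrt(freq) * idf * idf;
            }
        }
        double queryNorm = 1.0 / Math.sqrt(sumOfSquaredWeights);
        double coord = (double) overlap / query.size();
        return coord * queryNorm * sum;
    }

    public static void main(String[] args) {
        System.out.println("nacho foo bar, score=" + score(QUERY, DOC_C, INDEX));
        System.out.println("foo bar bar, score=" + score(QUERY, DOC_B, INDEX));
        System.out.println("nacho nacho nacho nacho, score=" + score(QUERY, DOC_A, INDEX));
    }
}
```

Every term appears in two of the three documents, so idf is 1 across the board
and the ordering comes down to coord and tf alone. The results agree with the
figures above up to float rounding (roughly 1.732, 0.929 and 0.385).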


If you need to, you can tweak the scoring by providing a custom
implementation of oal.search.Similarity.


--
Ian.


On Thu, May 19, 2011 at 1:34 PM, Michael McCandless
<luc...@mikemccandless.com> wrote:
> I believe Lucene already does this, with the 'coord' factor in
> BooleanQuery, which is on by default (ie, if you just "new
> BooleanQuery()").
>
> Ie your doc c will get a coord factor of 1.0, doc b gets 0.666..., doc
> a gets 0.3333.
>
> That said, if the term freq is high enough (ie doc a has nacho 4
> times), that may give it a high enough score to overcome its coord
> disadvantage (I'm not sure).
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Wed, May 18, 2011 at 10:14 PM, Christopher Condit <con...@sdsc.edu> wrote:
>> Let's say I have the query
>> (nacho OR foo OR bar)
>>
>> and some documents (single field with norms off)
>> doc a: nacho nacho nacho nacho
>> doc b: foo bar bar
>> doc c: nacho foo bar
>>
>> I'm interested in all of these documents but I would like c to score the
>> highest since it contains all of the search terms, b to score second
>> because it has two, and a to score the third because it has one. What's
>> the best way to get Lucene to do this?
>>
>> Thanks,
>> -Chris
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
