Re: Omitting tf but not positions

Robert Zotter Fri, 25 Feb 2011 11:06:26 -0800

Jan,

You are correct, you'll need your own Similarity class.

Have a look at SweetSpotSimilarity (http://lucene.apache.org/java/3_0_3/api/contrib-misc/org/apache/lucene/misc/SweetSpotSimilarity.html)
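
To illustrate the "own Similarity class" idea, here is a minimal sketch of overriding idf() so that rare terms no longer get a boost. It assumes the Lucene 3.x default formula, log(numDocs / (docFreq + 1)) + 1. BaseIdfSketch and NoIdfSketch are hypothetical stand-ins so the snippet compiles without the Lucene jar. In a real setup the subclass would presumably extend org.apache.lucene.search.DefaultSimilarity and be registered globally, for example through the <similarity> element in Solr's schema.xml.

```java
// Hypothetical stand-in for Lucene 3.x DefaultSimilarity; only idf() is modeled.
class BaseIdfSketch {
    // Default formula: log(numDocs / (docFreq + 1)) + 1.
    // A term found in very few documents gets a large value.
    public float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }
}

// Hypothetical subclass that removes the IDF effect by returning a constant,
// so every matching term contributes the same IDF factor.
class NoIdfSketch extends BaseIdfSketch {
    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}
```

Returning a constant is the bluntest option. A gentler variant could cap or dampen the default value instead of flattening it, at the cost of one more tuning knob. Either way the change applies to every field, not just the listing field.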


On 2/25/11 10:57 AM, Jan Høydahl wrote:

I also have a case (yellow-page) where IDF comes in and destroys the rank.
A company listing with a word which occurs in few other listings is not 
necessarily better than others just because of that. When it gets to the 
extreme value of IDF=1, we get an artificially high IDF boost.

It is not killed by omitNorms, nor by omitTermFrequencyAndPositions. Any 
per-field way to get rid of the IDF effect?
Or should I override idf() in Similarity?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 15 Dec 2010, at 13:27, Robert Muir wrote:

On Wed, Dec 15, 2010 at 3:09 AM, Jan Høydahl / Cominvent
<jan....@cominvent.com>  wrote:

Any way to disable TF/IDF normalization without also disabling positions?

see Similarity.tf(float) and Similarity.tf(int)

if you want to change this for both terms and phrases just override
Similarity.tf(float), since by default Similarity.tf(int) delegates to
that.
otherwise, override both.
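
A minimal sketch of the delegation described above, assuming the Lucene 3.x defaults (tf(int) calls tf(float), which returns sqrt(freq)). BaseTfSketch and FlatTfSketch are hypothetical stand-ins so the snippet compiles without the Lucene jar; a real implementation would presumably extend org.apache.lucene.search.DefaultSimilarity instead.

```java
// Hypothetical stand-in mirroring the two tf() overloads of Lucene 3.x Similarity.
class BaseTfSketch {
    // Integer overload: by default it simply delegates to tf(float).
    public float tf(int freq) {
        return tf((float) freq);
    }

    // Float overload: the default implementation is sqrt(freq).
    public float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

// Overriding only tf(float) changes both paths, because tf(int) delegates to it.
// Any positive frequency scores 1, so repeated occurrences no longer matter,
// while positions stay in the index for phrase queries.
class FlatTfSketch extends BaseTfSketch {
    @Override
    public float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}
```

With this sketch, new FlatTfSketch().tf(4) goes through the inherited tf(int) and lands in the overridden tf(float). If terms and phrases should be treated differently, both overloads would need their own override.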

of course the big limitation being you can't customize Similarity per-field yet.
