a "fair" similarity

2006-08-14 Thread Daniel Naber
Hi, as some of you may have noticed, Lucene prefers shorter documents over longer ones, i.e. shorter documents get a higher ranking, even if the ratio "matched terms / total terms in document" is the same. For example, take these two artificial documents: doc1: x 2 3 4 5 6 7 8 9 10 doc2: x x 3

Re: a "fair" similarity

2008-01-21 Thread Fabrice Robini
Hi, I've tried this "fair" similarity with Lucene 2.2 but it does not seem to work. I've attached the custom "MyFair" similarity to both IndexWriter and IndexSearcher. Do you have any idea? Thanks a lot, Fabrice Daniel Naber-5 wrote: > Hi, as some of you may have noticed, Lucene p

Re: a "fair" similarity

2006-11-21 Thread Bob Carpenter
Michael D. Curtin wrote: > Daniel Naber wrote: >> Hi, as some of you may have noticed, Lucene prefers shorter documents over longer ones, i.e. shorter documents get a higher ranking, even if the ratio "matched terms / total terms in document" is the same. There are even more interesting kinds of

Re: a "fair" similarity

2006-08-14 Thread Michael D. Curtin
Daniel Naber wrote: Hi, as some of you may have noticed, Lucene prefers shorter documents over longer ones, i.e. shorter documents get a higher ranking, even if the ratio "matched terms / total terms in document" is the same. For example, take these two artificial documents: doc1: x 2 3 4 5