Usefulness of Similarity.queryNorm()

2008-02-12 Thread Marvin Humphrey
Greets, What would the consequences be of eliminating Similarity.queryNorm()? I cargo-culted that method when porting, but now I'm going through and trying to refactor for simplicity's sake. If I can zap it, I'd like to. First, the theoretical angle: According to the Similarity docs, que

Re: Usefulness of Similarity.queryNorm()

2008-02-12 Thread Marvin Humphrey
On Feb 12, 2008, at 9:08 AM, Marvin Humphrey wrote: What would the consequences be of eliminating Similarity.queryNorm()? I cargo-culted that method when porting, but now I'm going through and trying to refactor for simplicity's sake. If I can zap it, I'd like to. I infer from the deaf

Re: Usefulness of Similarity.queryNorm()

2008-02-12 Thread Grant Ingersoll
:-) I don't know a lot about it, but my understanding has always been that comparing across queries is difficult at best, so that would argue for removing it, but I haven't done any research into it. I think it has been in Lucene for a good long time, so it may be that the history of why

Re: Usefulness of Similarity.queryNorm()

2008-02-12 Thread Marvin Humphrey
On Feb 12, 2008, at 5:04 PM, Grant Ingersoll wrote: I don't know a lot about it, but my understanding has always been that comparing across queries is difficult at best, so that would argue for removing it, but I haven't done any research into it. I think it has been in Lucene for a good

Re: Usefulness of Similarity.queryNorm()

2008-02-12 Thread Paul Elschot
On Wednesday 13 February 2008 04:48:31, Marvin Humphrey wrote: ... > Heck, I'd love to eliminate ALL the automatic normalization code... if only I could figure out what all the hidden side effects are. :( > My goal is to de-voodoofy the Query-Weight-Scorer compilation phase so th

Re: Usefulness of Similarity.queryNorm()

2008-02-13 Thread Chris Hostetter
: It's the *same* coefficient for all sub-clauses, so it shouldn't affect : rankings, BUT... relative rankings *will* be affected if some inner clauses : have custom boost values. For things like ConstantScoreQuery and BoostingQuery or any user created Query classes that try to return specific,

Re: Usefulness of Similarity.queryNorm()

2008-02-13 Thread Michele Bini
Chris Hostetter wrote: The tf(), idf(), lengthNorm() and queryNorm() are directly from the cosine measure, although lengthNorm()'s default implementation uses an approximation. As I actually found normalized query scores quite useful I decided to exit my usual lurk-mode :) I integrated luce

Re: Usefulness of Similarity.queryNorm()

2009-09-07 Thread Mark Miller
I think I finally cracked this one: https://issues.apache.org/jira/browse/LUCENE-1896 The usefulness is as it says - to allow the scores from queries to be *more* comparable. The value of *more* is less clear. Not sure it's so valuable myself. - mark
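
For readers following the thread, here is a minimal sketch of the point that a single coefficient applied to every score in one query leaves the ranking unchanged. It assumes the query norm is computed as 1 / sqrt(sum of squared weights), which is my understanding of Lucene's default Similarity at the time. The function names, term weights, and document scores below are hypothetical and exist only for illustration; this is Python, not Lucene's API.

```python
import math


def query_norm(sum_of_squared_weights: float) -> float:
    """Hypothetical stand-in for queryNorm(): 1 / sqrt(sum of squared weights).

    Assumption: this mirrors the default formula; it is not Lucene code.
    """
    return 1.0 / math.sqrt(sum_of_squared_weights)


def rank(scores: dict[str, float]) -> list[str]:
    """Return document ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative values only, not taken from the thread.
term_weights = [2.0, 1.5]
raw_scores = {"doc_a": 4.0, "doc_b": 1.0, "doc_c": 2.5}

# One coefficient for the whole query, applied to every document's score.
norm = query_norm(sum(w * w for w in term_weights))
normalized = {doc: score * norm for doc, score in raw_scores.items()}

# The absolute scores change, but the ordering does not.
print(rank(raw_scores))
print(rank(normalized))
```

Multiplying every score by the same positive factor rescales the values without reordering them, which is why the norm matters only when scores are compared across queries rather than within one.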