Now that Otis reminded me that this thread existed (I've got a brain like a sieve these days, I tell you)...
On Fri, Nov 20, 2009 at 10:08 AM, Grant Ingersoll <gsing...@apache.org> wrote:

>
> -1 from me, even though it's confusing, because having that call there
> (somewhere, at least) allows you to actually do compare scores across
> queries if you do the extra work of properly normalizing the documents as
> well (at index time).
>
> Do you have some references on this? I'm interested in reading more on the
> subject. I've never quite been sold on how it is meaningful to compare
> scores and would like to read more opinions.
>

So I couldn't find any really good papers on this specifically, but I seem to remember seeing this stuff done a lot in Manning and Schütze's IR book: they go over training field boosts with logistic regression and all that, but they don't specifically look at the Lucene case (although they consider similar scoring functions). They must talk about the necessity of comparable scores to do this, I'm sure.

-jake