On Nov 20, 2009, at 1:24 PM, Jake Mannix wrote:

> 
> On Fri, Nov 20, 2009 at 10:08 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>> I should add in my $0.02 on whether to just get rid of queryNorm() 
>> altogether: 
>> 
>>   -1 from me, even though it's confusing, because having that call there 
>> (somewhere, at least) allows you to actually compare scores across 
>> queries if you do the extra work of properly normalizing the documents as 
>> well (at index time).
> 
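
Aside, for anyone following along: the two normalization hooks in question
live on Lucene's Similarity class.  lengthNorm() is computed at index time and
baked into the field norms, while queryNorm() is computed once per query.  A
rough sketch of where each one lives, assuming the 2.9-era API (the class name
is made up, and the method bodies just restate the stock DefaultSimilarity
formulas):

import org.apache.lucene.search.DefaultSimilarity;

// Sketch only: the bodies below are the DefaultSimilarity defaults, written
// out to make explicit which normalization happens where.
public class ComparableScoreSimilarity extends DefaultSimilarity {

  // Index-time document normalization: written into the field norms, so
  // changing it means re-indexing.
  @Override
  public float lengthNorm(String fieldName, int numTokens) {
    return (float) (1.0 / Math.sqrt(numTokens));
  }

  // Query-time normalization: a single factor per query, which is what puts
  // scores from different queries on a roughly comparable scale.
  @Override
  public float queryNorm(float sumOfSquaredWeights) {
    return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
  }
}
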
> Do you have some references on this?  I'm interested in reading more on the 
> subject.  I've never quite been sold on how it is meaningful to compare 
> scores and would like to read more opinions.
>  
> References on how people do this *with Lucene*, or just how this is done in 
> general? 

in general.  Academic references, etc.

> There are lots of papers on fancy things which can be done, but I'm not sure 
> where to point you to start out.  The technique I'm referring to is really 
> just the simplest possible thing beyond setting your weights "by hand".  Say 
> you have a boolean OR query, Q, built up out of sub-queries q_i (hitting, 
> for starters, different fields, although you can overlap as well with some 
> more work), each with a weight (boost) b_i.  If you have a training corpus 
> (good matches, bad matches, or ranked lists of matches in order of relevance 
> for the queries at hand), *and* the scores at the q_i level are comparable, 
> then you can run a simple regression (linear or logistic, depending on 
> whether you map your final scores to a logit or not) to fit the b_i, i.e. 
> the best boosts to use.  What is critical here is that scores from different 
> queries are comparable.  If they're not, then queries where the best 
> possible score is only 0.5 dominate the training relative to queries where 
> the best document already scores 2.0: you're training to increase the scores 
> of matching documents, so the system keeps raising boosts to push that 
> 0.5-scoring document higher, while the matches already scoring 2.0 don't 
> need any more boosting.
> 

This makes sense mathematically, assuming scores are comparable.  What I would 
like to get at is why anyone thinks scores are comparable across queries to 
begin with.  I agree it is beneficial in some cases (as you described) if they 
are.  Probably a question better suited for an academic IR list...
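
For concreteness, here is my rough understanding of the fitting step you
describe, as a sketch in plain Java (nothing Lucene-specific; the scores,
labels, learning rate and class name are all made up): each training example
is the vector of sub-query scores for one (query, document) pair plus a
relevance label, and the b_i are fit by logistic regression so that
sum_i b_i * s_i separates the good matches from the bad ones.  The
comparability requirement shows up because a single set of b_i has to work
across every query in the training set:

// Sketch: fit the per-sub-query boosts b_i by logistic regression over
// (sub-query scores, relevance label) training pairs.  Hypothetical data;
// batch gradient descent, no intercept term, kept deliberately simple.
public class BoostFitter {

  public static void main(String[] args) {
    // Each row: scores of the sub-queries q_i (say title, body, anchor text)
    // for one (query, document) pair.  Values are invented.
    double[][] scores = {
      {0.9, 0.2, 0.1},   // a relevant result
      {0.7, 0.6, 0.0},   // a relevant result
      {0.1, 0.8, 0.2},   // a non-relevant result
      {0.2, 0.1, 0.9},   // a non-relevant result
    };
    double[] labels = {1, 1, 0, 0};

    double[] boosts = fit(scores, labels, 0.1, 2000);
    for (int i = 0; i < boosts.length; i++) {
      System.out.println("b_" + i + " = " + boosts[i]);
    }
  }

  // Logistic regression via batch gradient descent; returns the boosts b_i.
  static double[] fit(double[][] x, double[] y, double rate, int iterations) {
    int n = x.length, d = x[0].length;
    double[] b = new double[d];                       // start all boosts at 0
    for (int it = 0; it < iterations; it++) {
      double[] grad = new double[d];
      for (int row = 0; row < n; row++) {
        double dot = 0.0;
        for (int j = 0; j < d; j++) dot += b[j] * x[row][j];
        double p = 1.0 / (1.0 + Math.exp(-dot));      // predicted relevance
        for (int j = 0; j < d; j++) grad[j] += (p - y[row]) * x[row][j];
      }
      for (int j = 0; j < d; j++) b[j] -= rate * grad[j] / n;
    }
    return b;
  }
}

If the q_i scores for one query all sit on a 0-to-0.5 scale while another
query's sit on a 0-to-2.0 scale, no single set of b_i fits both well, which is
the distortion you describe.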

> There are of course far more complex "state of the art" training techniques, 
> but someone like Ted would probably be able to give a better list of 
> references on where those are easiest to read about.  But I can try to 
> dredge up some places where I've read about doing this, and post again later 
> if I find any.
> 

