Re: highlight - scoring fragments with more of the same token

markharw00d Tue, 26 Sep 2006 15:30:34 -0700

I was somewhat surprised to find that highlighting scoring simply counts
how many unique query terms appear in the fragment. Guess was expecting a


See the QueryScorer(Query query, IndexReader reader, String fieldName) constructor 
- this will factor IDF into the weighting for terms. Query boosts are automatically 
factored in too.
TF is not a factor in fragment scores because I found it's typically more useful 
to look for fragments containing a strong mix of the query terms - not merely 
repetitions of the same term. The idea is that the choice of scorer is pluggable if 
you don't like the default behaviour.
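
To make the scoring idea concrete, here is a minimal stand-alone sketch in Java. It is not the highlighter's actual code: the class name, the naive tokenizer and the IDF formula are all assumptions made for illustration. It only shows the principle described above, where each distinct query term found in a fragment contributes its weight once and repeats of the same term add nothing.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone sketch; not the Lucene highlighter's implementation.
class FragmentScoreSketch {

    /**
     * Scores a fragment by summing the weight of each DISTINCT query term it
     * contains. A term that occurs several times is counted once, so term
     * frequency (TF) has no influence on the result.
     */
    static double score(String fragment, Map<String, Double> termWeights) {
        Set<String> seen = new HashSet<>();
        double total = 0.0;
        // Naive tokenizer (assumption): lower-case, split on non-letter/digit runs.
        for (String token : fragment.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            Double weight = termWeights.get(token);
            // seen.add returns false for a repeated term, so it is skipped.
            if (weight != null && seen.add(token)) {
                total += weight;
            }
        }
        return total;
    }

    /**
     * One common IDF form, ln(numDocs / (docFreq + 1)) + 1, used here as an
     * assumption: rarer terms get a larger weight.
     */
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }
}
```

With weights built from idf(...) (optionally multiplied by a boost), a fragment that mixes a rare and a common query term outscores one that merely repeats the common term many times.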

The Fragmenter interface also leaves room for smarter fragmenting - no "smarter" 
alternatives to the simple one have been implemented as yet though (as far as I 
am aware).
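
As a rough illustration of what "smarter" fragmenting could mean, the sketch below cuts fragments only at sentence boundaries once a target size has been reached. It is a stand-alone example built on the JDK's java.text.BreakIterator and does not implement the highlighter's Fragmenter interface; the class name, method name and targetSize parameter are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of sentence-aware fragmenting; not part of Lucene.
class SentenceFragmenterSketch {

    /**
     * Splits text into fragments of at least targetSize characters, cutting
     * only at sentence boundaries. The final fragment may be shorter.
     */
    static List<String> fragment(String text, int targetSize) {
        List<String> fragments = new ArrayList<>();
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        sentences.setText(text);
        int fragmentStart = 0;
        // Walk the sentence boundaries; close a fragment once it is long enough.
        for (int end = sentences.next(); end != BreakIterator.DONE; end = sentences.next()) {
            if (end - fragmentStart >= targetSize) {
                fragments.add(text.substring(fragmentStart, end).trim());
                fragmentStart = end;
            }
        }
        // Keep any remaining text as a last, possibly shorter, fragment.
        String rest = text.substring(fragmentStart).trim();
        if (!rest.isEmpty()) {
            fragments.add(rest);
        }
        return fragments;
    }
}
```

Each fragment produced this way could then be passed to whichever scorer is plugged in.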

Cheers
Mark
