Alan Woodward created LUCENE-8633:
-------------------------------------

             Summary: Remove term weighting from interval scoring
                 Key: LUCENE-8633
                 URL: https://issues.apache.org/jira/browse/LUCENE-8633
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Alan Woodward
            Assignee: Alan Woodward
         Attachments: LUCENE-8633.patch

IntervalScorer currently uses the same scoring mechanism as SpanScorer: it sums 
the IDF of every term that could possibly match from its parent IntervalsSource 
and combines that with a sloppy frequency to produce a similarity-based score.  
This doesn't really make sense, however: terms that don't appear in a document 
can still contribute to its score, and the result makes scores from interval 
queries look comparable with scores from term or phrase queries when they 
really aren't.

I'd like to explore a different scoring mechanism for intervals, based purely 
on sloppy frequency and ignoring term weighting.  This should make the scores 
easier to reason about, as well as making them useful for things like proximity 
boosting on boolean queries.
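
To make the idea concrete, here is a minimal sketch of a score that depends 
only on sloppy frequency.  It is not the attached patch and not Lucene's API: 
the class name, the Interval type, the 1/width contribution per interval and 
the freq / (1 + freq) saturation are all assumptions chosen for illustration.

```java
import java.util.List;

/** Hypothetical illustration only; not Lucene code. */
public class SloppyIntervalScore {

    /** A matched interval with inclusive start and end positions (assumed type). */
    public static final class Interval {
        final int start;
        final int end;

        public Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Sums a contribution of 1 / width for each matched interval, so that
     * narrower (tighter) matches count for more. No IDF or other term
     * statistics are consulted.
     */
    public static double sloppyFreq(List<Interval> intervals) {
        double freq = 0.0;
        for (Interval i : intervals) {
            int width = i.end - i.start + 1;
            freq += 1.0 / width;
        }
        return freq;
    }

    /**
     * Maps the sloppy frequency into the range [0, boost) with a simple
     * saturation curve (an assumed choice, not taken from the issue).
     */
    public static double score(List<Interval> intervals, double boost) {
        double freq = sloppyFreq(intervals);
        return boost * freq / (1.0 + freq);
    }
}
```

Because the result depends only on the intervals actually matched in the 
document and on the boost, a term that is absent from the document cannot 
contribute, and the bounded range is what would make such a score usable as a 
proximity boost alongside other clauses.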



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
