[ 
https://issues.apache.org/jira/browse/LUCENE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953518#comment-14953518
 ] 

Paul Elschot commented on LUCENE-6276:
--------------------------------------

bq.  it would make more sense to sum up totalTermFreq/docFreq for each term 

I'll change that and change the matchCost() method to return a float instead of 
a long.

bq. TermStatistics.totalTermFreq() may return -1

I'll add a check for that.
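
A minimal sketch of the two changes above, assuming a per-term average of 
totalTermFreq/docFreq summed over the phrase terms. TermStat, 
PhraseMatchCostSketch and the fallback constant are hypothetical names for 
illustration only; this is not the patch and not Lucene's TermStatistics API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for per-term index statistics (not a Lucene class).
final class TermStat {
    final long docFreq;        // number of documents containing the term
    final long totalTermFreq;  // total occurrences, or -1 when not available

    TermStat(long docFreq, long totalTermFreq) {
        this.docFreq = docFreq;
        this.totalTermFreq = totalTermFreq;
    }
}

final class PhraseMatchCostSketch {
    // Assumption: when statistics are missing, count one position per document.
    static final float DEFAULT_POSITIONS_PER_DOC = 1.0f;

    // Sums the average number of positions per matching document for each term.
    // Returns a float rather than a long, so fractional averages are kept.
    static float matchCost(List<TermStat> terms) {
        float cost = 0f;
        for (TermStat t : terms) {
            if (t.totalTermFreq < 0 || t.docFreq <= 0) {
                // totalTermFreq may be -1: fall back to the assumed default.
                cost += DEFAULT_POSITIONS_PER_DOC;
            } else {
                cost += (float) t.totalTermFreq / (float) t.docFreq;
            }
        }
        return cost;
    }

    public static void main(String[] args) {
        List<TermStat> terms = Arrays.asList(
                new TermStat(10, 25),   // 2.5 positions per document on average
                new TermStat(4, -1));   // statistics unavailable
        System.out.println(matchCost(terms));
    }
}
```

The fallback value is a guess; any real implementation would need to pick 
something sensible for fields indexed without frequencies.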

bq. what definition we should give to matchCost()

I'd like it to reflect the average cost of processing a single document once 
the two-phase iterator is positioned on that document.
That would exclude the cost of next() and advance(), which is better accounted 
for in the DISI.cost() method for now.

I don't know yet how much of the cost of matches() should be included; we'll 
see. NearSpans also does work after matches() returns true.

And the likelihood of a match is the probability that matches() returns true...

> Add matchCost() api to TwoPhaseDocIdSetIterator
> -----------------------------------------------
>
>                 Key: LUCENE-6276
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6276
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-6276-ExactPhraseOnly.patch
>
>
> We could add a method like TwoPhaseDISI.matchCost() defined as something like 
> an estimate of nanoseconds or similar. 
> ConjunctionScorer could use this method to sort its 'twoPhaseIterators' array 
> so that cheaper ones are called first. Today it has no idea if one scorer is 
> a simple phrase scorer on a short field vs another that might do some geo 
> calculation or more expensive stuff.
> PhraseScorers could implement this based on index statistics (e.g. 
> totalTermFreq/maxDoc)
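
A rough sketch of the ordering idea in the description above, assuming each 
two-phase iterator exposes a float matchCost(). CostedTwoPhase and 
ConjunctionOrderSketch are hypothetical names used only for illustration; 
they are not Lucene classes and this is not how ConjunctionScorer is written.

```java
import java.util.Arrays;

// Hypothetical minimal view of a two-phase iterator (not Lucene's class).
interface CostedTwoPhase {
    float matchCost();   // estimated cost of one matches() call
    boolean matches();   // the expensive per-document check
}

final class ConjunctionOrderSketch {
    // Sorts in place so that the cheapest confirmation runs first.
    static void sortByMatchCost(CostedTwoPhase[] iterators) {
        Arrays.sort(iterators,
                (a, b) -> Float.compare(a.matchCost(), b.matchCost()));
    }

    // Checks the iterators in array order and stops at the first failure,
    // so an expensive check is skipped whenever a cheaper one rejects.
    static boolean allMatch(CostedTwoPhase[] sorted) {
        for (CostedTwoPhase it : sorted) {
            if (!it.matches()) {
                return false;
            }
        }
        return true;
    }

    // Test helper: an iterator with a fixed cost and a fixed answer.
    static CostedTwoPhase fixed(final float cost, final boolean result) {
        return new CostedTwoPhase() {
            @Override public float matchCost() { return cost; }
            @Override public boolean matches() { return result; }
        };
    }
}
```

Sorting purely by cost ignores how likely each iterator is to reject a 
document; whether that should be factored in is left open here.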



