[jira] [Commented] (LUCENE-6276) Add matchCost() api to TwoPhaseDocIdSetIterator

Paul Elschot (JIRA) Sun, 27 Sep 2015 03:59:27 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909685#comment-14909685 ]


Paul Elschot commented on LUCENE-6276:
--------------------------------------

As to whether matchCost() belongs on TwoPhaseIterator or DocIdSetIterator, I 
think this boils down to whether the leading iterator in ConjunctionDISI should 
be chosen using only the expected number of matching docs, or also using the 
totalTermFreq values in some way. This matters for more complex queries, for 
example a conjunction with at least one phrase query or SpanNearQuery.

But for the more complex queries, two-phase approximation is already in place, 
so having matchCost() only in the two-phase code could be enough even for these 
queries.
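
To make the ordering idea concrete, here is a minimal sketch of sorting 
two-phase iterators by their match cost so that the cheapest confirmation runs 
first. It does not use the Lucene classes: the TwoPhase interface, the fixed() 
factory, and the sortByMatchCost() and allMatch() helpers are hypothetical 
stand-ins invented for illustration, and the cost unit is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MatchCostSketch {

    /** Hypothetical stand-in for a two-phase iterator; not the Lucene class. */
    interface TwoPhase {
        /** Estimated cost of one matches() call, in arbitrary units. */
        float matchCost();

        /** Expensive confirmation step for the current candidate document. */
        boolean matches();
    }

    /** Hypothetical factory: an iterator with a fixed cost and a fixed answer. */
    static TwoPhase fixed(final float cost, final boolean result) {
        return new TwoPhase() {
            @Override public float matchCost() { return cost; }
            @Override public boolean matches() { return result; }
        };
    }

    /** Returns a copy of the input ordered by ascending matchCost(). */
    static List<TwoPhase> sortByMatchCost(List<TwoPhase> iterators) {
        List<TwoPhase> sorted = new ArrayList<>(iterators);
        sorted.sort((a, b) -> Float.compare(a.matchCost(), b.matchCost()));
        return sorted;
    }

    /** Confirms a candidate document, stopping at the first failed check. */
    static boolean allMatch(List<TwoPhase> sortedIterators) {
        for (TwoPhase it : sortedIterators) {
            if (!it.matches()) {
                return false; // later, costlier checks are skipped
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<TwoPhase> sorted =
            sortByMatchCost(Arrays.asList(fixed(50f, true), fixed(2f, true)));
        for (TwoPhase t : sorted) {
            System.out.println(t.matchCost());
        }
        System.out.println(allMatch(sorted));
    }
}
```

The sort only affects the order of the confirmation step. Choosing the leading 
iterator by expected number of matching docs is a separate decision, which is 
the distinction drawn above.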


> Add matchCost() api to TwoPhaseDocIdSetIterator
> -----------------------------------------------
>
>                 Key: LUCENE-6276
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6276
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>
> We could add a method like TwoPhaseDISI.matchCost(), defined as something 
> like an estimate of nanoseconds or similar. 
> ConjunctionScorer could use this method to sort its 'twoPhaseIterators' array 
> so that cheaper ones are called first. Today it has no idea whether one scorer 
> is a simple phrase scorer on a short field and another does some geo 
> calculation or other more expensive work.
> PhraseScorers could implement this based on index statistics (e.g. 
> totalTermFreq/maxDoc).
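
One possible reading of the totalTermFreq/maxDoc suggestion is sketched below: 
treat the average number of positions per document for each phrase term as a 
rough proxy for the work a phrase check does, and sum it over the terms. This is 
an assumption made for illustration, not the formula used by Lucene; the class 
and both method names are hypothetical, and the result is a relative cost, not 
nanoseconds.

```java
class PhraseCostSketch {

    /**
     * Average number of positions per document for one term,
     * computed as totalTermFreq / maxDoc.
     */
    static float avgPositionsPerDoc(long totalTermFreq, int maxDoc) {
        if (maxDoc <= 0) {
            throw new IllegalArgumentException("maxDoc must be positive");
        }
        return (float) totalTermFreq / maxDoc;
    }

    /**
     * Hypothetical phrase match cost: the sum of the per-term averages,
     * on the assumption that a phrase check reads the positions of every term.
     */
    static float phraseMatchCost(long[] totalTermFreqs, int maxDoc) {
        float cost = 0f;
        for (long ttf : totalTermFreqs) {
            cost += avgPositionsPerDoc(ttf, maxDoc);
        }
        return cost;
    }

    public static void main(String[] args) {
        // Two terms in an index of 1000 docs, with made-up statistics.
        System.out.println(phraseMatchCost(new long[] {1000L, 3000L}, 1000));
    }
}
```

Under this estimate a phrase on a short field, where each term has few 
positions per document, comes out cheaper than a phrase on a long field, which 
matches the short-field example in the description.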



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

