Adrien Grand created LUCENE-6172:
------------------------------------

             Summary: Improve the in-order / out-of-order collection decision 
process
                 Key: LUCENE-6172
                 URL: https://issues.apache.org/jira/browse/LUCENE-6172
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Adrien Grand
            Assignee: Adrien Grand
            Priority: Minor
             Fix For: 5.0, Trunk


Today the logic is the following:

 - IndexSearcher checks whether the weight can score out of order
 - Depending on the answer, it creates the appropriate top docs/field collector

I think this has several issues:
 - Only IndexSearcher can actually make the decision correctly, and it only 
works for top docs/field collectors. If you want to build a multi collector in 
order to have both facets and top docs, you have no way of knowing whether you 
should create a top docs collector that supports out-of-order collection
 - It is quite fragile: you need to make sure that Weight.scoresDocsOutOfOrder 
and Weight.bulkScorer agree on when they can score out-of-order. Some queries 
like BooleanQuery duplicate the logic, and other queries like FilteredQuery just 
always return true to avoid complexity. This is inefficient: IndexSearcher 
will create a collector that supports out-of-order collection even though the 
common case actually scores documents in order (leapfrogging between the query 
and the filter).

Instead I would like to take advantage of the new collection API to make 
out-of-order scoring an implementation detail of the bulk scorers. My current 
idea is as follows:
 - remove Weight.scoresDocsOutOfOrder
 - change Collector.getLeafCollector(LeafReaderContext) to 
Collector.getLeafCollector(LeafReaderContext, boolean canScoreOutOfOrder)

This new boolean in Collector.getLeafCollector tells the collector whether the 
scorer is able to score out of order. When it is true, returning a leaf 
collector that supports out-of-order collection lets the scorer use its faster 
out-of-order implementation.
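
To make the collector side of this proposal concrete, here is a minimal sketch 
of a collector that uses the hint to pick its leaf implementation. It does not 
use the real Lucene classes: SketchCollector, SketchLeafCollector and 
BestHitCollector are hypothetical stand-ins, an int docBase replaces 
LeafReaderContext, and the score is passed directly to collect() to keep the 
example short. It also assumes segments are visited in increasing docBase order.

```java
// Hypothetical stand-ins for illustration; these are not the real Lucene types.
interface SketchLeafCollector {
    void collect(int doc, float score);
    boolean acceptsDocsOutOfOrder();
}

interface SketchCollector {
    // Mirrors the proposed getLeafCollector(LeafReaderContext, boolean),
    // with an int docBase standing in for the LeafReaderContext.
    SketchLeafCollector getLeafCollector(int docBase, boolean canScoreOutOfOrder);
}

/** Keeps the single best hit; ties on score go to the smaller doc id. */
final class BestHitCollector implements SketchCollector {
    int bestDoc = -1;
    float bestScore = Float.NEGATIVE_INFINITY;

    @Override
    public SketchLeafCollector getLeafCollector(final int docBase, boolean canScoreOutOfOrder) {
        if (canScoreOutOfOrder) {
            // The scorer may emit docs in any order, so ties must compare doc ids.
            return new SketchLeafCollector() {
                @Override
                public void collect(int doc, float score) {
                    int globalDoc = docBase + doc;
                    if (score > bestScore || (score == bestScore && globalDoc < bestDoc)) {
                        bestScore = score;
                        bestDoc = globalDoc;
                    }
                }

                @Override
                public boolean acceptsDocsOutOfOrder() {
                    return true;
                }
            };
        }
        // Docs arrive in increasing order, so a later doc never wins a tie
        // and the doc id comparison can be skipped.
        return new SketchLeafCollector() {
            @Override
            public void collect(int doc, float score) {
                if (score > bestScore) {
                    bestScore = score;
                    bestDoc = docBase + doc;
                }
            }

            @Override
            public boolean acceptsDocsOutOfOrder() {
                return false;
            }
        };
    }
}

final class BestHitDemo {
    public static void main(String[] args) {
        BestHitCollector collector = new BestHitCollector();
        SketchLeafCollector leaf = collector.getLeafCollector(0, true);
        leaf.collect(7, 1.0f); // out of order: doc 7 arrives before doc 3
        leaf.collect(3, 1.0f);
        System.out.println(collector.bestDoc); // prints 3
    }
}
```

The point of the sketch is that the collector, not IndexSearcher, decides which 
variant to return, so the same decision would also work when the collector is 
wrapped in a multi collector.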

The new logic would be the following. Weights no longer advertise whether they 
support out-of-order scoring. Instead, when a weight knows it can score out of 
order, it passes canScoreOutOfOrder=true when getting the leaf collector. If 
the returned leaf collector accepts documents out of order, the weight returns 
an out-of-order scorer. Otherwise, it returns an in-order scorer.
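
The negotiation described above could look roughly like the sketch below, seen 
from the weight's side. All types here (NegotiatingCollector, 
NegotiatingLeafCollector, SketchWeight, RecordingCollector) are hypothetical 
stand-ins rather than Lucene classes, the LeafReaderContext argument is 
omitted, and sorting the doc ids stands in for falling back to an in-order 
scorer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the real Lucene types.
interface NegotiatingLeafCollector {
    void collect(int doc);
    boolean acceptsDocsOutOfOrder();
}

interface NegotiatingCollector {
    NegotiatingLeafCollector getLeafCollector(boolean canScoreOutOfOrder);
}

final class SketchWeight {
    /**
     * matchesInScorerOrder lists the matching docs in the order a
     * (hypothetical) out-of-order scorer would emit them.
     */
    static void score(int[] matchesInScorerOrder, NegotiatingCollector collector) {
        // This weight knows it can score out of order, so it says so.
        NegotiatingLeafCollector leaf = collector.getLeafCollector(true);
        int[] docs = matchesInScorerOrder.clone();
        if (!leaf.acceptsDocsOutOfOrder()) {
            // The collector declined: fall back to in-order scoring.
            Arrays.sort(docs);
        }
        for (int doc : docs) {
            leaf.collect(doc);
        }
    }
}

/** Records the order in which docs are collected. */
final class RecordingCollector implements NegotiatingCollector {
    final boolean tolerateOutOfOrder;
    final List<Integer> seen = new ArrayList<>();

    RecordingCollector(boolean tolerateOutOfOrder) {
        this.tolerateOutOfOrder = tolerateOutOfOrder;
    }

    @Override
    public NegotiatingLeafCollector getLeafCollector(boolean canScoreOutOfOrder) {
        final boolean outOfOrder = canScoreOutOfOrder && tolerateOutOfOrder;
        return new NegotiatingLeafCollector() {
            @Override
            public void collect(int doc) {
                seen.add(doc);
            }

            @Override
            public boolean acceptsDocsOutOfOrder() {
                return outOfOrder;
            }
        };
    }
}

final class NegotiationDemo {
    public static void main(String[] args) {
        RecordingCollector tolerant = new RecordingCollector(true);
        SketchWeight.score(new int[] {5, 2, 9}, tolerant);
        System.out.println(tolerant.seen); // prints [5, 2, 9]

        RecordingCollector strict = new RecordingCollector(false);
        SketchWeight.score(new int[] {5, 2, 9}, strict);
        System.out.println(strict.seen); // prints [2, 5, 9]
    }
}
```

In this sketch a collector that cannot handle out-of-order documents simply 
ignores the hint, and the weight adapts, so the two sides can no longer 
disagree.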



