Adrien Grand created LUCENE-6172:
------------------------------------
Summary: Improve the in-order / out-of-order collection decision
process
Key: LUCENE-6172
URL: https://issues.apache.org/jira/browse/LUCENE-6172
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
Fix For: 5.0, Trunk
Today the logic is the following:
- IndexSearcher checks whether the weight can score out-of-order
- Depending on the answer, it creates the appropriate top docs/field collector
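As a rough sketch of these two steps (the class and member names below are
hypothetical stand-ins chosen for illustration, not Lucene's actual classes or
signatures; only the role of Weight.scoresDocsOutOfOrder is taken from the
description above):

```java
// Simplified, hypothetical model of the current decision process.
// None of these types are Lucene's real classes.
final class CurrentDecisionSketch {

    // Stand-in for a weight that reports up front whether it may
    // deliver documents out of order.
    interface Weight {
        boolean scoresDocsOutOfOrder();
    }

    // Stand-in for a top docs collector built in one of two flavours.
    static final class TopDocsCollector {
        final boolean acceptsOutOfOrder;

        TopDocsCollector(boolean acceptsOutOfOrder) {
            this.acceptsOutOfOrder = acceptsOutOfOrder;
        }
    }

    // Step 1: ask the weight. Step 2: create the matching collector.
    static TopDocsCollector createCollector(Weight weight) {
        boolean outOfOrder = weight.scoresDocsOutOfOrder();
        return new TopDocsCollector(outOfOrder);
    }

    public static void main(String[] args) {
        Weight inOrderWeight = () -> false;
        Weight alwaysTrueWeight = () -> true; // a weight that always answers true

        System.out.println(createCollector(inOrderWeight).acceptsOutOfOrder);    // false
        System.out.println(createCollector(alwaysTrueWeight).acceptsOutOfOrder); // true
    }
}
```

In this model the decision is made once, by the caller, before any scorer
exists.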
I think this has several issues:
- Only IndexSearcher can actually make the decision correctly, and it only
works for top docs/field collectors. If you want to build a multi collector in
order to have both facets and top docs, then you have no way of knowing whether
you should create a top docs collector that supports out-of-order collection.
- It is quite fragile: you need to make sure that Weight.scoresDocsOutOfOrder
and Weight.bulkScorer agree on when they can score out-of-order. Some queries
like BooleanQuery duplicate the logic, while other queries like FilteredQuery
just always return true to avoid complexity. This is inefficient because
IndexSearcher will then create a collector that supports out-of-order
collection even though the common case actually scores documents in order
(leap frog between the query and the filter).
Instead I would like to take advantage of the new collection API to make
out-of-order scoring an implementation detail of the bulk scorers. My current
idea is as follows:
- remove Weight.scoresDocsOutOfOrder
- change Collector.getLeafCollector(LeafReaderContext) to
Collector.getLeafCollector(LeafReaderContext, boolean canScoreOutOfOrder)
This new boolean in Collector.getLeafCollector tells the collector that the
scorer supports out-of-order scoring. By returning a leaf collector that
supports out-of-order collection, the collector allows the faster out-of-order
scorer to be used.
The new logic would be the following. First, Weights can no longer report
whether they support out-of-order scoring. However, when a weight knows it
supports out-of-order scoring, it will pass canScoreOutOfOrder=true when
getting the leaf collector. If the returned collector accepts documents out of
order, then the weight will return an out-of-order scorer. Otherwise, an
in-order scorer is returned.
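A minimal sketch of this negotiation, under stated assumptions: the types below
are simplified, hypothetical stand-ins rather than Lucene's real API, the
LeafReaderContext argument is left out for brevity, and sorting the doc IDs
merely stands in for "use an in-order scorer".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the proposed negotiation; not Lucene's real classes.
final class OutOfOrderNegotiationSketch {

    // Stand-in leaf collector that records doc IDs in arrival order.
    static final class LeafCollector {
        final boolean acceptsDocsOutOfOrder;
        final List<Integer> collected = new ArrayList<>();

        LeafCollector(boolean acceptsDocsOutOfOrder) {
            this.acceptsDocsOutOfOrder = acceptsDocsOutOfOrder;
        }

        void collect(int doc) {
            collected.add(doc);
        }
    }

    // Proposed shape: the hint tells the collector what the scorer could do.
    interface Collector {
        LeafCollector getLeafCollector(boolean canScoreOutOfOrder);
    }

    // Weight-side logic for a weight that knows it can score out of order.
    // The scorer is chosen only after seeing what the collector returned.
    static LeafCollector score(Collector collector, int[] matchesInScorerOrder) {
        LeafCollector leaf = collector.getLeafCollector(true);
        int[] docs = matchesInScorerOrder.clone();
        if (!leaf.acceptsDocsOutOfOrder) {
            Arrays.sort(docs); // models falling back to an in-order scorer
        }
        for (int doc : docs) {
            leaf.collect(doc);
        }
        return leaf;
    }

    public static void main(String[] args) {
        int[] matches = {5, 2, 9};

        // Honors the hint and accepts out-of-order documents.
        Collector flexible = canOutOfOrder -> new LeafCollector(canOutOfOrder);
        // Ignores the hint and always requires in-order documents.
        Collector strict = canOutOfOrder -> new LeafCollector(false);

        System.out.println(score(flexible, matches).collected); // [5, 2, 9]
        System.out.println(score(strict, matches).collected);   // [2, 5, 9]
    }
}
```

In this model any collector, including one wrapped in a multi collector, can
take part in the decision, since it is made per leaf at the point where the
leaf collector is requested.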