it's not possible to access sub-query's freq information if BooleanScorer is used
--------------------------------------------------------------------------------
Key: LUCENE-2684
URL: https://issues.apache.org/jira/browse/LUCENE-2684
Project: Lucene - Java
Issue Type: Bug
Components: Search
Reporter: Michael McCandless
Fix For: 3.1, 4.0
LUCENE-2590 added an advanced feature, allowing an app to gather all
sub-scorers for any Query.
This is powerful because then, during collection, the app can get some details
about how each sub-query "participated" in the overall match for the given
document.
However, I think this is completely broken if the BooleanQuery uses
BooleanScorer, because that scorer is not doc-at-once. Instead, it batch
processes each sub-scorer over chunks of 2048 sequential docIDs. This is a big
performance gain, but it means that by the time the docs that matched within a
chunk are collected, the sub-scorers have all already advanced to the end of
that 2048-doc chunk, so whatever they report (freq, for example) no longer
corresponds to the doc being collected.
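
To make the positioning problem concrete, here is a toy model in plain Java. It
is not Lucene code: ToySubScorer, docAtOnce and chunked are hypothetical names
invented for this sketch, and only the 2048 window size comes from the
description above. It uses a single sub-scorer and records where that
sub-scorer is positioned each time a hit is collected. As a simplification, the
toy sub-scorer ends up on its first doc beyond the window rather than exactly
at the window's end.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkedScoringDemo {
    static final int CHUNK = 2048;                     // window size named in the issue
    static final int NO_MORE_DOCS = Integer.MAX_VALUE; // sentinel for "exhausted"

    /** Hypothetical stand-in for a sub-scorer over a sorted docID list. */
    static final class ToySubScorer {
        private final int[] docs;
        private int pos = -1;

        ToySubScorer(int... docs) { this.docs = docs; }

        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            if (pos < docs.length) pos++;
            return docID();
        }
    }

    /** Doc-at-once: each hit is collected while the sub-scorer is still on it. */
    static List<int[]> docAtOnce(ToySubScorer sub) {
        List<int[]> seen = new ArrayList<>();
        for (int doc = sub.nextDoc(); doc != NO_MORE_DOCS; doc = sub.nextDoc()) {
            seen.add(new int[] {doc, sub.docID()});   // {collected doc, sub-scorer position}
        }
        return seen;
    }

    /** Chunked: drain the sub-scorer over a whole window first, then collect. */
    static List<int[]> chunked(ToySubScorer sub) {
        List<int[]> seen = new ArrayList<>();
        int doc = sub.nextDoc();
        while (doc != NO_MORE_DOCS) {
            int windowEnd = (doc / CHUNK + 1) * CHUNK; // exclusive end of this window
            List<Integer> buffered = new ArrayList<>();
            while (doc < windowEnd) {                  // phase 1: buffer every hit in the window
                buffered.add(doc);
                doc = sub.nextDoc();
            }
            for (int hit : buffered) {                 // phase 2: collect, sub-scorer has moved on
                seen.add(new int[] {hit, sub.docID()});
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        for (int[] p : chunked(new ToySubScorer(3, 10, 5000))) {
            System.out.println("collected doc " + p[0] + ", sub-scorer at " + p[1]);
        }
    }
}
```

In the doc-at-once loop the two numbers in each pair always agree; in the
chunked loop they never do, which is why anything read from the sub-scorer at
collect time is meaningless in that mode.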
I don't think we can easily fix this... likely the "fix" is to make it
easy(ier) to force BQ to use BooleanScorer2 (which is doc-at-once)? It is
actually possible to force this, today, by having your collector return false
from acceptsDocsOutOfOrder()...
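
A rough sketch of that workaround, again as a self-contained toy rather than
real Lucene code. In the Lucene 3.x API the actual hook is
Collector.acceptsDocsOutOfOrder(); here ToyCollector, InOrderCollector and
search are hypothetical stand-ins, and the driver simply assumes what is
described above: a collector that refuses out-of-order docs gets the
doc-at-once path, any other collector gets the chunked one. For brevity it also
assumes all matches fall inside a single chunk.

```java
import java.util.ArrayList;
import java.util.List;

class InOrderCollectorDemo {
    /** Hypothetical, trimmed-down stand-in for a Lucene collector. */
    interface ToyCollector {
        boolean acceptsDocsOutOfOrder();
        void collect(int doc, int subScorerDoc);
    }

    /** Refuses out-of-order docs and records hits whose sub-scorer had moved on. */
    static final class InOrderCollector implements ToyCollector {
        final List<Integer> misaligned = new ArrayList<>();

        public boolean acceptsDocsOutOfOrder() { return false; }

        public void collect(int doc, int subScorerDoc) {
            if (doc != subScorerDoc) misaligned.add(doc);
        }
    }

    /**
     * Toy driver: returns the name of the scorer the issue says would be used.
     * Assumption for this sketch: all matches fall inside a single chunk.
     */
    static String search(int[] sortedMatches, ToyCollector collector) {
        if (collector.acceptsDocsOutOfOrder()) {
            // Chunked path: the sub-scorer is already exhausted at collect time.
            for (int doc : sortedMatches) collector.collect(doc, Integer.MAX_VALUE);
            return "BooleanScorer";
        }
        // Doc-at-once path: the sub-scorer sits on the doc being collected.
        for (int doc : sortedMatches) collector.collect(doc, doc);
        return "BooleanScorer2";
    }

    public static void main(String[] args) {
        InOrderCollector c = new InOrderCollector();
        String used = search(new int[] {3, 10, 1500}, c);
        System.out.println(used + ", misaligned hits: " + c.misaligned.size());
    }
}
```

The cost of this choice is giving up the chunked scorer's speed for that
search, which is the trade-off the issue is pointing at.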