[
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098565#comment-14098565
]
Da Huang commented on LUCENE-4396:
----------------------------------
Thanks for your suggestions, Mike!
{quote}
I'm worried about how BooleanWeight.bulkScorer first pulls BulkScorer
for the clauses, and then sometimes also pulls Scorer; pulling a
Scorer is not that cheap an operation in general.
{quote}
My current plan is to break out of the first iteration over the weights as
soon as it reaches a required clause.
That way, I'm sure scorers are pulled exactly as many times as they are on
trunk.
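To make the idea concrete, here is a rough sketch of the early exit. It is only
a toy model: Clause, pullBulkScorer() and tryBulkScorers() are made-up names
for illustration, not the real Weight / BooleanClause API, and the counter is
there only to show how many pulls happen before the loop stops.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type and method here is a hypothetical stand-in,
// not the real Lucene API.
final class BreakOnRequiredSketch {

    static final class Clause {
        final String name;
        final boolean required;
        int bulkScorerPulls = 0; // counts how often this clause was asked

        Clause(String name, boolean required) {
            this.name = name;
            this.required = required;
        }

        // Stand-in for pulling a bulk scorer from a clause's weight.
        String pullBulkScorer() {
            bulkScorerPulls++;
            return "bulk(" + name + ")";
        }
    }

    // Returns one bulk scorer per clause, or null as soon as a required
    // clause is seen, so the caller can fall back to the Scorer-based path
    // without pulling anything from the remaining clauses.
    static List<String> tryBulkScorers(List<Clause> clauses) {
        List<String> result = new ArrayList<>();
        for (Clause c : clauses) {
            if (c.required) {
                return null; // break out of the first iteration
            }
            result.add(c.pullBulkScorer());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Clause> clauses = Arrays.asList(
            new Clause("a", false),
            new Clause("b", true),
            new Clause("c", false));
        System.out.println("bulk scorers: " + tryBulkScorers(clauses));
        for (Clause c : clauses) {
            System.out.println(c.name + " pulls: " + c.bulkScorerPulls);
        }
    }
}
```

In this model, nothing is pulled from the required clause or from any clause
after it once the loop gives up.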
{quote}
Maybe if we added .cost() to bulk scorer we could avoid that?
{quote}
I don't think so. When the logic chooses DAAT rather than BS,
it has to fall back to super.bulkScorer(), which pulls all the scorers again.
{quote}
Or maybe we could look at the BulkScorer, and if it's a DefaultBulkScorer,
just ask it for the Scorer it wrapped?
{quote}
This approach might get awkward when it's not a DefaultBulkScorer, but I'm not
sure.
I will give it a try.
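Roughly what I have in mind for the unwrapping, as a sketch under assumptions:
the types below are made-up stand-ins, and getWrappedScorer() is a hypothetical
accessor that DefaultBulkScorer would need to expose. The null branch is the
awkward case, where the caller still has to pull a Scorer the usual way.

```java
// Toy model only: these interfaces and the accessor are hypothetical,
// not the real Lucene classes.
final class UnwrapSketch {

    interface Scorer {}

    interface BulkScorer {}

    static final class DefaultBulkScorer implements BulkScorer {
        private final Scorer wrapped;

        DefaultBulkScorer(Scorer wrapped) {
            this.wrapped = wrapped;
        }

        // Hypothetical accessor for the Scorer this bulk scorer wraps.
        Scorer getWrappedScorer() {
            return wrapped;
        }
    }

    // Returns the wrapped Scorer if the bulk scorer is the default wrapper,
    // or null if it is some other implementation with no Scorer to reuse.
    static Scorer scorerOrNull(BulkScorer bulkScorer) {
        if (bulkScorer instanceof DefaultBulkScorer) {
            return ((DefaultBulkScorer) bulkScorer).getWrappedScorer();
        }
        return null;
    }

    public static void main(String[] args) {
        Scorer scorer = new Scorer() {};
        BulkScorer wrapper = new DefaultBulkScorer(scorer);
        BulkScorer custom = new BulkScorer() {};
        System.out.println("reused: " + (scorerOrNull(wrapper) == scorer));
        System.out.println("custom unwraps to null: " + (scorerOrNull(custom) == null));
    }
}
```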
> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
> Key: LUCENE-4396
> URL: https://issues.apache.org/jira/browse/LUCENE-4396
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks,
> LUCENE-4396-simple.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf,
> luceneutil-score-equal.patch, luceneutil-score-equal.patch,
> merge-simple.perf, merge-simple.png, merge.perf, merge.png, perf.png,
> stat.cpp, stat.cpp, tasks.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared
> to the other clauses, that BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!