[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098565#comment-14098565 ]
Da Huang commented on LUCENE-4396:
----------------------------------

Thanks for your suggestions, Mike!

{quote}
I'm worried about how BooleanWeight.bulkScorer first pulls BulkScorer for the clauses, and then sometimes also pulls Scorer; pulling a Scorer is not that cheap an operation in general.
{quote}
My current plan is to break out of the first iteration over the weights as soon as it reaches a required scorer. That way, scorers are pulled exactly as many times as they are on trunk.

{quote}
Maybe if we added .cost() to bulk scorer we could avoid that?
{quote}
I don't think so. When the logic chooses DAAT rather than BS, it has to fall back to super.bulkScorer() and pull all the scorers again.

{quote}
Or maybe we could look at the BulkScorer, and if it's a DefaultBulkScorer, just ask it for the Scorer it wrapped?
{quote}
This approach could get awkward when the BulkScorer is not a DefaultBulkScorer, but I'm not sure. I will give it a try.

> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks,
> LUCENE-4396-simple.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch,
> LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf,
> luceneutil-score-equal.patch, luceneutil-score-equal.patch,
> merge-simple.perf, merge-simple.png, merge.perf, merge.png, perf.png,
> stat.cpp, stat.cpp, tasks.cpp
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared
> to the other clauses, BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!
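
The heuristic described in the issue (prefer BooleanScorer unless the MUST clauses match far fewer docs than the other clauses) could be sketched roughly as below. This is only an illustration and not code from any of the attached patches: the class ScorerChoiceSketch, the Strategy enum, the RATIO threshold and the per-clause hit-count estimates are all hypothetical names and values, and nothing here is a Lucene API. How the estimates would actually be obtained (firstDocID as a proxy, a cost() method, or something else) is left out on purpose.

```java
/** Hypothetical sketch of the scorer-selection idea; not Lucene code. */
final class ScorerChoiceSketch {

    /** BULK_WINDOW stands in for BooleanScorer, DOC_AT_A_TIME for BooleanScorer2. */
    enum Strategy { BULK_WINDOW, DOC_AT_A_TIME }

    // Assumed threshold: how many times sparser the sparsest MUST clause must be
    // than the densest other clause before doc-at-a-time is preferred.
    static final long RATIO = 8;

    /**
     * @param mustCosts  estimated hit counts of the MUST clauses (assumed non-negative)
     * @param otherCosts estimated hit counts of the remaining clauses (assumed non-negative)
     */
    static Strategy choose(long[] mustCosts, long[] otherCosts) {
        if (mustCosts.length == 0) {
            // No MUST clause: this is the case where BooleanScorer is already used today.
            return Strategy.BULK_WINDOW;
        }
        long minMust = Long.MAX_VALUE;
        for (long c : mustCosts) {
            minMust = Math.min(minMust, c);
        }
        long maxOther = 0;
        for (long c : otherCosts) {
            maxOther = Math.max(maxOther, c);
        }
        // Divide instead of multiplying so that large estimates cannot overflow.
        boolean mustIsVerySparse = minMust < maxOther / RATIO;
        return mustIsVerySparse ? Strategy.DOC_AT_A_TIME : Strategy.BULK_WINDOW;
    }

    public static void main(String[] args) {
        // One rare MUST clause next to a very common SHOULD clause.
        System.out.println(choose(new long[] {10}, new long[] {1_000_000}));
    }
}
```

The threshold of 8 is arbitrary; the issue itself calls the choice of heuristic the challenging part, so any real value would have to come from benchmarks such as the attached .perf results.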