[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970091#comment-13970091 ]
Michael McCandless commented on LUCENE-4396: -------------------------------------------- This looks great! Tests pass for me. Changing from static so you can access requiredNrMatchers seems fine; we could also pass that into the class & save it. Either way ... The comment for REQUIRED_MASK needs fixing. Maybe add a comment where we return null if requiredSubScorer is null explaining why? Hmm that pre-existing comment "// Check if we can and should return a BooleanScorer" is wrong ... I'll fix. Small styling issue: we don't put whitespace after ( and before ), e.g.: {noformat} if ( this.requiredNrMatchers > 0 ) { {noformat} should be this instead: {noformat} if (this.requiredNrMatchers > 0) { {noformat} Maybe change: {noformat} if (current.coord >= minNrShouldMatch + requiredNrMatchers) { {noformat} to: {noformat} if (current.coord - requiredNrMatchers >= minNrShouldMatch) { {noformat} And add a comment saying "minNrShouldMatch only applies to SHOULD clauses", or something? Just to make the math more obvious :) bq. For consistency, I should probably change the argument from "List<Scorer> requiredScorers" to "List<BulkScorer> requiredScorers", but, as a result, getScorer method should be added to BulkScorer. Hmm in general a BulkScorer need not implement a Scorer under the hood (DefaultBulkScorer does, because it wraps, but e.g. BooleanScorer doesn't). Or alternatively if you pass List<BulkScorer> you could handle all the conjunctions yourself in BooleanScorer, e.g. make different collectors for them that add up a "mustClauseCountMatches" counter (instead of setting the 1 bit mask), and then check if that counts is >= requiredScorers.size() before counting it as a hit... > BooleanScorer should sometimes be used for MUST clauses > ------------------------------------------------------- > > Key: LUCENE-4396 > URL: https://issues.apache.org/jira/browse/LUCENE-4396 > Project: Lucene - Core > Issue Type: Improvement > Reporter: Michael McCandless > Attachments: LUCENE-4396.patch > > > Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. > If there is one or more MUST clauses we always use BooleanScorer2. > But I suspect that unless the MUST clauses have very low hit count compared > to the other clauses, that BooleanScorer would perform better than > BooleanScorer2. BooleanScorer still has some vestiges from when it used to > handle MUST so it shouldn't be hard to bring back this capability ... I think > the challenging part might be the heuristics on when to use which (likely we > would have to use firstDocID as proxy for total hit count). > Likely we should also have BooleanScorer sometimes use .advance() on the subs > in this case, eg if suddenly the MUST clause skips 1000000 docs then you want > to .advance() all the SHOULD clauses. > I won't have near term time to work on this so feel free to take it if you > are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org