[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937412#comment-13937412 ]
Da Huang edited comment on LUCENE-4396 at 3/17/14 2:14 AM:
-----------------------------------------------------------

I'm revising and polishing my proposal these days, and I have discovered an interesting thing: if BooleanScorer supports required scorers in the way I have proposed, docIDs would be in ascending order in the bucket table. I think this would let BooleanScorer act as a Not-Top Scorer, since .advance(), .docID(), .nextDoc(), etc. can be implemented.

However, I'm not sure how this would affect performance when it acts as a Not-Top Scorer. When .nextDoc() or .advance() is called, BooleanScorer may compute a whole 2K window, and not all of the data in that window may be useful.

I hope I have made my idea clear.

> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have a very low hit count compared
> to the other clauses, BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, e.g. if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!

--
This message was sent by Atlassian JIRA (v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
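
To make the trade-off in the comment above concrete, here is a minimal sketch of a window-based scorer that exposes docID(), nextDoc() and advance() on top of a bucket table. It is not Lucene's BooleanScorer: the class name, the fields, the use of plain sorted int arrays as stand-ins for one MUST clause and one SHOULD clause, and the "score" (just a count of matching clauses) are all assumptions made for illustration. Only the 2K window size and the three method names come from the discussion.

```java
import java.util.Arrays;

// Hypothetical sketch, not Lucene code: one required (MUST) and one optional
// (SHOULD) clause, each modelled as a sorted array of unique docIDs.
class WindowedScorerSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    static final int WINDOW_SIZE = 2048; // the "2K window" from the comment

    private final int[] required;
    private final int[] optional;
    // Bucket table: one slot per doc in the current window.
    // 0 = no match, otherwise the number of clauses that matched.
    private final int[] buckets = new int[WINDOW_SIZE];
    private int windowBase = -1; // first docID of the current window, -1 if none
    private int doc = -1;
    int windowsFilled = 0;       // counts how often a whole window was computed

    WindowedScorerSketch(int[] required, int[] optional) {
        this.required = required;
        this.optional = optional;
    }

    // Index of the first element >= key in a sorted array of unique values.
    private static int lowerBound(int[] a, int key) {
        int i = Arrays.binarySearch(a, key);
        return i >= 0 ? i : -i - 1;
    }

    // Computes every bucket of the window starting at base, whether or not
    // the caller will ever read it. This is the cost the comment worries about.
    private void fillWindow(int base) {
        Arrays.fill(buckets, 0);
        int end = base + WINDOW_SIZE;
        for (int i = lowerBound(required, base); i < required.length && required[i] < end; i++) {
            buckets[required[i] - base] = 1;
        }
        for (int i = lowerBound(optional, base); i < optional.length && optional[i] < end; i++) {
            int slot = optional[i] - base;
            if (buckets[slot] != 0) {
                buckets[slot]++; // optional clause only adds to docs the required clause matched
            }
        }
        windowBase = base;
        windowsFilled++;
    }

    int docID() {
        return doc;
    }

    // Stand-in for a real score: number of clauses matching the current doc.
    int score() {
        return buckets[doc - windowBase];
    }

    int nextDoc() {
        if (doc == NO_MORE_DOCS) {
            return doc;
        }
        if (windowBase >= 0) {
            // Slots are in ascending docID order, so a linear scan yields the next hit.
            for (int slot = doc - windowBase + 1; slot < WINDOW_SIZE; slot++) {
                if (buckets[slot] != 0) {
                    return doc = windowBase + slot;
                }
            }
            return advance(windowBase + WINDOW_SIZE);
        }
        return advance(0);
    }

    int advance(int target) {
        // The required clause decides which window to load; empty windows are skipped.
        int i = lowerBound(required, target);
        if (i == required.length) {
            return doc = NO_MORE_DOCS;
        }
        int base = required[i] - (required[i] % WINDOW_SIZE);
        if (base != windowBase) {
            fillWindow(base);
        }
        return doc = required[i];
    }
}
```

In this sketch, a caller that iterates over required docIDs {3, 5000, 5001, 100000} fills three full windows to return four hits, so most of the 2048 buckets computed per window are never read. Whether that overhead matters in practice depends on how dense the clauses are, which is the open question raised in the comment.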