[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985254#comment-13985254 ]
Da Huang commented on LUCENE-4396:
----------------------------------

Thanks for your suggestions, Mike. And sorry for my late reply.

{quote}
Hmm, the patch didn't cleanly apply, but I was able to work through it. I think your dev area is not up to date with trunk?
{quote}
I haven't merged my branch with the latest trunk yet, because my school network account's quota for April ran out and I couldn't pull the code from GitHub until 1 May. Sorry for that.

{quote}
Small code style things
{quote}
I'm very sorry for the code style; that's my fault.

{quote}
So it looks like BooleanNovelScorer is able to be a Scorer because the linked-list of visited buckets in one window are guaranteed to be in docID order, because we first visit the requiredConjunctionScorer's docs in that window.
{quote}
Yes, you're right.

{quote}
Have you tested performance when the .advance method here isn't called? Ie, just boolean queries w/ one MUST and one or more SHOULD?
{quote}
No, I haven't. Do you mean the .advance method of the subScorers in BooleanNovelScorer? If so, I will do that.
If you mean the .advance method of BooleanNovelScorer itself, I think the results would be confusing, because BooleanNovelScorer is now used whenever there is at least one MUST clause, whether or not it acts as a top scorer. Therefore, .advance() of BooleanNovelScorer must be called whenever BooleanNovelScorer acts as a non-top scorer.

{quote}
I think the important question here is whether/in what cases the BooleanNovelScorer approach beats BooleanScorer2 performance?
{quote}
Yes, you're right. But BooleanNovelScorer is not completely finished yet, and its performance still needs to be improved, especially in its .advance method.

{quote}
I realized LUCENE-4872 is related here, i.e. we should also sometimes use BooleanScorer for the minShouldMatch>1 case.
{quote}
Yes, I noticed that too. :) I think the two issues should be dealt with together.
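The docID-order point quoted above can be illustrated with a small, self-contained sketch. It is not the code from the attached patch and it does not use any Lucene classes: the class name BucketWindowSketch, the method scoreWindow, the tiny WINDOW_SIZE, and the use of plain sorted int arrays as postings are all assumptions made for illustration. The idea shown is only that, if the required (MUST) docs of a window are visited first, the linked list of visited buckets is already in docID order, and the optional (SHOULD) clauses then only add to buckets that already exist.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene code: one scoring window with buckets
// chained in the order the MUST clause visits them.
final class BucketWindowSketch {
    static final int WINDOW_SIZE = 8; // deliberately tiny; an assumed value

    /**
     * Scores one window [windowStart, windowStart + WINDOW_SIZE).
     * mustDocs and each shouldDocs[i] are assumed sorted and duplicate-free.
     * Returns {docID, score} pairs, where score = 1 for the MUST match
     * plus 1 for every SHOULD clause that also matches.
     */
    static List<int[]> scoreWindow(int[] mustDocs, int[][] shouldDocs, int windowStart) {
        int windowEnd = windowStart + WINDOW_SIZE;
        int[] score = new int[WINDOW_SIZE];
        int[] next = new int[WINDOW_SIZE];       // linked list of visited buckets
        boolean[] valid = new boolean[WINDOW_SIZE];
        int head = -1;
        int tail = -1;

        // Pass 1: visit the required docs first. Because they arrive sorted,
        // appending each bucket to the tail keeps the list in docID order.
        for (int doc : mustDocs) {
            if (doc < windowStart) continue;
            if (doc >= windowEnd) break;
            int slot = doc - windowStart;
            valid[slot] = true;
            score[slot] = 1;
            next[slot] = -1;
            if (head == -1) head = slot; else next[tail] = slot;
            tail = slot;
        }

        // Pass 2: optional clauses never create buckets, they only add to
        // buckets the MUST clause already made valid.
        for (int[] clause : shouldDocs) {
            for (int doc : clause) {
                if (doc < windowStart) continue;
                if (doc >= windowEnd) break;
                int slot = doc - windowStart;
                if (valid[slot]) score[slot]++;
            }
        }

        // Walking the list yields hits in docID order without any sorting.
        List<int[]> hits = new ArrayList<>();
        for (int slot = head; slot != -1; slot = next[slot]) {
            hits.add(new int[] {windowStart + slot, score[slot]});
        }
        return hits;
    }
}
```

For example, calling scoreWindow with MUST docs {1, 3, 6, 9} and SHOULD clauses {0, 3, 5, 6} and {3, 7} at windowStart 0 visits docs 1, 3 and 6 in that order; docs 0, 5 and 7 are dropped because no MUST bucket exists for them.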
> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4396.patch, LUCENE-4396.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared
> to the other clauses, that BooleanScorer would perform better than
> BooleanScorer2. BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ... I think
> the challenging part might be the heuristics on when to use which (likely we
> would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, eg if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!
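The .advance() remark in the issue description can be illustrated in isolation. The sketch below is an assumption-laden toy, not BooleanScorer or the patch: AdvanceSketch, Postings and score are hypothetical names, Postings is a stand-in for a real doc-ID iterator, and the loop is doc-at-a-time rather than windowed. It shows only the skipping itself: when the MUST clause jumps far ahead, each lagging SHOULD clause is moved with a single advance(target) call instead of being stepped through every intermediate doc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Lucene code: SHOULD clauses follow the MUST
// clause with advance() instead of stepping doc by doc.
final class AdvanceSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Minimal iterator over a sorted, duplicate-free doc-ID array. */
    static final class Postings {
        private final int[] docs;
        private int pos = -1;

        Postings(int... docs) { this.docs = docs; }

        /** Current doc, -1 before iteration starts, NO_MORE_DOCS when exhausted. */
        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            pos++;
            return docID();
        }

        /** Moves to the first doc >= target that lies after the current position. */
        int advance(int target) {
            int lo = pos + 1;
            int hi = docs.length;
            while (lo < hi) {                 // binary search stands in for skip data
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            pos = lo;
            return docID();
        }
    }

    /** For every MUST doc, counts the SHOULD clauses that also match it. */
    static List<String> score(Postings must, Postings... shoulds) {
        List<String> out = new ArrayList<>();
        for (int doc = must.nextDoc(); doc != NO_MORE_DOCS; doc = must.nextDoc()) {
            int matches = 0;
            for (Postings should : shoulds) {
                // A lagging SHOULD clause jumps straight to the MUST doc.
                if (should.docID() < doc) should.advance(doc);
                if (should.docID() == doc) matches++;
            }
            out.add(doc + ":" + matches);
        }
        return out;
    }
}
```

With a MUST clause over docs {5, 1000000}, each SHOULD clause is advanced at most once per MUST doc, however many docs lie between 5 and 1000000. Whether this pays off in practice depends on how the cost of advance() compares with sequential stepping, which is the heuristics question the issue raises.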