[ 
https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937412#comment-13937412
 ] 

Da Huang edited comment on LUCENE-4396 at 3/17/14 2:14 AM:
-----------------------------------------------------------

While revising and polishing my proposal, I noticed something interesting: if 
BooleanScorer supports required scorers in the way I have proposed, docIDs 
will be in ascending order in the bucket table. I think this would let 
BooleanScorer act as a non-top-level scorer, since .advance(), .docID(), 
.nextDoc(), etc. could then be implemented. However, I'm not sure how well it 
would perform in that role, because each call to .nextDoc() or .advance() may 
force BooleanScorer to compute a whole 2K window, much of which may never be 
used.
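
To make the trade-off concrete, here is a minimal sketch of the idea, not the 
actual BooleanScorer code. The class WindowedScorerSketch, its int[] inputs, 
matchCount() and the windowsFilled counter are all hypothetical names invented 
for this illustration; the window size of 2048 is taken from the "2K window" 
above. It assumes a single MUST clause, so a doc matches only if that clause 
matches it.

```java
import java.util.Arrays;

/** Hypothetical sketch, not Lucene's BooleanScorer: one MUST clause plus SHOULD clauses. */
class WindowedScorerSketch {
    static final int WINDOW_SIZE = 2048;               // the "2K window" from the comment
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] required;    // sorted docIDs of the MUST clause (assumed input)
    private final int[][] optional;  // sorted docIDs of each SHOULD clause (assumed input)
    private final int[] hits = new int[WINDOW_SIZE];   // slot index = docID - windowBase
    private int windowBase = -1;     // -1 means no window has been filled yet
    private int doc = -1;
    int windowsFilled = 0;           // how many whole windows were computed so far

    WindowedScorerSketch(int[] required, int[][] optional) {
        this.required = required;
        this.optional = optional;
    }

    int docID() { return doc; }

    int nextDoc() { return doc == NO_MORE_DOCS ? doc : advance(doc + 1); }

    /** Moves to the first matching doc >= target, filling a new window if needed. */
    int advance(int target) {
        int i = lowerBound(required, target);
        if (i == required.length) return doc = NO_MORE_DOCS;
        int next = required[i];
        int base = next - (next % WINDOW_SIZE);
        if (base != windowBase) fillWindow(base);       // the potentially wasted work
        return doc = next;
    }

    /** Clauses matching the current doc; only valid while positioned on a doc. */
    int matchCount() { return hits[doc - windowBase]; }

    /** Computes every slot of the window starting at base, used or not. */
    private void fillWindow(int base) {
        Arrays.fill(hits, 0);
        windowBase = base;
        windowsFilled++;
        long end = (long) base + WINDOW_SIZE;
        for (int i = lowerBound(required, base); i < required.length && required[i] < end; i++) {
            hits[required[i] - base] = 1;
        }
        for (int[] opt : optional) {
            for (int i = lowerBound(opt, base); i < opt.length && opt[i] < end; i++) {
                int slot = opt[i] - base;
                if (hits[slot] > 0) hits[slot]++;       // SHOULD only counts where MUST matched
            }
        }
    }

    /** Index of the first element >= target in a sorted array. */
    private static int lowerBound(int[] a, int target) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < target) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

Because a slot index is just docID minus the window base, docs come out in 
ascending order. The cost I am worried about shows up in windowsFilled: a 
caller that uses .advance() to hop across windows pays for a full window each 
time, even if it consumes only one doc from it.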

I hope I have made my idea clear.



> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT 
> clauses. If there are one or more MUST clauses, we always use BooleanScorer2.
> But I suspect that, unless the MUST clauses have a very low hit count 
> compared to the other clauses, BooleanScorer would perform better than 
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to 
> handle MUST, so it shouldn't be hard to bring back this capability ... I think 
> the challenging part might be the heuristics on when to use which (likely we 
> would have to use firstDocID as a proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs 
> in this case, e.g. if the MUST clause suddenly skips 1000000 docs then you 
> want to .advance() all the SHOULD clauses.
> I won't have near-term time to work on this, so feel free to take it if you 
> are inspired!
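
The last point in the description (use .advance() on the SHOULD clauses when 
the MUST clause jumps far ahead) can be sketched roughly as follows. This is 
only an illustration under assumptions: ArrayDocs, advanceTo(), matchCounts() 
and skipThreshold are hypothetical names, not Lucene APIs, and advanceTo() 
stays put when the iterator is already at or past the target, which is 
simpler than the contract of a real scorer's .advance().

```java
/** Hypothetical sketch of letting the MUST clause lead the SHOULD clauses. */
class LeadByMustSketch {

    /** Iterator over a sorted docID array; a stand-in for a sub-scorer. */
    static final class ArrayDocs {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs;
        private int idx = -1;
        int steps = 0;  // number of nextDoc()/advanceTo() moves performed

        ArrayDocs(int... docs) { this.docs = docs; }

        int docID() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        int nextDoc() {
            idx++;
            steps++;
            return docID();
        }

        /** Binary-searches forward to the first doc >= target. */
        int advanceTo(int target) {
            if (docID() >= target) return docID();
            int lo = idx + 1, hi = docs.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            idx = lo;
            steps++;
            return docID();
        }
    }

    /** For each MUST doc, counts the clauses (MUST included) that match it. */
    static int[] matchCounts(int[] mustDocs, ArrayDocs[] shoulds, int skipThreshold) {
        int[] counts = new int[mustDocs.length];
        for (int i = 0; i < mustDocs.length; i++) {
            int target = mustDocs[i];
            counts[i] = 1;  // the MUST clause itself
            for (ArrayDocs s : shoulds) {
                if (target - s.docID() > skipThreshold) {
                    s.advanceTo(target);          // big gap: skip instead of stepping
                } else {
                    while (s.docID() < target) s.nextDoc();  // small gap: step linearly
                }
                if (s.docID() == target) counts[i]++;
            }
        }
        return counts;
    }
}
```

With a MUST clause matching docs 10 and 1000010, a SHOULD clause that also 
has postings at 11, 12, 500, 600 and 700 is stepped linearly up to doc 10 and 
then moved past the rest in a single advanceTo() call rather than visiting 
each of them.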


