[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994363#comment-13994363 ]
Da Huang edited comment on LUCENE-4396 at 5/11/14 12:56 AM:
------------------------------------------------------------

A luceneutil tasks file to test queries like "+a b c d e ...".

The performance results are as follows.

|| Task || QPS baseline || StdDev || QPS my_modified_version || StdDev || Pct diff ||
| HighAndManyLowOr | 8.50 | (3.3%) | 1.72 | (0.3%) | -79.8% ( -80% - -78%) |
| PKLookup | 239.75 | (0.9%) | 239.99 | (0.9%) | 0.1% ( -1% - 1%) |
| LowAndManyHighOr | 7.11 | (1.4%) | 7.76 | (1.4%) | 9.1% ( 6% - 12%) |
| LowAndManyLowOr | 33.83 | (0.7%) | 41.03 | (2.7%) | 21.3% ( 17% - 24%) |
| HighAndManyHighOr | 0.12 | (0.7%) | 0.29 | (7.8%) | 148.0% ( 138% - 157%) |

> BooleanScorer should sometimes be used for MUST clauses
> -------------------------------------------------------
>
>                 Key: LUCENE-4396
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4396
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch
>
>
> Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT.
> If there is one or more MUST clauses we always use BooleanScorer2.
> But I suspect that unless the MUST clauses have very low hit count compared
> to the other clauses, BooleanScorer would perform better than
> BooleanScorer2.  BooleanScorer still has some vestiges from when it used to
> handle MUST so it shouldn't be hard to bring back this capability ...
> I think the challenging part might be the heuristics on when to use which
> (likely we would have to use firstDocID as proxy for total hit count).
> Likely we should also have BooleanScorer sometimes use .advance() on the subs
> in this case, e.g. if suddenly the MUST clause skips 1000000 docs then you want
> to .advance() all the SHOULD clauses.
> I won't have near term time to work on this so feel free to take it if you
> are inspired!
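
To make the .advance() idea from the issue description concrete, here is a minimal sketch of a query shaped like "+a b c", where the MUST clause leads and each SHOULD clause is advanced to the MUST clause's current doc instead of being stepped one doc at a time. It is a toy model over sorted int arrays, not Lucene code and not the attached patch: the class MustLeadsSketch, the Postings iterator, and the method search are all hypothetical names, and the per-doc count of matching SHOULD clauses only stands in for a real score.

```java
import java.util.ArrayList;
import java.util.List;

public class MustLeadsSketch {
    // Sentinel meaning "iterator exhausted" (same convention as Lucene, but defined locally).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical toy postings iterator over a sorted array of doc IDs. */
    static final class Postings {
        private final int[] docs;
        private int pos = -1; // -1 means "not positioned yet"

        Postings(int[] sortedDocs) { this.docs = sortedDocs; }

        int docID() {
            if (pos < 0) return -1;
            return pos >= docs.length ? NO_MORE_DOCS : docs[pos];
        }

        int nextDoc() {
            pos++;
            return docID();
        }

        /** Moves to the first doc >= target, searching only beyond the current position. */
        int advance(int target) {
            int lo = pos + 1, hi = docs.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) lo = mid + 1; else hi = mid;
            }
            pos = lo;
            return docID();
        }
    }

    /**
     * Iterates the single MUST clause and, for every doc it matches, advances each
     * SHOULD clause to that doc. Returns {docID, number of SHOULD clauses matched}.
     */
    static List<int[]> search(int[] must, int[][] should) {
        Postings lead = new Postings(must);
        Postings[] subs = new Postings[should.length];
        for (int i = 0; i < should.length; i++) subs[i] = new Postings(should[i]);

        List<int[]> hits = new ArrayList<>();
        for (int doc = lead.nextDoc(); doc != NO_MORE_DOCS; doc = lead.nextDoc()) {
            int matched = 0;
            for (Postings sub : subs) {
                // If the MUST clause jumped far ahead, skip the SHOULD clause in one step.
                if (sub.docID() < doc) sub.advance(doc);
                if (sub.docID() == doc) matched++;
            }
            hits.add(new int[] {doc, matched});
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] must = {5, 1000000};
        int[][] should = {{1, 2, 5, 7}, {5, 999999, 1000000}, {3}};
        for (int[] hit : search(must, should)) {
            System.out.println("doc=" + hit[0] + " shouldMatches=" + hit[1]);
        }
    }
}
```

When the MUST clause is sparse, the SHOULD clauses never visit the docs between two MUST hits, which is the case the description singles out. The sketch says nothing about the other half of the question, namely the heuristic for choosing between the two scorers.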