[jira] [Commented] (LUCENE-9958) Performance regression when a minimum number of matching SHOULD clauses is required
[ https://issues.apache.org/jira/browse/LUCENE-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347412#comment-17347412 ]

Adrien Grand commented on LUCENE-9958:

To set expectations, some queries might still be slower after this change than they were in older versions. This is because BMW adds some overhead and does not always skip enough documents to offset that overhead. For instance, here is a benchmark of a baseline that doesn't do BMW (by reverting LUCENE-9346) vs. main. Queries with a high number of required SHOULD clauses may still be slower.

{noformat}
Task      QPS baseline    StdDev  QPS patch    StdDev  Pct diff                   p-value
MSM6             85.71    (8.7%)      50.79    (2.3%)    -40.7%  ( -47% -  -32%)    0.000
MSM5             28.38    (6.9%)      23.18    (2.3%)    -18.3%  ( -25% -   -9%)    0.000
MSM7            200.58    (3.9%)     199.28    (3.6%)     -0.7%  (  -7% -    7%)    0.580
MSM1             20.38    (2.7%)      20.55    (2.7%)      0.8%  (  -4% -    6%)    0.351
PKLookup        231.96    (3.6%)     234.75    (3.6%)      1.2%  (  -5% -    8%)    0.292
MSM4              8.48    (6.7%)      20.54    (6.5%)    142.1%  ( 120% -  166%)    0.000
MSM3              2.95    (6.0%)      20.52   (19.9%)    595.8%  ( 537% -  661%)    0.000
MSM2              1.92    (3.6%)      20.59   (27.4%)    970.5%  ( 907% - 1038%)    0.000
{noformat}

> Performance regression when a minimum number of matching SHOULD clauses is required
> ---
>
>                 Key: LUCENE-9958
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9958
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Priority: Minor
>             Fix For: 8.9
>
>
> Opening this issue on behalf of [~mattweber], who reported this at https://discuss.elastic.co/t/es-7-7-1-es-7-12-0-wand-performance-issue/272854.
> It looks like the fact that we introduced dynamic pruning for queries that already have a minimum number of SHOULD clauses configured makes things _slower_, at least in some cases.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9958) Performance regression when a minimum number of matching SHOULD clauses is required
[ https://issues.apache.org/jira/browse/LUCENE-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344599#comment-17344599 ]

Matt Weber commented on LUCENE-9958:

[~jpountz] Wow that was quick! Thank you!
[jira] [Commented] (LUCENE-9958) Performance regression when a minimum number of matching SHOULD clauses is required
[ https://issues.apache.org/jira/browse/LUCENE-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344549#comment-17344549 ]

ASF subversion and git services commented on LUCENE-9958:

Commit d50d5dec62b612b8d603d82d33044cfc97c02d91 in lucene-solr's branch refs/heads/branch_8x from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d50d5de ]

LUCENE-9958: Fixed performance regression for boolean queries that configure a minimum number of matching clauses.
[jira] [Commented] (LUCENE-9958) Performance regression when a minimum number of matching SHOULD clauses is required
[ https://issues.apache.org/jira/browse/LUCENE-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344547#comment-17344547 ]

ASF subversion and git services commented on LUCENE-9958:

Commit 2c04ab58353eb56d254b09ba075ff33e20e9d329 in lucene's branch refs/heads/main from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=2c04ab5 ]

LUCENE-9958: Fixed performance regression for boolean queries that configure a minimum number of matching clauses.
[jira] [Commented] (LUCENE-9958) Performance regression when a minimum number of matching SHOULD clauses is required
[ https://issues.apache.org/jira/browse/LUCENE-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344544#comment-17344544 ]

Adrien Grand commented on LUCENE-9958:

The fix is embarrassingly simple. In short, WANDScorer would only leave scorers behind if the sum of their scores could not be competitive. However, it is also OK to leave {{minShouldMatch-1}} scorers behind regardless of their scores, since there cannot be a hit without at least {{minShouldMatch}} matching scorers.

{code:java}
diff --git a/lucene/core/src/java/org/apache/lucene/search/WANDScorer.java b/lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
index f33af6b8ee8..f5bab49fb71 100644
--- a/lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
+++ b/lucene/core/src/java/org/apache/lucene/search/WANDScorer.java
@@ -548,7 +548,7 @@ final class WANDScorer extends Scorer {
   /** Insert an entry in 'tail' and evict the least-costly scorer if full. */
   private DisiWrapper insertTailWithOverFlow(DisiWrapper s) {
-    if (tailMaxScore + s.maxScore < minCompetitiveScore) {
+    if (tailMaxScore + s.maxScore < minCompetitiveScore || tailSize + 1 < minShouldMatch) {
       // we have free room for this new entry
       addTail(s);
       tailMaxScore += s.maxScore;
{code}

Here are updated results from luceneutil, where baseline is origin/main and the patch is the above one-line change:

{noformat}
Task      QPS baseline    StdDev  QPS patch    StdDev  Pct diff                   p-value
PKLookup        248.11    (4.1%)     235.92    (4.0%)     -4.9%  ( -12% -    3%)    0.000
MSM7            203.98    (3.0%)     199.78    (4.2%)     -2.1%  (  -8% -    5%)    0.075
MSM3             20.09    (3.0%)      20.34    (3.2%)      1.2%  (  -4% -    7%)    0.212
MSM1             20.15    (2.9%)      20.44    (3.5%)      1.4%  (  -4% -    8%)    0.162
MSM2             20.14    (3.0%)      20.44    (3.4%)      1.5%  (  -4% -    8%)    0.141
MSM4             18.93    (3.0%)      20.41    (3.7%)      7.8%  (   1% -   14%)    0.000
MSM5              5.11    (4.7%)      23.01   (17.2%)    350.1%  ( 313% -  390%)    0.000
MSM6              2.32    (5.2%)      50.64   (92.0%)   2086.0%  (1889% - 2304%)    0.000
{noformat}

As we would usually expect, QPS now goes up as the minimum number of required clauses increases.
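The condition change in the patch above can be illustrated in isolation. This is only a sketch, not the actual WANDScorer code: the class `TailAdmission` and its two methods are hypothetical names introduced here, and the parameters are plain floats and ints for illustration (the real fields may use different types). It shows how the added `tailSize + 1 < minShouldMatch` clause lets a scorer stay in the tail even when the score bound alone would not.

```java
/**
 * Hypothetical stand-in for the check in WANDScorer#insertTailWithOverFlow.
 * Only the two boolean conditions mirror the patch; everything else is
 * simplified for illustration.
 */
final class TailAdmission {

  /** Condition before the patch: only the score bound lets a scorer stay behind. */
  static boolean hasFreeRoomBefore(float tailMaxScore, float maxScore, float minCompetitiveScore) {
    return tailMaxScore + maxScore < minCompetitiveScore;
  }

  /**
   * Condition after the patch: a scorer may also stay behind while the tail
   * would still hold fewer than minShouldMatch scorers, whatever its score.
   */
  static boolean hasFreeRoomAfter(
      float tailMaxScore, float maxScore, float minCompetitiveScore,
      int tailSize, int minShouldMatch) {
    return tailMaxScore + maxScore < minCompetitiveScore || tailSize + 1 < minShouldMatch;
  }

  public static void main(String[] args) {
    // Assumed scenario: no competitive score is known yet (0), the tail is
    // empty, and the query requires 6 matching SHOULD clauses.
    float minCompetitiveScore = 0f;
    float tailMaxScore = 0f;
    float maxScore = 1.5f;

    // The score bound alone refuses to leave the scorer behind.
    System.out.println(hasFreeRoomBefore(tailMaxScore, maxScore, minCompetitiveScore));
    // The patched condition accepts it, since 0 + 1 < 6.
    System.out.println(hasFreeRoomAfter(tailMaxScore, maxScore, minCompetitiveScore, 0, 6));
    // Once the tail already holds minShouldMatch - 1 scorers, the new clause no longer applies.
    System.out.println(hasFreeRoomAfter(tailMaxScore, maxScore, minCompetitiveScore, 5, 6));
  }
}
```

With `minShouldMatch` equal to 1 the added clause can never be true, so in this sketch the patched condition reduces to the original one.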
[jira] [Commented] (LUCENE-9958) Performance regression when a minimum number of matching SHOULD clauses is required
[ https://issues.apache.org/jira/browse/LUCENE-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344521#comment-17344521 ]

Adrien Grand commented on LUCENE-9958:

The good news is that it's easy to reproduce. Using the following tasks file

{noformat}
MSM1: ref http from mostly interview 9 hard
MSM2: ref http from mostly interview 9 hard +minShouldMatch=2
MSM3: ref http from mostly interview 9 hard +minShouldMatch=3
MSM4: ref http from mostly interview 9 hard +minShouldMatch=4
MSM5: ref http from mostly interview 9 hard +minShouldMatch=5
MSM6: ref http from mostly interview 9 hard +minShouldMatch=6
MSM7: ref http from mostly interview 9 hard +minShouldMatch=7
{noformat}

I got the following results on wikimedium10m, where baseline is origin/main and the patch reverts LUCENE-9346:

{noformat}
Task      QPS baseline    StdDev  QPS patch    StdDev  Pct diff                   p-value
PKLookup        248.06    (3.6%)     231.47    (4.3%)     -6.7%  ( -14% -    1%)    0.000
MSM7            182.44    (3.8%)     181.65    (3.4%)     -0.4%  (  -7% -    7%)    0.704
MSM1             19.52    (4.4%)      20.31    (3.8%)      4.1%  (  -4% -   12%)    0.002
MSM2              3.27    (3.4%)       4.20    (2.9%)     28.4%  (  21% -   35%)    0.000
MSM3              3.09    (4.6%)       6.95    (4.9%)    125.0%  ( 110% -  141%)    0.000
MSM4              2.29    (5.7%)       9.85   (15.2%)    329.9%  ( 292% -  371%)    0.000
MSM5              2.20    (5.8%)      29.48   (56.8%)   1240.2%  (1113% - 1382%)    0.000
MSM6              2.21    (5.8%)      88.95  (223.7%)   3929.4%  (3497% - 4414%)    0.000
{noformat}
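As a rough aid for reading the results in the comment above, the sketch below recomputes a "Pct diff" value from the two QPS columns. It assumes the column is the relative change of patch QPS over baseline QPS; the class `PctDiff` and its method are hypothetical names, and because the table shows rounded QPS values, recomputed figures may differ from the reported ones in the last digit.

```java
import java.util.Locale;

/** Hypothetical helper for sanity-checking the "Pct diff" column of the tables. */
final class PctDiff {

  /** Relative change from baseline QPS to patch QPS, in percent (assumed definition). */
  static double pctDiff(double baselineQps, double patchQps) {
    return 100.0 * (patchQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // MSM2 row of the reproduction run: baseline 3.27 QPS, patch 4.20 QPS.
    System.out.printf(Locale.ROOT, "MSM2: %.1f%%%n", pctDiff(3.27, 4.20));
  }
}
```

For the MSM2 row this gives roughly 28.4%, in line with the reported value.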