[ https://issues.apache.org/jira/browse/LUCENE-8796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16852876#comment-16852876 ]
Luca Cavanna edited comment on LUCENE-8796 at 5/31/19 10:08 AM:
----------------------------------------------------------------

I updated the PR and addressed all the comments, here are the latest benchmark results (with bitset optimization disabled on both ends):

{noformat}
Report after iter 19:
Task                    QPS baseline    StdDev  QPS my_modified_version    StdDev  Pct diff
MedTerm                      1510.74    (6.8%)                  1457.20    (8.4%)     -3.5%  (-17% - 12%)
Fuzzy1                         70.49    (8.5%)                    68.11    (9.8%)     -3.4%  (-19% - 16%)
OrHighNotMed                  650.57    (5.8%)                   629.81    (6.0%)     -3.2%  (-14% - 9%)
OrHighLow                     447.13    (4.2%)                   433.05    (4.5%)     -3.2%  (-11% - 5%)
OrNotHighMed                  623.22    (6.3%)                   605.19    (6.1%)     -2.9%  (-14% - 10%)
OrHighNotLow                  720.89    (7.0%)                   701.26    (7.9%)     -2.7%  (-16% - 13%)
OrNotHighHigh                 558.43    (6.3%)                   544.82    (4.9%)     -2.4%  (-12% - 9%)
LowTerm                      1279.34    (4.9%)                  1248.60    (5.2%)     -2.4%  (-11% - 8%)
AndHighLow                    690.75    (4.0%)                   675.22    (5.3%)     -2.2%  (-11% - 7%)
LowPhrase                     358.90    (2.3%)                   351.28    (4.0%)     -2.1%  (-8% - 4%)
PKLookup                      139.97    (3.0%)                   137.32    (3.5%)     -1.9%  (-8% - 4%)
OrNotHighLow                  728.48    (6.8%)                   714.79    (6.5%)     -1.9%  (-14% - 12%)
HighTerm                     1222.38    (6.3%)                  1199.77    (7.1%)     -1.8%  (-14% - 12%)
AndHighHigh                    58.93    (6.2%)                    58.01    (5.8%)     -1.6%  (-12% - 11%)
Prefix3                       152.21    (4.5%)                   150.00    (5.0%)     -1.5%  (-10% - 8%)
IntNRQConjMedTerm              79.15   (10.7%)                    78.06   (10.5%)     -1.4%  (-20% - 22%)
HighTermDayOfYearSort          95.28    (5.1%)                    94.10    (7.8%)     -1.2%  (-13% - 12%)
Wildcard                       64.23    (2.3%)                    63.45    (2.3%)     -1.2%  (-5% - 3%)
MedSpanNear                    81.15    (2.2%)                    80.19    (2.8%)     -1.2%  (-6% - 3%)
HighSpanNear                   10.20    (3.9%)                    10.08    (4.2%)     -1.2%  (-8% - 7%)
HighIntervalsOrdered            4.07    (1.8%)                     4.03    (2.2%)     -1.1%  (-4% - 2%)
LowSpanNear                    41.62    (3.1%)                    41.20    (3.6%)     -1.0%  (-7% - 5%)
IntNRQConjLowTerm              20.36    (4.1%)                    20.15    (4.5%)     -1.0%  (-9% - 7%)
IntNRQConjHighTerm             64.84    (9.6%)                    64.21    (9.4%)     -1.0%  (-18% - 19%)
AndHighMed                    229.08    (2.8%)                   227.00    (2.5%)     -0.9%  (-6% - 4%)
MedPhrase                      18.73    (1.5%)                    18.57    (2.3%)     -0.8%  (-4% - 2%)
LowSloppyPhrase               124.52    (2.3%)                   123.48    (2.6%)     -0.8%  (-5% - 4%)
Respell                        69.26    (3.0%)                    68.68    (2.9%)     -0.8%  (-6% - 5%)
HighPhrase                     12.98    (1.6%)                    12.88    (2.2%)     -0.7%  (-4% - 3%)
PrefixConjLowTerm              42.11    (2.6%)                    41.81    (3.0%)     -0.7%  (-6% - 5%)
OrHighNotHigh                 680.34    (6.1%)                   676.16    (7.6%)     -0.6%  (-13% - 13%)
MedSloppyPhrase                34.06    (4.9%)                    33.89    (4.5%)     -0.5%  (-9% - 9%)
IntNRQ                         89.97   (12.4%)                    89.62   (12.0%)     -0.4%  (-22% - 27%)
HighSloppyPhrase                8.28    (4.0%)                     8.25    (3.9%)     -0.3%  (-7% - 7%)
WildcardConjLowTerm            36.35    (2.7%)                    36.26    (2.7%)     -0.3%  (-5% - 5%)
OrHighHigh                     27.89    (2.6%)                    27.85    (3.1%)     -0.1%  (-5% - 5%)
Fuzzy2                         44.19    (3.8%)                    44.17    (3.1%)     -0.1%  (-6% - 7%)
OrHighMed                      90.42    (2.8%)                    90.57    (2.8%)      0.2%  (-5% - 6%)
PrefixConjMedTerm              45.56    (2.8%)                    45.79    (2.9%)      0.5%  (-5% - 6%)
WildcardConjHighTerm           33.08    (2.6%)                    33.47    (3.0%)      1.2%  (-4% - 6%)
PrefixConjHighTerm             83.65    (2.6%)                    86.23    (3.7%)      3.1%  (-3% - 9%)
HighTermMonthSort             130.35   (15.8%)                   135.08   (12.1%)      3.6%  (-20% - 37%)
WildcardConjMedTerm            99.19    (3.6%)                   103.37    (4.1%)      4.2%  (-3% - 12%)
{noformat}

was (Author: lucacavanna):
I updated the PR and addressed all the comments, here are the latest benchmark results:
(benchmark table identical to the one above)

> Use exponential search in IntArrayDocIdSet advance method
> ---------------------------------------------------------
>
>                 Key: LUCENE-8796
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8796
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Luca Cavanna
>            Priority: Minor
>
> Chatting with [~jpountz],
> he suggested improving IntArrayDocIdSet by making
> its advance method use exponential search instead of binary search. This
> should help the performance of queries that include conjunctions: because
> ConjunctionDISI uses leapfrog, it advances through doc IDs in small steps,
> so exponential search should on average be faster than binary search when
> advancing.
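
For readers unfamiliar with the technique named in the issue, here is a minimal sketch of exponential (galloping) search over a sorted int array. It is not the code from the PR: the class name ExponentialSearchSketch, the method advanceIndex, and its parameters are hypothetical and chosen only for illustration. The idea is to double a step size from the current position until the target is overshot, then binary-search only inside the last bracket, so the cost grows with the distance to the target rather than with the array size.

```java
// Illustrative sketch only; names are hypothetical and this is not Lucene's code.
final class ExponentialSearchSketch {

    /**
     * Returns the index of the first element in docs[from, length) that is
     * greater than or equal to target, or length if there is no such element.
     * Assumes docs[0, length) is sorted in ascending order.
     */
    static int advanceIndex(int[] docs, int length, int from, int target) {
        if (from >= length) {
            return length;
        }
        if (docs[from] >= target) {
            return from;
        }
        // Invariant from here on: docs[lo] < target.
        int lo = from;
        // Use a long step so that doubling cannot overflow on very large arrays.
        long bound = 1;
        // Gallop: probe from+1, from+2, from+4, ... until we overshoot or run out.
        while (from + bound < length && docs[(int) (from + bound)] < target) {
            lo = (int) (from + bound);
            bound <<= 1;
        }
        // The answer lies in (lo, right]; right is either a probed index whose
        // value is >= target, or length when the gallop ran off the end.
        int left = lo + 1;
        int right = (int) Math.min(from + bound, (long) length);
        // Plain lower-bound binary search restricted to the bracket.
        while (left < right) {
            int mid = (left + right) >>> 1;
            if (docs[mid] < target) {
                left = mid + 1;
            } else {
                right = mid;
            }
        }
        return left;
    }

    public static void main(String[] args) {
        int[] docs = {1, 3, 4, 8, 15, 16, 23, 42};
        // Advance from index 2 to the first doc ID >= 16.
        System.out.println(advanceIndex(docs, docs.length, 2, 16));
    }
}
```

When the target is d positions ahead, the galloping phase takes about log2(d) probes and the final binary search covers a bracket of at most d elements, which is why short leapfrog-style advances are expected to benefit. The benchmark numbers above remain the only evidence about the actual effect.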