[GitHub] [lucene] jpountz commented on pull request #12490: Reduce the overhead of ImpactsDISI.

via GitHub Sat, 05 Aug 2023 13:27:50 -0700


jpountz commented on PR #12490:
URL: https://github.com/apache/lucene/pull/12490#issuecomment-1666598365


   Opened this PR as a draft to get feedback on the API (if any). Existing
tests pass, but I plan to add more tests before merging. Here are the results
of this PR on wikimedium10m. Top-k queries on disjunctions and conjunctions
(AndHighMed, AndHighHigh, OrHighLow, OrHighMed, OrHighHigh) get a significant
boost from removing this overhead, about 17% to 25% higher QPS.
   
   ```
                    Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                 Pct diff  p-value
            OrHighNotMed        416.61    (4.8%)                   399.18    (5.6%)    -4.2% ( -13% -    6%)    0.074
            OrHighNotLow        534.35    (4.8%)                   512.73    (5.7%)    -4.0% ( -13% -    6%)    0.087
           OrHighNotHigh        340.89    (4.4%)                   327.66    (5.1%)    -3.9% ( -12% -    5%)    0.069
           OrNotHighHigh        395.37    (3.9%)                   380.50    (4.7%)    -3.8% ( -11% -    5%)    0.051
            OrNotHighMed        511.30    (1.9%)                   497.11    (3.7%)    -2.8% (  -8% -    2%)    0.034
                HighTerm        597.37    (6.3%)                   580.82    (4.1%)    -2.8% ( -12% -    8%)    0.244
                 MedTerm        802.79    (5.3%)                   783.67    (3.4%)    -2.4% ( -10% -    6%)    0.235
                 LowTerm       1092.20    (4.6%)                  1069.37    (4.6%)    -2.1% ( -10% -    7%)    0.306
                  IntNRQ         61.95   (14.7%)                    61.33   (16.1%)    -1.0% ( -27% -   34%)    0.883
       HighTermMonthSort       5037.79    (3.0%)                  4992.75    (2.8%)    -0.9% (  -6% -    5%)    0.494
     MedIntervalsOrdered         23.71    (3.8%)                    23.55    (7.2%)    -0.7% ( -11% -   10%)    0.793
                PKLookup        247.34    (3.2%)                   245.97    (3.0%)    -0.6% (  -6% -    5%)    0.688
       HighTermTitleSort        177.34    (4.9%)                   176.39    (3.9%)    -0.5% (  -8% -    8%)    0.785
     LowIntervalsOrdered         35.58    (3.2%)                    35.43    (5.2%)    -0.4% (  -8% -    8%)    0.829
            OrNotHighLow       1495.12    (2.1%)                  1489.60    (2.4%)    -0.4% (  -4% -    4%)    0.718
   HighTermDayOfYearSort        407.38    (1.4%)                   405.96    (1.1%)    -0.4% (  -2% -    2%)    0.537
               LowPhrase         68.81    (2.4%)                    68.66    (1.3%)    -0.2% (  -3% -    3%)    0.796
                 Respell         85.14    (1.2%)                    85.05    (2.1%)    -0.1% (  -3% -    3%)    0.884
             MedSpanNear         28.29    (2.6%)                    28.27    (3.3%)    -0.1% (  -5% -    5%)    0.944
                Wildcard        157.04    (3.0%)                   157.00    (3.5%)    -0.0% (  -6% -    6%)    0.987
              TermDTSort        194.42    (2.2%)                   194.53    (1.1%)     0.1% (  -3% -    3%)    0.943
             LowSpanNear         84.26    (2.5%)                    84.32    (3.0%)     0.1% (  -5% -    5%)    0.950
              HighPhrase         42.16    (3.0%)                    42.23    (2.2%)     0.2% (  -4% -    5%)    0.892
               MedPhrase        132.52    (2.5%)                   132.78    (1.4%)     0.2% (  -3% -    4%)    0.827
            HighSpanNear         21.37    (3.7%)                    21.44    (4.5%)     0.3% (  -7% -    8%)    0.857
         LowSloppyPhrase         64.26    (2.1%)                    64.62    (2.4%)     0.6% (  -3% -    5%)    0.575
    HighTermTitleBDVSort         21.59    (1.6%)                    21.79    (2.3%)     0.9% (  -2% -    4%)    0.281
                 Prefix3        299.62    (2.8%)                   302.58    (3.9%)     1.0% (  -5% -    7%)    0.517
         MedSloppyPhrase         73.65    (3.0%)                    74.38    (3.7%)     1.0% (  -5% -    7%)    0.503
    HighIntervalsOrdered          6.22    (4.5%)                     6.30    (4.7%)     1.2% (  -7% -   10%)    0.565
        HighSloppyPhrase         11.09    (3.7%)                    11.22    (3.4%)     1.2% (  -5% -    8%)    0.449
              AndHighLow       1246.88    (2.8%)                  1262.41    (2.7%)     1.2% (  -4% -    6%)    0.311
                  Fuzzy2         86.08    (1.1%)                    87.67    (1.2%)     1.9% (   0% -    4%)    0.000
                  Fuzzy1        121.95    (1.0%)                   124.27    (1.3%)     1.9% (   0% -    4%)    0.000
              AndHighMed        199.47    (5.3%)                   234.03    (3.0%)    17.3% (   8% -   27%)    0.000
               OrHighLow        413.43    (7.0%)                   485.29    (4.6%)    17.4% (   5% -   31%)    0.000
               OrHighMed        191.17    (5.6%)                   225.07    (3.7%)    17.7% (   8% -   28%)    0.000
             AndHighHigh         92.26    (5.6%)                   108.84    (3.2%)    18.0% (   8% -   28%)    0.000
              OrHighHigh         59.24    (9.2%)                    73.86    (6.9%)    24.7% (   7% -   44%)    0.000
   ```
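
For readers checking the table, the "Pct diff" column appears to be the relative change of the mean QPS of `my_modified_version` over the baseline. The sketch below is not part of the benchmark tooling; the helper name `pct_diff` is made up for illustration, and the formula is an assumption that happens to reproduce the rows shown. It does not attempt to reproduce the parenthesized range or the p-value.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative change of the candidate's mean QPS over the baseline, in percent.

    Assumption: this is how the "Pct diff" column is derived; it matches the
    rows copied below but is not taken from the benchmark tool's source.
    """
    return 100.0 * (candidate_qps - baseline_qps) / baseline_qps


# (baseline QPS, my_modified_version QPS), copied from the table
rows = {
    "OrHighNotMed": (416.61, 399.18),
    "AndHighMed": (199.47, 234.03),
    "OrHighHigh": (59.24, 73.86),
}

for task, (base, cand) in rows.items():
    print(f"{task:>14}  {pct_diff(base, cand):+.1f}%")
```

Rounded to one decimal place, these values agree with the corresponding "Pct diff" entries in the table.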


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


