jpountz opened a new pull request, #13941:
URL: https://github.com/apache/lucene/pull/13941
`MaxScoreBulkScorer` sometimes computes windows that contain few candidate
matches, so more time is spent evaluating maximum scores for the window than
evaluating the candidate matches in it.
This PR introduces a heuristic that tries to require at least 32 candidate
matches per clause per window to amortize the per-window overhead. This results
in a speedup for the `OrMany` task.
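
A minimal sketch of how such a heuristic could look, for illustration only. This is not the code from the PR: the class `WindowSizeHeuristic`, its methods `update` and `windowMax`, the cap `MAX_MIN_WINDOW_SIZE`, and the double-or-reset strategy are all assumptions. Only the target of 32 candidate matches per clause per window comes from the description above.

```java
/**
 * Hypothetical illustration of a "minimum candidates per window" heuristic.
 * Not Lucene's implementation; every name here is an assumption.
 */
class WindowSizeHeuristic {
    /** Target taken from the PR description: 32 candidates per clause per window. */
    static final int TARGET_CANDIDATES_PER_CLAUSE = 32;

    /** Assumed upper bound on the enforced minimum window size. */
    static final int MAX_MIN_WINDOW_SIZE = 1 << 16;

    /** Minimum number of doc IDs the next window must span. */
    private int minWindowSize = 1;

    /**
     * Adjusts the minimum window size after a window has been scored.
     * If the window produced fewer candidates than the target, the minimum
     * size doubles (up to the cap); otherwise it resets to 1.
     */
    int update(int numClauses, int candidatesInLastWindow) {
        long target = (long) TARGET_CANDIDATES_PER_CLAUSE * numClauses;
        if (candidatesInLastWindow < target) {
            minWindowSize = Math.min(minWindowSize << 1, MAX_MIN_WINDOW_SIZE);
        } else {
            minWindowSize = 1;
        }
        return minWindowSize;
    }

    /**
     * Extends a proposed exclusive upper bound so that the window starting at
     * windowMin spans at least minWindowSize doc IDs, without int overflow.
     */
    int windowMax(int windowMin, int proposedWindowMax) {
        long minMax = (long) windowMin + minWindowSize;
        return (int) Math.min(Integer.MAX_VALUE, Math.max((long) proposedWindowMax, minMax));
    }
}
```

In this sketch, a query with 4 clauses targets 128 candidates per window. A window that yields only 10 candidates makes the next window span at least 2 doc IDs, then 4, and so on, until a window meets the target and the minimum resets.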
```
Task                   QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                 p-value
OrHighLow                    830.99   (2.8%)                   821.55   (2.0%)     -1.1%  ( -5% -   3%)  0.236
CountAndHighMed              149.53   (3.2%)                   148.06   (1.8%)     -1.0%  ( -5% -   4%)  0.335
CountAndHighHigh              49.23   (3.3%)                    48.85   (2.1%)     -0.8%  ( -6% -   4%)  0.483
OrHighRare                   277.29   (5.9%)                   275.20   (5.1%)     -0.8%  (-11% -  10%)  0.728
LowTerm                     1006.28   (2.7%)                   999.28   (2.7%)     -0.7%  ( -5% -   4%)  0.512
OrHighNotMed                 461.91   (2.0%)                   459.09   (3.1%)     -0.6%  ( -5% -   4%)  0.556
AndHighMed                   205.48   (2.0%)                   204.44   (2.2%)     -0.5%  ( -4% -   3%)  0.547
HighTermTitleBDVSort          20.30   (4.4%)                    20.22   (4.0%)     -0.4%  ( -8% -   8%)  0.798
OrHighNotLow                 483.66   (2.2%)                   481.97   (4.3%)     -0.3%  ( -6% -   6%)  0.794
OrNotHighHigh                283.34   (2.3%)                   282.47   (2.0%)     -0.3%  ( -4% -   4%)  0.714
OrNotHighLow                1058.78   (3.5%)                  1055.94   (2.6%)     -0.3%  ( -6% -   6%)  0.826
AndHighHigh                   78.53   (1.8%)                    78.33   (1.9%)     -0.3%  ( -3% -   3%)  0.721
OrHighHigh                    77.35   (1.6%)                    77.23   (1.6%)     -0.2%  ( -3% -   3%)  0.812
OrNotHighMed                 314.20   (2.9%)                   313.96   (2.7%)     -0.1%  ( -5% -   5%)  0.944
And2Terms2StopWords          155.15   (2.9%)                   155.07   (1.8%)     -0.0%  ( -4% -   4%)  0.961
OrHighNotHigh                285.50   (2.5%)                   285.63   (1.8%)      0.0%  ( -4% -   4%)  0.958
CountOrHighMed               104.73   (1.6%)                   104.95   (1.6%)      0.2%  ( -2% -   3%)  0.744
And3Terms                    167.95   (3.2%)                   168.63   (2.6%)      0.4%  ( -5% -   6%)  0.729
IntNRQ                        90.83   (4.7%)                    91.26  (14.9%)      0.5%  (-18% -  21%)  0.913
OrHighMed                    200.80   (2.1%)                   201.78   (1.7%)      0.5%  ( -3% -   4%)  0.511
HighTermTitleSort            149.37   (2.5%)                   150.20   (2.0%)      0.6%  ( -3% -   5%)  0.528
CountOrHighHigh               49.93   (1.4%)                    50.24   (1.5%)      0.6%  ( -2% -   3%)  0.270
AndHighLow                  1079.98   (2.6%)                  1086.73   (3.6%)      0.6%  ( -5% -   7%)  0.613
Or2Terms2StopWords           158.09   (4.1%)                   159.09   (2.4%)      0.6%  ( -5% -   7%)  0.630
HighTerm                     515.68   (2.2%)                   519.07   (2.6%)      0.7%  ( -4% -   5%)  0.490
HighTermMonthSort           3222.57   (3.4%)                  3244.84   (2.9%)      0.7%  ( -5% -   7%)  0.576
MedTerm                      582.99   (2.5%)                   587.15   (2.5%)      0.7%  ( -4% -   5%)  0.468
Wildcard                      82.76   (4.3%)                    83.45   (3.8%)      0.8%  ( -6% -   9%)  0.599
AndStopWords                  30.49   (4.7%)                    30.77   (2.4%)      0.9%  ( -5% -   8%)  0.537
HighTermDayOfYearSort        813.54   (3.4%)                   821.97   (2.1%)      1.0%  ( -4% -   6%)  0.355
PKLookup                     272.42   (2.7%)                   275.38   (2.5%)      1.1%  ( -4% -   6%)  0.288
Or3Terms                     166.90   (4.3%)                   168.77   (2.7%)      1.1%  ( -5% -   8%)  0.424
OrStopWords                   33.64   (6.5%)                    34.29   (3.2%)      1.9%  ( -7% -  12%)  0.335
TermDTSort                   344.04   (6.6%)                   351.30   (5.3%)      2.1%  ( -9% -  15%)  0.371
Prefix3                      123.31   (3.5%)                   126.03   (6.6%)      2.2%  ( -7% -  12%)  0.286
CountTerm                   8267.89   (4.4%)                  8628.08   (4.7%)      4.4%  ( -4% -  14%)  0.014
OrMany                        13.25   (3.7%)                    18.87   (3.7%)     42.4%  ( 33% -  51%)  0.000
```