HUSTERGS opened a new pull request, #15058:
URL: https://github.com/apache/lucene/pull/15058
### Description
This PR proposes sorting all DISIs before the actual scoring starts. Since the
competitive iterator is an `UpdateableDocIdSetIterator` under the hood, its
cost can change while scoring is in progress. This change yields a small
speedup on the `TermDayOfYearSort` task (though not on any of the other sort
tasks :( ) and does not cause any obvious slowdown elsewhere:
```
Task                        QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff                p-value
OrHighHigh                         21.44  (2.4%)                    21.04  (8.4%)     -1.8%  (-12% -  9%)    0.345
OrHighMed                          64.68  (4.6%)                    63.68  (8.1%)     -1.5%  (-13% - 11%)    0.462
DismaxOrHighMed                    48.46  (3.9%)                    47.73  (5.6%)     -1.5%  (-10% -  8%)    0.317
DismaxOrHighHigh                   34.71  (3.5%)                    34.27  (4.9%)     -1.3%  ( -9% -  7%)    0.342
OrStopWords                         9.02  (2.3%)                     8.92  (6.8%)     -1.1%  ( -9% -  8%)    0.487
CountAndHighHigh                   48.53  (2.4%)                    48.00  (2.3%)     -1.1%  ( -5% -  3%)    0.140
Or3Terms                           62.84  (4.4%)                    62.18  (7.0%)     -1.0%  (-11% - 10%)    0.572
OrHighRare                         92.66  (7.3%)                    91.72  (8.3%)     -1.0%  (-15% - 15%)    0.683
Or2Terms2StopWords                 57.36  (6.6%)                    56.88  (7.7%)     -0.8%  (-14% - 14%)    0.710
Phrase                              7.59  (2.8%)                     7.53  (2.9%)     -0.8%  ( -6% -  5%)    0.397
CountPhrase                         2.69  (2.6%)                     2.67  (2.3%)     -0.8%  ( -5% -  4%)    0.328
CountFilteredOrHighMed             17.90  (0.6%)                    17.80  (0.6%)     -0.6%  ( -1% -  0%)    0.001
CountFilteredOrHighHigh            15.82  (0.9%)                    15.74  (0.8%)     -0.5%  ( -2% -  1%)    0.057
SpanNear                            2.48  (5.6%)                     2.47  (4.4%)     -0.4%  ( -9% - 10%)    0.793
CountFilteredIntNRQ                16.37  (1.1%)                    16.30  (0.9%)     -0.4%  ( -2% -  1%)    0.206
SloppyPhrase                        1.15  (5.4%)                     1.14  (4.6%)     -0.4%  ( -9% - 10%)    0.805
FilteredAndStopWords                8.46  (2.4%)                     8.43  (3.2%)     -0.4%  ( -5% -  5%)    0.678
TermB1M                           439.67  (5.4%)                   438.10  (4.6%)     -0.4%  ( -9% - 10%)    0.823
TermB1M1P                         439.98  (5.4%)                   438.60  (4.5%)     -0.3%  ( -9% - 10%)    0.843
Term100                           439.49  (5.4%)                   438.24  (4.5%)     -0.3%  ( -9% - 10%)    0.856
FilteredAndHighHigh                10.51  (2.5%)                    10.48  (3.0%)     -0.3%  ( -5% -  5%)    0.747
And2Terms2StopWords                55.84  (6.9%)                    55.69  (7.4%)     -0.3%  (-13% - 15%)    0.906
CountTerm                        5627.44  (5.2%)                  5614.10  (2.7%)     -0.2%  ( -7% -  8%)    0.857
Term1M                            439.33  (5.4%)                   438.29  (4.6%)     -0.2%  ( -9% - 10%)    0.881
FilteredOrStopWords                 8.03  (2.4%)                     8.01  (3.0%)     -0.2%  ( -5% -  5%)    0.795
Fuzzy1                             38.44  (4.3%)                    38.36  (4.3%)     -0.2%  ( -8% -  8%)    0.869
FilteredAnd2Terms2StopWords        57.86  (4.9%)                    57.75  (5.6%)     -0.2%  (-10% - 10%)    0.914
Fuzzy2                             34.84  (4.0%)                    34.79  (4.4%)     -0.2%  ( -8% -  8%)    0.904
Term10K                           439.23  (5.5%)                   438.60  (4.5%)     -0.1%  ( -9% - 10%)    0.929
AndStopWords                        9.05  (2.2%)                     9.03  (2.6%)     -0.1%  ( -4% -  4%)    0.857
Term                              439.27  (5.3%)                   438.68  (4.7%)     -0.1%  ( -9% - 10%)    0.933
IntNRQ                             42.91  (2.5%)                    42.86  (2.4%)     -0.1%  ( -4% -  4%)    0.870
FilteredPhrase                      9.59  (2.3%)                     9.58  (3.0%)     -0.1%  ( -5% -  5%)    0.889
FilteredIntNRQ                     42.53  (2.5%)                    42.48  (2.5%)     -0.1%  ( -5% -  5%)    0.884
FilteredAndHighMed                 31.33  (2.6%)                    31.29  (3.1%)     -0.1%  ( -5% -  5%)    0.911
AndHighOrMedMed                    13.83  (2.4%)                    13.82  (3.0%)     -0.1%  ( -5% -  5%)    0.927
FilteredTerm                       61.65  (2.8%)                    61.62  (3.3%)     -0.0%  ( -5% -  6%)    0.970
AndMedOrHighHigh                   16.50  (2.5%)                    16.49  (2.5%)     -0.0%  ( -4% -  5%)    0.978
And3Terms                          70.81  (4.4%)                    70.80  (5.0%)     -0.0%  ( -9% -  9%)    0.990
OrMany                              4.45  (5.0%)                     4.45  (5.6%)      0.0%  (-10% - 11%)    0.998
AndHighMed                         53.05  (3.5%)                    53.08  (4.0%)      0.1%  ( -7% -  7%)    0.965
DismaxTerm                        481.30  (3.9%)                   481.56  (3.2%)      0.1%  ( -6% -  7%)    0.962
TermTitleSort                      49.81  (5.1%)                    49.84  (4.4%)      0.1%  ( -8% - 10%)    0.971
FilteredOr3Terms                   42.22  (3.9%)                    42.25  (4.5%)      0.1%  ( -8% -  8%)    0.959
FilteredOr2Terms2StopWords         47.02  (4.8%)                    47.06  (5.5%)      0.1%  ( -9% - 10%)    0.958
FilteredOrHighMed                  37.25  (4.3%)                    37.31  (4.8%)      0.1%  ( -8% -  9%)    0.919
TermDTSort                        137.63  (3.7%)                   137.84  (3.1%)      0.2%  ( -6% -  7%)    0.889
CombinedOrHighMed                  20.14  (5.4%)                    20.17  (4.5%)      0.2%  ( -9% - 10%)    0.921
TermMonthSort                    2073.62  (3.1%)                  2077.60  (2.3%)      0.2%  ( -5% -  5%)    0.823
FilteredOrHighHigh                 12.66  (3.2%)                    12.68  (3.5%)      0.2%  ( -6% -  7%)    0.855
CombinedAndHighMed                 20.25  (5.0%)                    20.29  (4.6%)      0.2%  ( -8% - 10%)    0.888
CountFilteredPhrase                 8.73  (3.2%)                     8.75  (3.3%)      0.2%  ( -6% -  6%)    0.814
IntervalsOrdered                    2.42  (3.0%)                     2.43  (2.2%)      0.3%  ( -4% -  5%)    0.751
CountAndHighMed                    74.92  (2.3%)                    75.12  (2.7%)      0.3%  ( -4% -  5%)    0.737
FilteredOrMany                      3.93  (2.4%)                     3.94  (3.2%)      0.3%  ( -5% -  6%)    0.760
CountFilteredOrMany                 4.33  (3.1%)                     4.34  (3.1%)      0.3%  ( -5% -  6%)    0.777
CombinedAndHighHigh                 5.70  (1.7%)                     5.72  (1.4%)      0.3%  ( -2% -  3%)    0.555
Prefix3                            75.38  (3.4%)                    75.60  (3.0%)      0.3%  ( -5% -  6%)    0.770
CombinedOrHighHigh                  5.69  (2.5%)                     5.70  (1.8%)      0.3%  ( -3% -  4%)    0.663
CombinedTerm                       11.16  (3.6%)                    11.20  (2.9%)      0.3%  ( -5% -  7%)    0.766
AndHighHigh                        22.48  (2.4%)                    22.56  (3.0%)      0.3%  ( -4% -  5%)    0.696
FilteredPrefix3                    70.52  (3.2%)                    70.77  (3.0%)      0.4%  ( -5% -  6%)    0.716
FilteredAnd3Terms                 102.42  (2.8%)                   102.81  (3.6%)      0.4%  ( -5% -  6%)    0.700
Respell                            35.37  (3.8%)                    35.58  (3.3%)      0.6%  ( -6% -  8%)    0.597
CountOrHighHigh                    49.89  (2.6%)                    50.22  (2.4%)      0.7%  ( -4% -  5%)    0.393
CountOrMany                         4.87  (3.4%)                     4.91  (3.4%)      0.8%  ( -5% -  7%)    0.434
CountOrHighMed                     77.32  (2.0%)                    78.00  (2.9%)      0.9%  ( -3% -  5%)    0.266
IntSet                            283.71  (4.1%)                   287.69  (5.0%)      1.4%  ( -7% - 10%)    0.333
Wildcard                           46.80  (3.9%)                    47.63  (2.9%)      1.8%  ( -4% -  8%)    0.104
TermDayOfYearSort                 253.15  (2.2%)                   260.00  (1.9%)      2.7%  ( -1% -  6%)    0.000
```
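
As a rough illustration of the idea described above, the sketch below re-sorts a set of iterators by the cost they report at that moment, just before scoring would begin. It is not the code in this PR: `CostedIterator` and `sortByCurrentCost` are hypothetical names made up for this example (a stand-in for a DISI that exposes only a cost estimate), and it assumes the sort key is the iterator cost, which the description implies but does not state outright.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class CostSortSketch {

  /** Hypothetical stand-in for a DISI; only the (possibly changing) cost matters here. */
  public static final class CostedIterator {
    public final String name;
    private final LongSupplier cost;

    public CostedIterator(String name, LongSupplier cost) {
      this.name = name;
      this.cost = cost;
    }

    public long cost() {
      return cost.getAsLong();
    }
  }

  /** Returns a new list ordered by the cost each iterator reports right now, cheapest first. */
  public static List<CostedIterator> sortByCurrentCost(List<CostedIterator> iterators) {
    int n = iterators.size();
    long[] costs = new long[n];
    Integer[] order = new Integer[n];
    // Snapshot every cost once, so a cost that changes during the sort
    // cannot make the comparator inconsistent.
    for (int i = 0; i < n; i++) {
      costs[i] = iterators.get(i).cost();
      order[i] = i;
    }
    Arrays.sort(order, Comparator.comparingLong((Integer i) -> costs[i]));
    List<CostedIterator> sorted = new ArrayList<>(n);
    for (int i : order) {
      sorted.add(iterators.get(i));
    }
    return sorted;
  }

  private static List<String> names(List<CostedIterator> iterators) {
    List<String> out = new ArrayList<>();
    for (CostedIterator it : iterators) {
      out.add(it.name);
    }
    return out;
  }

  public static void main(String[] args) {
    // The "competitive" iterator's cost is mutable, mimicking an iterator that gets updated.
    AtomicLong competitiveCost = new AtomicLong(1000);
    List<CostedIterator> iterators = List.of(
        new CostedIterator("competitive", competitiveCost::get),
        new CostedIterator("term", () -> 500L));

    System.out.println(names(sortByCurrentCost(iterators))); // [term, competitive]

    competitiveCost.set(10); // the competitive iterator became much more selective
    System.out.println(names(sortByCurrentCost(iterators))); // [competitive, term]
  }
}
```

The point of the example is only that an ordering computed from an earlier cost can be stale: sorting again right before scoring picks up whatever the updatable iterator reports at that time.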
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]