gsmiller commented on PR #11928:
URL: https://github.com/apache/lucene/pull/11928#issuecomment-1331288870
@jpountz I re-ran some internal benchmarking with this change to highlight
the speedup in cases where scoring isn't needed (at least some specific
use-cases I'm looking at). These use-cases all involve a "disjunction filter,"
meaning a disjunction of terms that is used as a required clause. So something
like `(+ (foo:bar foo:baz foo:zed) (...))`, where the `foo` field must take on
one of the specified values to be considered a candidate match. To provide a
sense of scale, on average, these filters have 40 different terms in them.
Since these "filters" don't participate in scoring at all, it's a good
candidate for this short-circuiting.
In these benchmarks, I'm observing a 2.3% QPS improvement, and a 3.5% avg.
latency reduction (5.9% p50 reduction / 3.5% p99 reduction). So the change
appears to be helping this type of situation.
As for whether-or-not this change would actually hurt other common use-cases
that require scoring or second-phase checks, I re-ran `luceneutil` benchmarks
(wikimedium10m) task and don't observe any regressions there (results below).
It's possible there's a gap in our benchmarks though, and maybe there are some
common use-cases not covered?
```
TaskQPS baseline StdDevQPS candidate
StdDev Pct diff p-value
MedSloppyPhrase 115.00 (4.9%) 112.91
(5.1%) -1.8% ( -11% - 8%) 0.249
HighTermTitleSort 242.48 (3.2%) 238.66
(4.3%) -1.6% ( -8% - 6%) 0.189
HighSloppyPhrase 36.47 (3.8%) 36.12
(3.9%) -0.9% ( -8% - 7%) 0.439
LowTerm 1766.82 (3.4%) 1752.12
(3.3%) -0.8% ( -7% - 6%) 0.436
HighPhrase 263.74 (3.4%) 261.71
(2.3%) -0.8% ( -6% - 5%) 0.404
OrHighLow 796.71 (2.7%) 790.71
(2.6%) -0.8% ( -5% - 4%) 0.367
BrowseDateSSDVFacets 3.46 (6.4%) 3.44
(7.1%) -0.7% ( -13% - 13%) 0.755
HighTermMonthSort 3070.86 (4.5%) 3051.39
(3.9%) -0.6% ( -8% - 8%) 0.635
Prefix3 111.76 (4.6%) 111.08
(4.2%) -0.6% ( -8% - 8%) 0.658
OrNotHighHigh 1249.27 (3.4%) 1242.30
(3.8%) -0.6% ( -7% - 6%) 0.627
BrowseMonthTaxoFacets 35.43 (1.6%) 35.23
(2.2%) -0.6% ( -4% - 3%) 0.367
LowSloppyPhrase 61.49 (2.4%) 61.18
(2.5%) -0.5% ( -5% - 4%) 0.512
OrHighNotMed 1139.05 (3.9%) 1133.29
(3.6%) -0.5% ( -7% - 7%) 0.671
BrowseRandomLabelTaxoFacets 20.33 (4.5%) 20.23
(5.6%) -0.5% ( -10% - 10%) 0.760
HighTerm 1635.45 (4.2%) 1628.28
(3.9%) -0.4% ( -8% - 8%) 0.735
MedPhrase 46.41 (2.3%) 46.22
(1.6%) -0.4% ( -4% - 3%) 0.529
OrHighMed 193.55 (2.8%) 192.79
(2.9%) -0.4% ( -5% - 5%) 0.663
OrHighNotHigh 865.20 (3.0%) 862.38
(3.9%) -0.3% ( -6% - 6%) 0.766
AndHighLow 1566.83 (2.7%) 1562.47
(2.7%) -0.3% ( -5% - 5%) 0.745
MedTermDayTaxoFacets 48.00 (3.5%) 47.89
(3.6%) -0.2% ( -7% - 7%) 0.836
HighTermDayOfYearSort 812.55 (2.8%) 811.49
(2.5%) -0.1% ( -5% - 5%) 0.878
MedTerm 2390.59 (3.5%) 2387.70
(3.5%) -0.1% ( -6% - 7%) 0.912
BrowseDateTaxoFacets 25.05 (9.2%) 25.03
(9.3%) -0.1% ( -16% - 20%) 0.984
BrowseMonthSSDVFacets 16.00 (18.9%) 15.99
(19.0%) -0.1% ( -31% - 46%) 0.992
LowIntervalsOrdered 276.77 (3.5%) 276.66
(3.6%) -0.0% ( -6% - 7%) 0.972
OrHighHigh 26.41 (4.4%) 26.40
(4.4%) -0.0% ( -8% - 9%) 0.978
MedSpanNear 35.38 (1.0%) 35.38
(1.1%) -0.0% ( -2% - 2%) 0.969
TermDTSort 648.08 (1.0%) 648.35
(1.3%) 0.0% ( -2% - 2%) 0.909
BrowseDayOfYearTaxoFacets 25.12 (9.5%) 25.13
(9.5%) 0.0% ( -17% - 21%) 0.988
OrNotHighLow 1192.74 (3.1%) 1193.69
(2.6%) 0.1% ( -5% - 5%) 0.929
HighIntervalsOrdered 1.79 (3.3%) 1.79
(3.3%) 0.1% ( -6% - 6%) 0.934
Respell 46.93 (1.3%) 46.97
(1.3%) 0.1% ( -2% - 2%) 0.825
LowSpanNear 11.96 (0.8%) 11.97
(0.9%) 0.1% ( -1% - 1%) 0.702
BrowseRandomLabelSSDVFacets 10.14 (9.0%) 10.16
(9.4%) 0.1% ( -16% - 20%) 0.961
LowPhrase 438.44 (1.6%) 439.23
(1.4%) 0.2% ( -2% - 3%) 0.701
AndHighHighDayTaxoFacets 28.52 (1.9%) 28.57
(1.8%) 0.2% ( -3% - 3%) 0.746
MedIntervalsOrdered 64.28 (2.3%) 64.42
(2.3%) 0.2% ( -4% - 4%) 0.777
OrHighNotLow 1854.91 (2.7%) 1858.96
(3.9%) 0.2% ( -6% - 6%) 0.836
HighSpanNear 12.65 (1.4%) 12.69
(1.7%) 0.3% ( -2% - 3%) 0.579
OrHighMedDayTaxoFacets 14.34 (3.0%) 14.38
(4.3%) 0.3% ( -6% - 7%) 0.809
Fuzzy1 81.66 (0.9%) 81.90
(1.1%) 0.3% ( -1% - 2%) 0.357
BrowseDayOfYearSSDVFacets 13.40 (12.7%) 13.44
(12.5%) 0.3% ( -22% - 29%) 0.942
AndHighMedDayTaxoFacets 34.33 (2.1%) 34.45
(1.7%) 0.3% ( -3% - 4%) 0.569
Fuzzy2 76.36 (0.8%) 76.64
(0.9%) 0.4% ( -1% - 2%) 0.187
HighTermTitleBDVSort 12.49 (6.6%) 12.55
(6.6%) 0.4% ( -11% - 14%) 0.846
PKLookup 179.34 (2.2%) 180.48
(2.3%) 0.6% ( -3% - 5%) 0.364
AndHighMed 268.67 (3.8%) 270.48
(3.9%) 0.7% ( -6% - 8%) 0.579
OrNotHighMed 1006.62 (4.1%) 1013.59
(4.3%) 0.7% ( -7% - 9%) 0.604
Wildcard 133.41 (2.7%) 134.34
(2.7%) 0.7% ( -4% - 6%) 0.417
AndHighHigh 74.34 (4.7%) 74.95
(5.2%) 0.8% ( -8% - 11%) 0.608
IntNRQ 82.56 (3.5%) 83.70
(4.9%) 1.4% ( -6% - 10%) 0.306
```
Given this, I'd prefer to move forward with the change (since it doesn't
seem to be doing any harm to common cases covered in `luceneutil` and is
helping in the filtering use-cases I'm benchmarking), but I'd like your
opinion. Given how central of a component this is to scoring, I want to make
sure I'm not being overly biased towards the use-case I'm focused on at the
cost of regressing elsewhere. Thanks again for your inputs!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]