Tony-X commented on PR #12688:
URL: https://github.com/apache/lucene/pull/12688#issuecomment-1857371557
Since the first working version, I have iterated through a series of
profiling-guided allocation optimizations, as the hot spots stood out quite
clearly in the merged JFR reports (thanks to luceneutil!).
Some of them come from my own code that implements the term dictionary data
lookup, and a few are at a more general Lucene level. I want to highlight
the general issues I see from this work, and maybe we can open separate issues
to improve them!
Here is the first heap profile comparison (search-only, no indexing).
```
Candidate Heap
17.50% 24440M java.lang.Long#valueOf()
10.09% 14096M jdk.internal.misc.Unsafe#allocateUninitializedArray()
6.87% 9594M org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#initializeValueCounters()
4.40% 6140M org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()
...
```
```
main
13.65% 11898M org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#initializeValueCounters()
9.26% 8071M org.apache.lucene.util.FixedBitSet#<init>()
6.70% 5836M org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()
6.60% 5751M org.apache.lucene.util.ArrayUtil#growExact()
5.21% 4541M org.apache.lucene.facet.FacetsConfig#stringToPath()
4.69% 4090M org.apache.lucene.util.DocIdSetBuilder$Buffer#<init>()
```
## FST doesn't play nicely with primitive types (I know, this is more or less a Java issue)
`24440M java.lang.Long#valueOf()` is a huge amount of allocation, and the cause
is obvious. The `FST<T>` implementation is generic over its output type, and in
my case `T` is `Long`. So for a trivial `long` add or subtract, the
implementation allocates an object. Not only is that wasteful, but from a perf
perspective the arithmetic itself would take less than 1 CPU cycle, versus a
call into the allocator, which is easily tens if not hundreds of cycles.
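
To make the boxing concrete, here is a minimal sketch of how a generic add ends up in `Long#valueOf()`. The `GenericOutputs` interface and the class names are hypothetical stand-ins for illustration, not Lucene's actual API, and the JIT's escape analysis may remove some of these allocations in practice.

```java
final class BoxedAddSketch {
    /** Hypothetical generic output algebra; a stand-in, not Lucene's actual API. */
    interface GenericOutputs<T> {
        T add(T prefix, T output);
    }

    /** With T = Long, every result has to be returned as an object. */
    static final class BoxedLongOutputs implements GenericOutputs<Long> {
        @Override
        public Long add(Long prefix, Long output) {
            // Written out the way javac desugars `prefix + output` for boxed operands:
            // unbox both sides, add, then box the result again.
            // Long.valueOf may allocate for values outside its small-value cache.
            return Long.valueOf(prefix.longValue() + output.longValue());
        }
    }

    /** Sums the outputs along a path; one boxed result per add() call. */
    static <T> T accumulate(GenericOutputs<T> outputs, T start, T[] arcOutputs) {
        T acc = start;
        for (T arcOutput : arcOutputs) {
            acc = outputs.add(acc, arcOutput);
        }
        return acc;
    }

    public static void main(String[] args) {
        Long total = accumulate(new BoxedLongOutputs(), 0L, new Long[] {10L, 20L, 12L});
        System.out.println(total);
    }
}
```

Because the generic signature can only traffic in `T`, the boxing cannot be avoided at the call site; it has to be removed by specializing the type.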
For this work, I forked the `FST<T>` class and manually templated it with
`long`, just to see how much difference it makes. Here is a diff of the heap
profile and the bench results before and after.
```
Before
PERCENT HEAP SAMPLES STACK
25.97% 32791M java.lang.Long#valueOf()
7.58% 9571M org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#initializeValueCounters()
5.13% 6482M org.apache.lucene.util.FixedBitSet#<init>()
4.90%
....
After
PERCENT HEAP SAMPLES STACK
8.44% 7988M org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#initializeValueCounters()
7.17% 6788M org.apache.lucene.util.FixedBitSet#<init>()
6.22% 5886M org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegmentNHLD()
5.89% 5577M org.apache.lucene.util.ArrayUtil#growExact()
```
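
The manual specialization described above (forking `FST<T>` and fixing the output type to `long`) might look roughly like the sketch below. All names here are hypothetical, and it assumes non-negative outputs where the shared part of two outputs is their minimum; the forked class itself is not shown in this comment.

```java
final class PrimitiveLongOutputsSketch {
    /** Assumed identity value for "no output" on an arc. */
    static final long NO_OUTPUT = 0L;

    /** Shared part of two outputs (assumption: the smaller of the two). */
    static long common(long a, long b) {
        return Math.min(a, b);
    }

    /** Remainder left after pushing the shared part toward the root. */
    static long subtract(long output, long shared) {
        return output - shared;
    }

    /** Plain primitive addition: no object is created. */
    static long add(long prefix, long output) {
        return prefix + output;
    }

    /** Sums the outputs along a path using only primitive arithmetic. */
    static long accumulate(long[] arcOutputs) {
        long acc = NO_OUTPUT;
        for (long arcOutput : arcOutputs) {
            acc = add(acc, arcOutput);
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(accumulate(new long[] {10L, 20L, 12L}));
    }
}
```

The trade-off is duplicated code: every output type that matters for performance needs its own copy, which is why this is more of an experiment than a general fix.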
Bench diff
```
Before
Task  QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff  p-value
Wildcard 11.61 (2.7%) 2.40 (0.6%) -79.4% ( -80% - -78%) 0.000
Fuzzy1 78.17 (0.7%) 27.16 (0.9%) -65.3% ( -66% - -64%) 0.000
Respell 29.09 (0.6%) 10.91 (0.8%) -62.5% ( -63% - -61%) 0.000
Fuzzy2 47.80 (0.6%) 18.50 (1.2%) -61.3% ( -62% - -59%) 0.000
Prefix3 765.08 (3.1%) 463.94 (0.9%) -39.4% ( -42% - -36%) 0.000
HighTermTitleSort 98.48 (2.0%) 90.62 (2.2%) -8.0% ( -11% - -3%) 0.000
BrowseMonthTaxoFacets 3.89 (29.2%) 3.62 (0.9%) -6.9% ( -28% - 32%) 0.293
LowSloppyPhrase 22.73 (6.5%) 22.35 (6.9%) -1.7% ( -14% - 12%) 0.432
LowTerm 365.47 (3.6%) 359.62 (2.9%) -1.6% ( -7% - 5%) 0.121
HighTerm 398.57 (5.1%) 393.16 (4.7%) -1.4% ( -10% - 8%) 0.380
MedSloppyPhrase 10.63 (3.6%) 10.51 (3.7%) -1.1% ( -8% - 6%) 0.339
MedTerm 422.73 (4.2%) 418.60 (4.0%) -1.0% ( -8% - 7%) 0.451
MedTermDayTaxoFacets 14.84 (2.6%) 14.71 (2.5%) -0.8% ( -5% - 4%) 0.296
HighSloppyPhrase 12.41 (3.1%) 12.33 (3.1%) -0.7% ( -6% - 5%) 0.487
HighTermTitleBDVSort 6.88 (3.3%) 6.84 (3.5%) -0.6% ( -7% - 6%) 0.599
LowPhrase 58.15 (2.9%) 57.85 (2.8%) -0.5% ( -6% - 5%) 0.567
BrowseDayOfYearSSDVFacets 3.24 (0.4%) 3.23 (0.5%) -0.3% ( -1% - 0%) 0.042
MedPhrase 26.19 (3.1%) 26.11 (3.2%) -0.3% ( -6% - 6%) 0.775
OrNotHighMed 185.23 (3.9%) 184.73 (3.3%) -0.3% ( -7% - 7%) 0.813
OrHighMedDayTaxoFacets 3.82 (3.3%) 3.81 (3.2%) -0.3% ( -6% - 6%) 0.796
OrHighNotLow 194.98 (5.1%) 194.51 (4.6%) -0.2% ( -9% - 10%) 0.875
OrHighNotMed 337.15 (4.4%) 336.53 (3.8%) -0.2% ( -7% - 8%) 0.888
IntNRQ 67.60 (0.9%) 67.55 (1.0%) -0.1% ( -1% - 1%) 0.783
MedSpanNear 9.85 (1.4%) 9.84 (2.1%) -0.1% ( -3% - 3%) 0.906
OrNotHighHigh 205.12 (4.1%) 205.01 (3.9%) -0.1% ( -7% - 8%) 0.967
AndHighHighDayTaxoFacets 6.35 (1.5%) 6.34 (1.7%) -0.0% ( -3% - 3%) 0.932
BrowseMonthSSDVFacets 3.29 (0.8%) 3.29 (0.7%) -0.0% ( -1% - 1%) 0.887
BrowseRandomLabelSSDVFacets 2.30 (0.8%) 2.30 (1.0%) 0.0% ( -1% - 1%) 0.919
LowSpanNear 16.41 (2.6%) 16.42 (2.7%) 0.1% ( -5% - 5%) 0.931
HighPhrase 77.12 (3.0%) 77.20 (3.6%) 0.1% ( -6% - 6%) 0.923
AndHighMedDayTaxoFacets 39.64 (1.2%) 39.68 (1.0%) 0.1% ( -2% - 2%) 0.742
BrowseRandomLabelTaxoFacets 3.19 (1.6%) 3.19 (1.1%) 0.1% ( -2% - 2%) 0.728
BrowseDateTaxoFacets 3.73 (0.7%) 3.74 (0.5%) 0.3% ( 0% - 1%) 0.157
AndHighHigh 27.08 (1.3%) 27.15 (3.0%) 0.3% ( -4% - 4%) 0.718
BrowseDayOfYearTaxoFacets 3.76 (0.6%) 3.77 (0.5%) 0.3% ( 0% - 1%) 0.072
HighTermDayOfYearSort 224.01 (2.1%) 224.81 (2.1%) 0.4% ( -3% - 4%) 0.592
HighSpanNear 6.09 (2.7%) 6.11 (3.1%) 0.4% ( -5% - 6%) 0.683
HighIntervalsOrdered 8.08 (3.3%) 8.11 (3.4%) 0.4% ( -6% - 7%) 0.705
TermDTSort 103.29 (4.4%) 103.83 (3.1%) 0.5% ( -6% - 8%) 0.669
MedIntervalsOrdered 33.12 (4.4%) 33.29 (4.6%) 0.5% ( -8% - 9%) 0.702
LowIntervalsOrdered 10.06 (3.9%) 10.12 (3.6%) 0.6% ( -6% - 8%) 0.609
AndHighMed 73.71 (2.2%) 74.18 (2.5%) 0.6% ( -3% - 5%) 0.394
OrHighMed 71.44 (2.7%) 71.98 (3.3%) 0.7% ( -5% - 6%) 0.429
BrowseDateSSDVFacets 0.96 (4.8%) 0.97 (5.7%) 0.9% ( -9% - 11%) 0.601
OrHighNotHigh 308.82 (4.0%) 311.53 (3.7%) 0.9% ( -6% - 8%) 0.470
OrHighLow 404.69 (3.0%) 408.63 (3.5%) 1.0% ( -5% - 7%) 0.348
OrHighHigh 20.44 (4.7%) 20.73 (7.2%) 1.4% ( -10% - 13%) 0.469
OrNotHighLow 381.28 (1.8%) 388.18 (2.1%) 1.8% ( -2% - 5%) 0.004
HighTermMonthSort 2500.04 (2.2%) 2554.91 (4.3%) 2.2% ( -4% - 8%) 0.042
AndHighLow 668.12 (3.1%) 692.04 (3.9%) 3.6% ( -3% - 10%) 0.001
PKLookup 140.25 (2.0%) 168.53 (1.9%) 20.2% ( 15% - 24%) 0.000
After
Task  QPS baseline  StdDev  QPS my_modified_version  StdDev  Pct diff  p-value
Wildcard 54.96 (2.6%) 10.43 (0.5%) -81.0% ( -82% - -80%) 0.000
Respell 45.54 (1.0%) 16.74 (0.7%) -63.2% ( -64% - -62%) 0.000
Fuzzy1 46.41 (1.2%) 17.26 (1.0%) -62.8% ( -64% - -61%) 0.000
Prefix3 121.65 (2.4%) 55.57 (0.9%) -54.3% ( -56% - -52%) 0.000
Fuzzy2 32.33 (1.2%) 15.79 (1.1%) -51.2% ( -52% - -49%) 0.000
HighTermTitleSort 95.24 (2.1%) 87.04 (1.9%) -8.6% ( -12% - -4%) 0.000
BrowseRandomLabelSSDVFacets 2.37 (7.1%) 2.33 (4.8%) -1.7% ( -12% - 10%) 0.374
BrowseMonthSSDVFacets 3.34 (7.3%) 3.29 (0.6%) -1.5% ( -8% - 6%) 0.362
TermDTSort 120.57 (2.3%) 119.02 (3.4%) -1.3% ( -6% - 4%) 0.163
OrHighHigh 19.13 (5.6%) 18.92 (2.7%) -1.1% ( -8% - 7%) 0.430
AndHighHigh 22.04 (5.1%) 21.87 (3.0%) -0.8% ( -8% - 7%) 0.555
AndHighMed 55.06 (3.0%) 54.79 (2.1%) -0.5% ( -5% - 4%) 0.546
HighSpanNear 3.29 (1.6%) 3.28 (1.7%) -0.5% ( -3% - 2%) 0.346
HighIntervalsOrdered 0.65 (1.8%) 0.65 (2.0%) -0.5% ( -4% - 3%) 0.433
HighTermDayOfYearSort 282.86 (2.0%) 281.57 (2.6%) -0.5% ( -4% - 4%) 0.533
MedIntervalsOrdered 16.36 (1.5%) 16.29 (1.5%) -0.4% ( -3% - 2%) 0.369
OrHighMed 68.27 (2.9%) 67.99 (1.8%) -0.4% ( -5% - 4%) 0.598
MedSpanNear 3.22 (1.0%) 3.21 (1.4%) -0.4% ( -2% - 2%) 0.317
HighSloppyPhrase 9.59 (2.5%) 9.57 (2.6%) -0.3% ( -5% - 4%) 0.733
BrowseMonthTaxoFacets 3.64 (2.4%) 3.63 (1.8%) -0.2% ( -4% - 4%) 0.756
LowIntervalsOrdered 14.66 (0.9%) 14.63 (1.5%) -0.2% ( -2% - 2%) 0.633
MedTermDayTaxoFacets 15.56 (2.7%) 15.54 (3.9%) -0.2% ( -6% - 6%) 0.879
AndHighMedDayTaxoFacets 18.70 (1.4%) 18.67 (3.7%) -0.2% ( -5% - 5%) 0.864
LowSpanNear 4.39 (1.1%) 4.38 (1.4%) -0.1% ( -2% - 2%) 0.728
OrHighMedDayTaxoFacets 5.38 (3.5%) 5.38 (5.4%) -0.1% ( -8% - 9%) 0.945
AndHighHighDayTaxoFacets 7.06 (1.6%) 7.06 (3.0%) -0.1% ( -4% - 4%) 0.924
LowSloppyPhrase 7.16 (1.4%) 7.15 (1.6%) -0.1% ( -2% - 2%) 0.891
MedSloppyPhrase 128.54 (1.9%) 128.56 (2.2%) 0.0% ( -4% - 4%) 0.979
LowTerm 417.80 (3.3%) 418.01 (2.7%) 0.1% ( -5% - 6%) 0.958
LowPhrase 125.59 (4.0%) 125.77 (3.1%) 0.1% ( -6% - 7%) 0.900
OrHighLow 313.22 (2.1%) 313.72 (2.2%) 0.2% ( -4% - 4%) 0.817
BrowseDateTaxoFacets 3.73 (0.7%) 3.74 (0.7%) 0.2% ( -1% - 1%) 0.470
BrowseDayOfYearTaxoFacets 3.76 (0.7%) 3.76 (0.7%) 0.2% ( -1% - 1%) 0.457
MedTerm 384.57 (4.6%) 385.44 (3.6%) 0.2% ( -7% - 8%) 0.863
OrHighNotHigh 255.07 (4.3%) 256.05 (4.3%) 0.4% ( -7% - 9%) 0.778
MedPhrase 11.17 (3.0%) 11.21 (2.6%) 0.4% ( -5% - 6%) 0.658
HighTerm 361.26 (5.1%) 362.86 (4.2%) 0.4% ( -8% - 10%) 0.764
BrowseRandomLabelTaxoFacets 3.19 (1.5%) 3.20 (0.6%) 0.5% ( -1% - 2%) 0.203
OrNotHighHigh 205.38 (4.0%) 206.35 (4.0%) 0.5% ( -7% - 8%) 0.712
OrNotHighLow 317.96 (1.7%) 319.48 (2.1%) 0.5% ( -3% - 4%) 0.428
HighPhrase 47.91 (3.8%) 48.15 (3.3%) 0.5% ( -6% - 7%) 0.661
BrowseDateSSDVFacets 0.97 (6.9%) 0.98 (6.7%) 0.5% ( -12% - 15%) 0.801
OrHighNotLow 185.96 (4.9%) 187.04 (5.0%) 0.6% ( -8% - 11%) 0.710
BrowseDayOfYearSSDVFacets 3.21 (2.1%) 3.23 (0.9%) 0.6% ( -2% - 3%) 0.225
HighTermTitleBDVSort 5.83 (3.7%) 5.87 (4.0%) 0.7% ( -6% - 8%) 0.584
OrNotHighMed 516.84 (2.5%) 520.76 (2.5%) 0.8% ( -4% - 5%) 0.334
IntNRQ 29.24 (3.0%) 29.50 (4.1%) 0.9% ( -6% - 8%) 0.425
OrHighNotMed 268.45 (4.4%) 270.92 (4.2%) 0.9% ( -7% - 9%) 0.501
HighTermMonthSort 2498.46 (4.8%) 2590.43 (3.7%) 3.7% ( -4% - 12%) 0.007
AndHighLow 747.94 (2.1%) 775.60 (4.0%) 3.7% ( -2% - 10%) 0.000
PKLookup 141.68 (2.0%) 177.85 (1.5%) 25.5% ( 21% - 29%) 0.000
```