[ https://issues.apache.org/jira/browse/LUCENE-10319 ]
Feng Guo deleted comment on LUCENE-10319:
-----------------------------------
was (Author: gf2121):
Out of curiosity, I ran the luceneutil wikimedium1m benchmark with block size = 64 / 256.
I am posting the results here in case anyone is interested :)
*BLOCK_SIZE=64*
{{Index size:}}
{{434M (block size = 128)}}
{{446M (block size = 64)}}
{code:java}
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                  p-value
AndHighMed                      742.46   (6.2%)                   632.83   (3.9%)    -14.8% ( -23% -   -4%)    0.000
MedSpanNear                     106.50   (2.8%)                    92.48   (3.7%)    -13.2% ( -19% -   -6%)    0.000
MedSloppyPhrase                 147.88   (3.0%)                   128.80   (2.2%)    -12.9% ( -17% -   -7%)    0.000
LowSloppyPhrase                 491.02   (3.7%)                   428.92   (3.5%)    -12.6% ( -19% -   -5%)    0.000
LowSpanNear                     332.59   (3.0%)                   292.64   (3.8%)    -12.0% ( -18% -   -5%)    0.000
MedIntervalsOrdered              80.37   (3.3%)                    71.33   (2.6%)    -11.2% ( -16% -   -5%)    0.000
LowIntervalsOrdered             163.87   (3.1%)                   145.73   (2.2%)    -11.1% ( -15% -   -5%)    0.000
HighSloppyPhrase                137.71   (3.8%)                   122.61   (3.4%)    -11.0% ( -17% -   -3%)    0.000
LowTerm                        2787.22   (6.1%)                  2488.95   (6.1%)    -10.7% ( -21% -    1%)    0.000
OrHighHigh                      160.41   (3.1%)                   144.06   (3.7%)    -10.2% ( -16% -   -3%)    0.000
HighSpanNear                    140.00   (1.7%)                   127.69   (3.0%)     -8.8% ( -13% -   -4%)    0.000
OrHighMed                       258.10   (4.3%)                   235.96   (4.6%)     -8.6% ( -16% -    0%)    0.000
HighIntervalsOrdered            257.27   (3.0%)                   242.95   (4.8%)     -5.6% ( -12% -    2%)    0.000
AndHighHigh                     248.63   (3.0%)                   234.84   (3.2%)     -5.5% ( -11% -    0%)    0.000
HighTermDayOfYearSort           954.02   (9.5%)                   905.20   (7.4%)     -5.1% ( -20% -   13%)    0.058
AndHighLow                     1550.86   (5.0%)                  1498.68   (4.5%)     -3.4% ( -12% -    6%)    0.026
HighTermMonthSort               633.80  (10.4%)                   613.68   (5.9%)     -3.2% ( -17% -   14%)    0.236
LowPhrase                       547.94   (3.9%)                   534.39   (3.1%)     -2.5% (  -9% -    4%)    0.027
Prefix3                         566.20  (11.3%)                   554.74   (8.9%)     -2.0% ( -19% -   20%)    0.529
MedPhrase                       468.94   (3.0%)                   461.20   (4.8%)     -1.7% (  -9% -    6%)    0.192
Respell                         149.39   (3.9%)                   147.07   (5.3%)     -1.6% ( -10% -    7%)    0.287
OrHighLow                       908.68   (5.2%)                   899.50   (5.3%)     -1.0% ( -10% -   10%)    0.542
Fuzzy2                           75.80  (10.0%)                    75.37  (12.6%)     -0.6% ( -21% -   24%)    0.876
BrowseMonthSSDVFacets           151.56   (0.7%)                   150.73   (2.8%)     -0.5% (  -4% -    2%)    0.399
Fuzzy1                          117.46  (14.0%)                   116.84  (12.6%)     -0.5% ( -23% -   30%)    0.899
BrowseDayOfYearSSDVFacets       139.72   (0.9%)                   139.01   (1.8%)     -0.5% (  -3% -    2%)    0.250
Wildcard                        418.32  (11.7%)                   416.56  (11.3%)     -0.4% ( -20% -   25%)    0.908
IntNRQ                          641.72   (5.4%)                   643.10   (5.5%)      0.2% ( -10% -   11%)    0.900
HighPhrase                      547.62   (6.0%)                   549.35  (11.0%)      0.3% ( -15% -   18%)    0.910
BrowseDateTaxoFacets             29.02   (2.9%)                    29.40   (5.3%)      1.3% (  -6% -    9%)    0.336
BrowseMonthTaxoFacets            31.12   (3.7%)                    31.52   (6.4%)      1.3% (  -8% -   11%)    0.430
BrowseDayOfYearTaxoFacets        29.03   (3.2%)                    29.42   (5.3%)      1.4% (  -6% -   10%)    0.328
PKLookup                        239.41   (2.5%)                   242.82   (4.0%)      1.4% (  -4% -    8%)    0.174
MedTerm                        2332.72   (4.5%)                  2445.01   (4.6%)      4.8% (  -4% -   14%)    0.001
HighTerm                       1835.22   (5.3%)                  1935.28   (6.0%)      5.5% (  -5% -   17%)    0.002
{code}
*BLOCK_SIZE=256*
{{Index size:}}
{{434M (block size = 128)}}
{{438M (block size = 256)}}
{code:java}
Task                      QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                  p-value
AndHighHigh                     214.93   (3.8%)                   183.83   (2.6%)    -14.5% ( -20% -   -8%)    0.000
MedTerm                        2589.52   (4.5%)                  2303.67   (5.5%)    -11.0% ( -20% -   -1%)    0.000
HighTerm                       1750.90   (4.0%)                  1560.54   (4.3%)    -10.9% ( -18% -   -2%)    0.000
HighPhrase                      238.61   (2.8%)                   218.08   (4.3%)     -8.6% ( -15% -   -1%)    0.000
OrHighHigh                      117.03   (1.9%)                   107.52   (4.8%)     -8.1% ( -14% -   -1%)    0.000
HighTermMonthSort               905.11  (10.5%)                   864.34   (9.3%)     -4.5% ( -21% -   17%)    0.150
HighTermDayOfYearSort          1095.73  (10.4%)                  1056.20  (11.0%)     -3.6% ( -22% -   19%)    0.288
PKLookup                        249.62   (3.8%)                   241.15   (4.6%)     -3.4% ( -11% -    5%)    0.011
LowTerm                        2761.54   (4.6%)                  2681.22   (6.8%)     -2.9% ( -13% -    8%)    0.111
Respell                         163.65   (3.4%)                   159.17   (3.8%)     -2.7% (  -9% -    4%)    0.016
Wildcard                        587.89   (2.9%)                   573.02   (4.8%)     -2.5% (  -9% -    5%)    0.044
IntNRQ                          654.86   (4.4%)                   644.88   (5.4%)     -1.5% ( -10% -    8%)    0.328
LowPhrase                       596.01   (4.3%)                   587.28   (5.5%)     -1.5% ( -10% -    8%)    0.349
HighIntervalsOrdered             16.48   (8.9%)                    16.26   (6.4%)     -1.3% ( -15% -   15%)    0.586
AndHighLow                     1665.94   (6.4%)                  1649.07   (6.1%)     -1.0% ( -12% -   12%)    0.610
BrowseDayOfYearSSDVFacets       142.76   (2.5%)                   141.87   (3.3%)     -0.6% (  -6% -    5%)    0.507
BrowseDateTaxoFacets             29.49   (4.2%)                    29.40   (3.8%)     -0.3% (  -8% -    8%)    0.796
MedPhrase                       653.42   (4.6%)                   652.05   (5.6%)     -0.2% (  -9% -   10%)    0.897
Fuzzy1                          116.77   (6.3%)                   116.59  (10.4%)     -0.2% ( -15% -   17%)    0.956
BrowseDayOfYearTaxoFacets        29.58   (4.3%)                    29.55   (4.1%)     -0.1% (  -8% -    8%)    0.929
Fuzzy2                           73.12  (10.4%)                    73.04  (10.7%)     -0.1% ( -19% -   23%)    0.974
BrowseMonthTaxoFacets            31.65   (5.0%)                    31.64   (4.9%)     -0.0% (  -9% -   10%)    0.985
BrowseMonthSSDVFacets           155.25   (3.5%)                   155.27   (3.8%)      0.0% (  -7% -    7%)    0.991
OrHighMed                       267.80   (5.9%)                   268.44   (6.2%)      0.2% ( -11% -   13%)    0.900
OrHighLow                       820.94   (8.5%)                   832.70   (7.8%)      1.4% ( -13% -   19%)    0.579
Prefix3                         483.34   (5.8%)                   490.76   (7.1%)      1.5% ( -10% -   15%)    0.453
LowSloppyPhrase                 268.01   (2.2%)                   279.16   (3.9%)      4.2% (  -1% -   10%)    0.000
LowSpanNear                     518.44   (3.8%)                   542.08   (5.2%)      4.6% (  -4% -   14%)    0.002
MedSloppyPhrase                 252.28   (2.4%)                   264.31   (2.2%)      4.8% (   0% -    9%)    0.000
HighSloppyPhrase                157.88   (2.6%)                   165.44   (3.1%)      4.8% (   0% -   10%)    0.000
HighSpanNear                    232.57   (2.5%)                   243.72   (3.5%)      4.8% (  -1% -   11%)    0.000
LowIntervalsOrdered             697.59   (3.8%)                   734.23   (4.8%)      5.3% (  -3% -   14%)    0.000
MedSpanNear                     171.60   (3.1%)                   181.41   (4.4%)      5.7% (  -1% -   13%)    0.000
MedIntervalsOrdered             356.52   (3.1%)                   383.69   (4.1%)      7.6% (   0% -   15%)    0.000
AndHighMed                      555.66   (4.4%)                   617.40   (5.7%)     11.1% (   0% -   22%)    0.000
{code}
> Make ForUtil#BLOCK_SIZE changeable
> ----------------------------------
>
> Key: LUCENE-10319
> URL: https://issues.apache.org/jira/browse/LUCENE-10319
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs
> Reporter: Feng Guo
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In LUCENE-10315, I tried to generate a {{ForUtil}} with
> {{BLOCK_SIZE=512}}. I thought it would be simple, since it looked like I
> only needed to change BLOCK_SIZE, but it turns out that many values
> depend on BLOCK_SIZE yet are hard-coded.
> This issue tries to derive all of those hard-coded values from BLOCK_SIZE,
> in case we need a ForUtil somewhere else or want to change BLOCK_SIZE for
> postings in the future.
> I tried BLOCK_SIZE = 64 and 256, and all tests passed.
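
To illustrate the idea in the description above, here is a minimal sketch of deriving block-related values from a single block size instead of hard-coding them. The class {{BlockLayout}} and all of its members are hypothetical names chosen for this example; they are not the actual {{ForUtil}} code or the patch attached to this issue, and the sketch assumes the block size is a power of two.

```java
/**
 * Hypothetical sketch (not Lucene's ForUtil): every block-related value
 * is computed from blockSize, so changing it needs no other edits.
 */
final class BlockLayout {
  final int blockSize;     // number of ints per block, assumed to be a power of two
  final int blockSizeLog2; // log2(blockSize), e.g. 7 for 128
  final int blockSizeMask; // blockSize - 1, used as a cheap modulo

  BlockLayout(int blockSize) {
    if (blockSize < 8 || Integer.bitCount(blockSize) != 1) {
      throw new IllegalArgumentException(
          "blockSize must be a power of two >= 8, got " + blockSize);
    }
    this.blockSize = blockSize;
    this.blockSizeLog2 = Integer.numberOfTrailingZeros(blockSize);
    this.blockSizeMask = blockSize - 1;
  }

  /** Bytes needed to bit-pack one full block at the given bit width. */
  int numBytes(int bitsPerValue) {
    // Equivalent to bitsPerValue * blockSize / 8.
    return bitsPerValue << (blockSizeLog2 - 3);
  }

  /** Number of complete blocks in a postings list with docFreq entries. */
  int fullBlocks(int docFreq) {
    return docFreq >>> blockSizeLog2;
  }

  /** Entries left over after the last complete block. */
  int remainder(int docFreq) {
    return docFreq & blockSizeMask;
  }
}
```

With this shape, {{new BlockLayout(64)}}, {{new BlockLayout(128)}} and {{new BlockLayout(256)}} differ only in the constructor argument, which is the property the issue is after.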