[
https://issues.apache.org/jira/browse/LUCENE-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772153#comment-16772153
]
Michael McCandless commented on LUCENE-8635:
--------------------------------------------
I ran luceneutil on {{wikimediumall}}, comparing current trunk against the PR
here. Net/net the differences look like noise, which is great. I'll push shortly:
{noformat}
Report after iter 19:
Task                       QPS base   StdDev  QPS comp   StdDev  Pct diff
Prefix3                       37.05  (11.4%)     36.25  (13.0%)  -2.1% (-23% - 25%)
BrowseMonthSSDVFacets          5.01   (6.4%)      4.91  (10.4%)  -1.9% (-17% - 15%)
BrowseMonthTaxoFacets          1.24   (2.7%)      1.22   (4.8%)  -1.3% ( -8% -  6%)
Wildcard                     106.53   (8.6%)    105.18   (9.1%)  -1.3% (-17% - 18%)
HighTermDayOfYearSort         14.85   (4.2%)     14.70   (4.2%)  -1.0% ( -9% -  7%)
BrowseDateTaxoFacets           1.11   (3.2%)      1.10   (5.6%)  -0.8% ( -9% -  8%)
BrowseDayOfYearTaxoFacets      1.11   (3.1%)      1.10   (5.6%)  -0.8% ( -9% -  8%)
MedSloppyPhrase                4.59   (3.4%)      4.56   (2.8%)  -0.5% ( -6% -  5%)
Fuzzy2                        68.49   (1.0%)     68.12   (1.3%)  -0.5% ( -2% -  1%)
LowSpanNear                   30.34   (1.7%)     30.19   (1.9%)  -0.5% ( -4% -  3%)
Fuzzy1                        72.43   (0.9%)     72.10   (1.4%)  -0.5% ( -2% -  1%)
LowPhrase                     34.35   (1.1%)     34.22   (2.0%)  -0.4% ( -3% -  2%)
Respell                       47.66   (1.4%)     47.48   (1.7%)  -0.4% ( -3% -  2%)
LowSloppyPhrase               10.59   (4.9%)     10.56   (3.6%)  -0.3% ( -8% -  8%)
HighTerm                    1290.39   (1.8%)   1286.15   (1.4%)  -0.3% ( -3% -  2%)
MedTerm                     1419.25   (2.0%)   1415.23   (1.5%)  -0.3% ( -3% -  3%)
IntNRQ                        27.03  (11.0%)     26.96  (10.9%)  -0.3% (-19% - 24%)
HighSloppyPhrase               6.73   (4.9%)      6.71   (3.4%)  -0.3% ( -8% -  8%)
OrNotHighHigh                825.79   (1.9%)    823.77   (1.4%)  -0.2% ( -3% -  3%)
OrNotHighMed                 912.80   (1.3%)    910.96   (1.3%)  -0.2% ( -2% -  2%)
MedPhrase                     29.52   (1.1%)     29.46   (1.9%)  -0.2% ( -3% -  2%)
OrHighNotLow                1184.54   (3.1%)   1182.86   (1.8%)  -0.1% ( -4% -  4%)
LowTerm                      974.30   (1.5%)    973.33   (1.4%)  -0.1% ( -2% -  2%)
OrHighLow                    328.39   (1.0%)    328.13   (1.0%)  -0.1% ( -2% -  1%)
AndHighHigh                   21.04   (2.8%)     21.03   (2.6%)  -0.1% ( -5% -  5%)
OrHighNotHigh                907.78   (1.8%)    907.93   (1.4%)   0.0% ( -3% -  3%)
OrHighNotMed                1019.49   (2.0%)   1019.67   (1.4%)   0.0% ( -3% -  3%)
AndHighMed                    64.27   (1.1%)     64.33   (1.1%)   0.1% ( -2% -  2%)
OrNotHighLow                 414.78   (1.2%)    415.43   (1.0%)   0.2% ( -2% -  2%)
BrowseDayOfYearSSDVFacets      4.14   (6.9%)      4.15   (8.9%)   0.2% (-14% - 17%)
AndHighLow                   371.09   (1.7%)    371.84   (1.7%)   0.2% ( -3% -  3%)
OrHighMed                     65.31   (1.8%)     65.45   (1.8%)   0.2% ( -3% -  3%)
PKLookup                     141.21   (1.6%)    141.63   (1.9%)   0.3% ( -3% -  3%)
HighSpanNear                  25.84   (2.8%)     25.94   (2.6%)   0.4% ( -4% -  5%)
MedSpanNear                   26.39   (2.9%)     26.50   (2.8%)   0.4% ( -5% -  6%)
HighPhrase                    11.72   (2.1%)     11.77   (1.9%)   0.4% ( -3% -  4%)
OrHighHigh                    14.60   (2.2%)     14.69   (1.8%)   0.6% ( -3% -  4%)
HighTermMonthSort             31.51   (6.0%)     31.90   (6.0%)   1.2% (-10% - 14%)
{noformat}
> Lazy loading Lucene FST offheap using mmap
> ------------------------------------------
>
> Key: LUCENE-8635
> URL: https://issues.apache.org/jira/browse/LUCENE-8635
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/FSTs
> Environment: I used below setup for es_rally tests:
> single node i3.xlarge running ES 6.5
> es_rally was running on another i3.xlarge instance
> Reporter: Ankit Jain
> Priority: Major
> Attachments: fst-offheap-ra-rev.patch, fst-offheap-rev.patch,
> offheap.patch, optional_offheap_ra.patch, ra.patch, rally_benchmark.xlsx
>
>
> Currently, the FST loads all the terms into heap memory during index open.
> This causes frequent JVM OOM issues if the term size gets big. A better
> approach would be to lazily load the FST using mmap, which ensures that only
> the required terms get loaded into memory.
>
> Lucene can expose an API for providing the list of fields whose terms should
> be loaded offheap. I'm planning to take the following approach:
> # Add a boolean property fstOffHeap in FieldInfo
> # Pass the list of offheap fields to Lucene during index open (ALL can be
> a special keyword for loading ALL fields offheap)
> # Initialize the fstOffHeap property during Lucene index open
> # FieldReader invokes the default FST constructor or the OffHeap constructor
> based on the fstOffHeap field
>
> I created a patch (that loads all fields offheap), ran some benchmarks using
> es_rally, and the results look good.
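As a rough illustration of the approach described in the issue, here is a minimal Java sketch of choosing per field between an on-heap copy and a lazily paged mmap view of the same bytes. It does not use any Lucene API: the class {{OffHeapSketch}}, the methods {{useOffHeap}} and {{open}}, and the {{ALL}} constant are hypothetical names chosen for this example, and a plain file of bytes stands in for the serialized FST. Only {{java.nio}} calls from the JDK are used.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;

// Hypothetical sketch, not Lucene code: picks on-heap or mmap loading per field.
final class OffHeapSketch {
    // Assumed special keyword meaning "load every field off-heap".
    static final String ALL = "ALL";

    // Decide whether a field's data should stay off-heap.
    static boolean useOffHeap(String field, Set<String> offHeapFields) {
        return offHeapFields.contains(ALL) || offHeapFields.contains(field);
    }

    // On-heap: copies the whole file into a byte[] up front.
    // Off-heap: maps the file; the OS pages bytes in only when they are read.
    static ByteBuffer open(Path file, boolean offHeap) throws IOException {
        if (!offHeap) {
            return ByteBuffer.wrap(Files.readAllBytes(file));
        }
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fst-sketch", ".bin");
        Files.write(tmp, new byte[] {10, 20, 30, 40});
        Set<String> offHeapFields = Set.of("body");
        for (String field : new String[] {"title", "body"}) {
            ByteBuffer buf = open(tmp, useOffHeap(field, offHeapFields));
            System.out.println(field + ": direct=" + buf.isDirect()
                    + ", byte[2]=" + buf.get(2));
        }
    }
}
```

Both branches return a {{ByteBuffer}}, so a reader written against that type would not need to know which loading mode was chosen. The trade-off is that reads from the mapped buffer can hit page faults, which is the kind of overhead a benchmark run such as the one above is meant to detect.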