[ https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180860#comment-17180860 ]
Gautam Worah commented on LUCENE-9450:
--------------------------------------

Benchmark command with the same localrun.py script:
{code:java}
python src/python/localrun.py -source wikimediumall -forceMerge=True
{code}
Instance details: EC2 c4.8xlarge

CPU (/proc/cpuinfo):
{{Architecture: x86_64}}
{{CPU op-mode(s): 32-bit, 64-bit}}
{{Byte Order: Little Endian}}
{{CPU(s): 36}}
{{On-line CPU(s) list: 0-35}}
{{Thread(s) per core: 2}}

Memory:
{{MemTotal: 61833408 kB}}
{{MemFree: 1585956 kB}}
{{MemAvailable: 53673248 kB}}

Java:
{{openjdk version "11.0.8" 2020-07-14 LTS}}
{{OpenJDK Runtime Environment (build 11.0.8+10-LTS)}}
{{OpenJDK 64-Bit Server VM (build 11.0.8+10-LTS, mixed mode)}}

===

Results:
{code:java}
Task                      QPS baseline    StdDev  QPS my_modified_version    StdDev  Pct diff
OrNotHighHigh                   532.96    (6.0%)                   518.91    (8.1%)    -2.6% ( -15% -   12%)
IntNRQ                           63.90    (2.8%)                    62.51    (3.7%)    -2.2% (  -8% -    4%)
LowTerm                        1354.18    (4.9%)                  1325.24    (7.1%)    -2.1% ( -13% -   10%)
PKLookup                        130.95    (2.3%)                   128.48    (3.1%)    -1.9% (  -7% -    3%)
Fuzzy1                           49.16    (6.3%)                    48.28    (9.8%)    -1.8% ( -16% -   15%)
OrHighNotMed                    518.20    (4.8%)                   509.19    (5.4%)    -1.7% ( -11% -    8%)
Wildcard                         53.04    (2.1%)                    52.14    (3.6%)    -1.7% (  -7% -    4%)
OrHighLow                       241.89    (3.8%)                   237.82    (4.7%)    -1.7% (  -9% -    7%)
MedTerm                        1002.39    (5.2%)                   985.94    (5.8%)    -1.6% ( -12% -    9%)
AndHighLow                      453.14    (3.3%)                   445.71    (5.9%)    -1.6% ( -10% -    7%)
HighTermMonthSort                22.03   (10.5%)                    21.70   (10.3%)    -1.5% ( -20% -   21%)
MedPhrase                       228.73    (3.2%)                   225.69    (4.6%)    -1.3% (  -8% -    6%)
OrHighNotLow                    461.58    (4.8%)                   456.17    (6.9%)    -1.2% ( -12% -   11%)
HighTermTitleBDVSort             64.10   (10.5%)                    63.35    (9.7%)    -1.2% ( -19% -   21%)
LowPhrase                       134.07    (2.1%)                   132.72    (2.8%)    -1.0% (  -5% -    3%)
OrHighMed                        56.83    (2.9%)                    56.44    (3.7%)    -0.7% (  -7% -    6%)
Prefix3                          31.60    (2.1%)                    31.42    (2.4%)    -0.6% (  -4% -    4%)
AndHighMed                       61.02    (2.7%)                    60.72    (3.2%)    -0.5% (  -6% -    5%)
HighPhrase                       46.05    (3.2%)                    45.84    (4.7%)    -0.5% (  -8% -    7%)
LowSpanNear                      40.71    (2.4%)                    40.54    (2.2%)    -0.4% (  -4% -    4%)
Fuzzy2                           37.09    (7.5%)                    36.94    (6.0%)    -0.4% ( -12% -   14%)
Respell                          28.71    (1.5%)                    28.60    (2.0%)    -0.4% (  -3% -    3%)
OrNotHighMed                    375.44    (2.9%)                   374.10    (5.0%)    -0.4% (  -8% -    7%)
AndHighHigh                      32.52    (3.2%)                    32.46    (3.3%)    -0.2% (  -6% -    6%)
HighSpanNear                      9.05    (3.3%)                     9.03    (3.6%)    -0.2% (  -6% -    6%)
MedSloppyPhrase                  61.05    (1.8%)                    61.03    (2.5%)    -0.0% (  -4% -    4%)
BrowseDayOfYearSSDVFacets         2.76    (0.8%)                     2.75    (1.0%)    -0.0% (  -1% -    1%)
HighTerm                        917.59    (4.6%)                   917.50    (6.4%)    -0.0% ( -10% -   11%)
HighIntervalsOrdered              4.32    (2.5%)                     4.32    (2.8%)     0.0% (  -5% -    5%)
BrowseMonthSSDVFacets             2.94    (0.6%)                     2.95    (0.6%)     0.1% (  -1% -    1%)
OrHighHigh                        6.22    (3.0%)                     6.24    (3.3%)     0.2% (  -5% -    6%)
OrHighNotHigh                   425.45    (5.3%)                   426.34    (6.8%)     0.2% ( -11% -   13%)
LowSloppyPhrase                   7.30    (2.6%)                     7.32    (2.5%)     0.4% (  -4% -    5%)
HighSloppyPhrase                  7.52    (2.1%)                     7.55    (2.8%)     0.4% (  -4% -    5%)
MedSpanNear                      11.22    (3.1%)                    11.26    (3.0%)     0.4% (  -5% -    6%)
OrNotHighLow                    367.16    (3.6%)                   368.76    (4.8%)     0.4% (  -7% -    9%)
HighTermDayOfYearSort            35.05    (8.3%)                    35.47    (9.1%)     1.2% ( -14% -   20%)
BrowseMonthTaxoFacets             1.39    (2.6%)                     1.44    (7.3%)     4.0% (  -5% -   14%)
BrowseDateTaxoFacets              1.21    (2.4%)                     1.26    (6.7%)     4.2% (  -4% -   13%)
BrowseDayOfYearTaxoFacets         1.20    (2.8%)                     1.26    (6.9%)     4.7% (  -4% -   14%)
{code}

> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>         Attachments: LUCENE-9450-localrun.py-v1, wip_taxonomy_patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before
> Lucene added BinaryDocValues.
> I've attached a WIP patch (does not pass tests currently)
> Issue suggested by [~mikemccand]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)