[ 
https://issues.apache.org/jira/browse/LUCENE-9450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180860#comment-17180860
 ] 

Gautam Worah commented on LUCENE-9450:
--------------------------------------

Benchmark command with the same localrun.py script:
{code:java}
 python src/python/localrun.py -source wikimediumall -forceMerge=True
{code}
Instance details:

EC2 c4.8xlarge

CPU (lscpu):

{{Architecture: x86_64}}
 {{CPU op-mode(s): 32-bit, 64-bit}}
 {{Byte Order: Little Endian}}
 {{CPU(s): 36}}
 {{On-line CPU(s) list: 0-35}}
 {{Thread(s) per core: 2}}

Memory:

{{MemTotal: 61833408 kB}}
 {{MemFree: 1585956 kB}}
 {{MemAvailable: 53673248 kB}}

Java:

{{openjdk version "11.0.8" 2020-07-14 LTS}}
{{OpenJDK Runtime Environment (build 11.0.8+10-LTS)}}
{{OpenJDK 64-Bit Server VM (build 11.0.8+10-LTS, mixed mode)}}

===

Results:

{code:java}
                     Task  QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff
            OrNotHighHigh        532.96   (6.0%)                   518.91   (8.1%)     -2.6% ( -15% -   12%)
                   IntNRQ         63.90   (2.8%)                    62.51   (3.7%)     -2.2% (  -8% -    4%)
                  LowTerm       1354.18   (4.9%)                  1325.24   (7.1%)     -2.1% ( -13% -   10%)
                 PKLookup        130.95   (2.3%)                   128.48   (3.1%)     -1.9% (  -7% -    3%)
                   Fuzzy1         49.16   (6.3%)                    48.28   (9.8%)     -1.8% ( -16% -   15%)
             OrHighNotMed        518.20   (4.8%)                   509.19   (5.4%)     -1.7% ( -11% -    8%)
                 Wildcard         53.04   (2.1%)                    52.14   (3.6%)     -1.7% (  -7% -    4%)
                OrHighLow        241.89   (3.8%)                   237.82   (4.7%)     -1.7% (  -9% -    7%)
                  MedTerm       1002.39   (5.2%)                   985.94   (5.8%)     -1.6% ( -12% -    9%)
               AndHighLow        453.14   (3.3%)                   445.71   (5.9%)     -1.6% ( -10% -    7%)
        HighTermMonthSort         22.03  (10.5%)                    21.70  (10.3%)     -1.5% ( -20% -   21%)
                MedPhrase        228.73   (3.2%)                   225.69   (4.6%)     -1.3% (  -8% -    6%)
             OrHighNotLow        461.58   (4.8%)                   456.17   (6.9%)     -1.2% ( -12% -   11%)
     HighTermTitleBDVSort         64.10  (10.5%)                    63.35   (9.7%)     -1.2% ( -19% -   21%)
                LowPhrase        134.07   (2.1%)                   132.72   (2.8%)     -1.0% (  -5% -    3%)
                OrHighMed         56.83   (2.9%)                    56.44   (3.7%)     -0.7% (  -7% -    6%)
                  Prefix3         31.60   (2.1%)                    31.42   (2.4%)     -0.6% (  -4% -    4%)
               AndHighMed         61.02   (2.7%)                    60.72   (3.2%)     -0.5% (  -6% -    5%)
               HighPhrase         46.05   (3.2%)                    45.84   (4.7%)     -0.5% (  -8% -    7%)
              LowSpanNear         40.71   (2.4%)                    40.54   (2.2%)     -0.4% (  -4% -    4%)
                   Fuzzy2         37.09   (7.5%)                    36.94   (6.0%)     -0.4% ( -12% -   14%)
                  Respell         28.71   (1.5%)                    28.60   (2.0%)     -0.4% (  -3% -    3%)
             OrNotHighMed        375.44   (2.9%)                   374.10   (5.0%)     -0.4% (  -8% -    7%)
              AndHighHigh         32.52   (3.2%)                    32.46   (3.3%)     -0.2% (  -6% -    6%)
             HighSpanNear          9.05   (3.3%)                     9.03   (3.6%)     -0.2% (  -6% -    6%)
          MedSloppyPhrase         61.05   (1.8%)                    61.03   (2.5%)     -0.0% (  -4% -    4%)
BrowseDayOfYearSSDVFacets          2.76   (0.8%)                     2.75   (1.0%)     -0.0% (  -1% -    1%)
                 HighTerm        917.59   (4.6%)                   917.50   (6.4%)     -0.0% ( -10% -   11%)
     HighIntervalsOrdered          4.32   (2.5%)                     4.32   (2.8%)      0.0% (  -5% -    5%)
    BrowseMonthSSDVFacets          2.94   (0.6%)                     2.95   (0.6%)      0.1% (  -1% -    1%)
               OrHighHigh          6.22   (3.0%)                     6.24   (3.3%)      0.2% (  -5% -    6%)
            OrHighNotHigh        425.45   (5.3%)                   426.34   (6.8%)      0.2% ( -11% -   13%)
          LowSloppyPhrase          7.30   (2.6%)                     7.32   (2.5%)      0.4% (  -4% -    5%)
         HighSloppyPhrase          7.52   (2.1%)                     7.55   (2.8%)      0.4% (  -4% -    5%)
              MedSpanNear         11.22   (3.1%)                    11.26   (3.0%)      0.4% (  -5% -    6%)
             OrNotHighLow        367.16   (3.6%)                   368.76   (4.8%)      0.4% (  -7% -    9%)
    HighTermDayOfYearSort         35.05   (8.3%)                    35.47   (9.1%)      1.2% ( -14% -   20%)
    BrowseMonthTaxoFacets          1.39   (2.6%)                     1.44   (7.3%)      4.0% (  -5% -   14%)
     BrowseDateTaxoFacets          1.21   (2.4%)                     1.26   (6.7%)      4.2% (  -4% -   13%)
BrowseDayOfYearTaxoFacets          1.20   (2.8%)                     1.26   (6.9%)      4.7% (  -4% -   14%)

{code}
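
For anyone who wants to sanity-check the Pct diff column, here is a minimal Python sketch that recomputes it from the two mean QPS values. The names {{pct_diff}} and {{within_noise}} are my own illustration and not part of luceneutil, and the noise check is only a rough heuristic I am assuming here, not the method the benchmark uses for its printed ranges. Because the table shows rounded QPS, a recomputed value can differ from the printed one by a few tenths of a percent.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative change of the candidate vs. the baseline, in percent."""
    return 100.0 * (candidate_qps - baseline_qps) / baseline_qps


def within_noise(baseline_qps: float, baseline_sd_pct: float,
                 candidate_qps: float, candidate_sd_pct: float) -> bool:
    """Hypothetical heuristic: treat the change as noise when it is
    smaller than the larger of the two reported standard deviations."""
    change = abs(pct_diff(baseline_qps, candidate_qps))
    return change < max(baseline_sd_pct, candidate_sd_pct)


# (baseline QPS, baseline StdDev %, candidate QPS, candidate StdDev %),
# copied from three rows of the results table.
rows = {
    "OrNotHighHigh": (532.96, 6.0, 518.91, 8.1),
    "PKLookup": (130.95, 2.3, 128.48, 3.1),
    "BrowseDayOfYearTaxoFacets": (1.20, 2.8, 1.26, 6.9),
}

for task, (base, base_sd, cand, cand_sd) in rows.items():
    diff = pct_diff(base, cand)
    noisy = within_noise(base, base_sd, cand, cand_sd)
    print(f"{task:>25}  {diff:+5.1f}%  within noise: {noisy}")
```

The heuristic is deliberately conservative: it only says whether the mean shift is smaller than the run-to-run variation reported for either side.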
 
> Taxonomy index should use DocValues not StoredFields
> ----------------------------------------------------
>
>                 Key: LUCENE-9450
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9450
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>    Affects Versions: 8.5.2
>            Reporter: Gautam Worah
>            Priority: Minor
>              Labels: performance
>         Attachments: LUCENE-9450-localrun.py-v1, wip_taxonomy_patch
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The taxonomy index that maps binning labels to ordinals was created before 
> Lucene added BinaryDocValues.
> I've attached a WIP patch (it does not currently pass tests).
> Issue suggested by [~mikemccand].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
