[ https://issues.apache.org/jira/browse/LUCENE-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468009#comment-17468009 ]

ASF subversion and git services commented on LUCENE-10346:
----------------------------------------------------------

Commit 01f5e7bb7b4f5efb5330ab97896008c83daef657 in lucene's branch 
refs/heads/branch_9x from gf2121
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=01f5e7b ]

LUCENE-10346: Specially treat SingletonSortedNumericDocValues in 
FastTaxonomyFacetCounts#countAll() (#574)


> Specially treat SingletonSortedNumericDocValues in 
> FastTaxonomyFacetCounts#countAll()
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-10346
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10346
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Feng Guo
>            Priority: Minor
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> CPU profiles often show {{SingletonSortedNumericDocValues#nextDoc()}} using 
> a high percentage of CPU when running luceneutil, even though {{nextDoc()}} 
> should be rather simple in the dense case. I suspect that the many layers 
> of abstraction (and wrapping) put stress on the JVM. Unwrapping it to 
> {{NumericDocValues}} shows a speedup of around 30%.
> {code:java}
>                             Task   QPS baseline      StdDev   QPS my_modified_version      StdDev                Pct diff p-value
>             HighTermTitleBDVSort      132.24     (20.6%)      125.67      (9.9%)   -5.0% ( -29% -   32%) 0.330
>                          LowTerm     1424.13      (3.2%)     1381.34      (4.4%)   -3.0% ( -10% -    4%) 0.014
>                    OrHighNotHigh      707.82      (3.3%)      687.49      (6.0%)   -2.9% ( -11% -    6%) 0.062
>                       TermDTSort      155.32     (10.9%)      151.02     (10.2%)   -2.8% ( -21% -   20%) 0.406
>                     OrNotHighMed      618.46      (3.7%)      602.65      (4.4%)   -2.6% ( -10% -    5%) 0.047
>                           Fuzzy1       76.22      (5.3%)       74.71      (6.6%)   -2.0% ( -13% -   10%) 0.293
>                HighTermMonthSort      174.89     (10.4%)      171.45     (10.6%)   -2.0% ( -20% -   21%) 0.554
>                     OrHighNotMed      776.08      (4.9%)      761.70      (7.8%)   -1.9% ( -13% -   11%) 0.367
>            HighTermDayOfYearSort       56.23     (10.7%)       55.26     (10.9%)   -1.7% ( -21% -   22%) 0.615
>                          MedTerm     1449.48      (3.7%)     1425.87      (5.1%)   -1.6% ( -10% -    7%) 0.250
>                    OrNotHighHigh      687.92      (4.9%)      677.06      (5.5%)   -1.6% ( -11% -    9%) 0.339
>                     OrHighNotLow      742.99      (4.7%)      732.23      (5.9%)   -1.4% ( -11% -    9%) 0.390
>                     OrNotHighLow      789.37      (2.7%)      778.80      (4.7%)   -1.3% (  -8% -    6%) 0.270
>                       HighPhrase       75.84      (2.2%)       75.14      (3.0%)   -0.9% (  -6% -    4%) 0.269
>                 HighSloppyPhrase       20.71      (5.9%)       20.56      (5.2%)   -0.7% ( -11% -   11%) 0.678
>                           IntNRQ      106.38     (18.4%)      105.67     (18.2%)   -0.7% ( -31% -   44%) 0.908
>                        OrHighMed       45.10      (1.5%)       44.83      (1.8%)   -0.6% (  -3% -    2%) 0.261
>                      MedSpanNear      192.49      (2.5%)      191.51      (3.5%)   -0.5% (  -6% -    5%) 0.593
>                        OrHighLow      489.82      (5.5%)      487.79      (5.7%)   -0.4% ( -11% -   11%) 0.815
>                  MedSloppyPhrase       27.33      (2.9%)       27.22      (2.3%)   -0.4% (  -5% -    5%) 0.623
>                        MedPhrase      208.94      (2.9%)      208.09      (3.7%)   -0.4% (  -6% -    6%) 0.696
>                          Respell       71.84      (2.4%)       71.55      (2.4%)   -0.4% (  -5% -    4%) 0.600
>                       OrHighHigh       36.26      (1.3%)       36.13      (1.1%)   -0.4% (  -2% -    2%) 0.344
>            BrowseMonthSSDVFacets       15.95      (2.7%)       15.90      (2.5%)   -0.4% (  -5% -    5%) 0.672
>                       AndHighMed       85.83      (2.2%)       85.53      (2.7%)   -0.3% (  -5% -    4%) 0.658
>                          Prefix3      123.15      (2.6%)      122.74      (2.5%)   -0.3% (  -5% -    4%) 0.678
>                           Fuzzy2       76.41      (4.7%)       76.23      (4.2%)   -0.2% (  -8% -    9%) 0.867
>        BrowseDayOfYearSSDVFacets       14.52      (2.4%)       14.49      (2.2%)   -0.2% (  -4% -    4%) 0.747
>              MedIntervalsOrdered       56.39      (4.2%)       56.27      (4.1%)   -0.2% (  -8% -    8%) 0.871
>             HighIntervalsOrdered        9.29      (4.7%)        9.27      (4.4%)   -0.2% (  -8% -    9%) 0.896
>          AndHighMedDayTaxoFacets      119.76      (2.5%)      119.53      (2.9%)   -0.2% (  -5% -    5%) 0.831
>                     HighSpanNear       20.89      (2.0%)       20.85      (2.3%)   -0.2% (  -4% -    4%) 0.803
>              LowIntervalsOrdered       45.51      (4.9%)       45.47      (4.8%)   -0.1% (  -9% -   10%) 0.952
>                        LowPhrase       64.17      (2.6%)       64.14      (2.6%)   -0.1% (  -5% -    5%) 0.951
>                      LowSpanNear      104.45      (2.2%)      104.41      (1.9%)   -0.0% (  -4% -    4%) 0.959
>                         Wildcard      103.83      (2.8%)      103.80      (2.8%)   -0.0% (  -5% -    5%) 0.970
>                      AndHighHigh       42.33      (2.6%)       42.33      (2.4%)   -0.0% (  -4% -    5%) 0.991
>      BrowseRandomLabelSSDVFacets       10.62      (2.5%)       10.62      (1.8%)    0.0% (  -4% -    4%) 0.981
>         AndHighHighDayTaxoFacets       29.75      (2.3%)       29.76      (2.7%)    0.1% (  -4% -    5%) 0.949
>             MedTermDayTaxoFacets       26.56      (3.0%)       26.58      (2.5%)    0.1% (  -5% -    5%) 0.945
>                       AndHighLow     1012.26      (4.5%)     1013.62      (4.3%)    0.1% (  -8% -    9%) 0.923
>                  LowSloppyPhrase       78.82      (6.8%)       79.03      (6.0%)    0.3% ( -11% -   14%) 0.897
>                         PKLookup      204.09      (3.0%)      204.82      (2.9%)    0.4% (  -5% -    6%) 0.703
>           OrHighMedDayTaxoFacets       14.53      (3.4%)       14.59      (2.7%)    0.4% (  -5% -    6%) 0.694
>                         HighTerm     1607.26      (5.2%)     1623.99      (5.6%)    1.0% (  -9% -   12%) 0.543
>      BrowseRandomLabelTaxoFacets       11.93      (6.9%)       15.52      (2.5%)   30.1% (  19% -   42%) 0.000
>             BrowseDateTaxoFacets       13.46      (9.0%)       18.28      (3.6%)   35.8% (  21% -   53%) 0.000
>        BrowseDayOfYearTaxoFacets       13.59      (9.1%)       18.53      (3.6%)   36.3% (  21% -   53%) 0.000
>            BrowseMonthTaxoFacets       13.93     (10.9%)       19.70     (14.9%)   41.4% (  14% -   75%) 0.000
> {code}
> *Baseline*
> {code:java}
> PERCENT       CPU SAMPLES   STACK
> 3.85%         12316         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
> 3.78%         12076         org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()
> 3.72%         11905         org.apache.lucene.index.SingletonSortedNumericDocValues#nextDoc()
> 2.88%         9199          org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
> 2.31%         7380          org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
> 2.27%         7270          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()
> 2.25%         7211          org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#increment()
> 2.23%         7139          org.apache.lucene.index.SingletonSortedNumericDocValues#nextValue()
> 1.88%         6006          java.nio.Buffer#checkIndex()
> 1.86%         5965          jdk.internal.misc.Unsafe#convEndian()
> 1.85%         5916          org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()
> 1.72%         5491          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#nextPosition()
> 1.49%         4780          java.nio.DirectByteBuffer#ix()
> 1.42%         4548          java.nio.Buffer#scope()
> 1.40%         4465          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$4#longValue()
> 1.39%         4434          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#advance()
> 1.33%         4254          org.apache.lucene.store.ByteBufferGuard#ensureValid()
> 1.32%         4219          org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
> 1.28%         4109          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#nextDoc()
> 1.28%         4089          jdk.internal.misc.ScopedMemoryAccess#getByteInternal()
> 1.16%         3709          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#advance()
> 1.10%         3517          org.apache.lucene.store.ByteBufferGuard#getInt()
> 1.07%         3427          org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
> 0.98%         3149          org.apache.lucene.search.ConjunctionDISI#doNext()
> 0.98%         3120          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
> 0.93%         2969          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$3#longValue()
> 0.92%         2927          org.apache.lucene.store.ByteBufferGuard#getByte()
> 0.88%         2828          com.carrotsearch.hppc.IntIntHashMap#indexOf()
> 0.82%         2635          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsDocsEnum#advance()
> 0.82%         2633          org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
> {code}
> *Candidate*
> {code:java}
> PERCENT       CPU SAMPLES   STACK
> 4.15%         12823         org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()
> 3.94%         12186         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
> 3.32%         10266         org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
> 2.98%         9208          org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
> 2.38%         7351          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()
> 2.07%         6386          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$DenseNumericDocValues#nextDoc()
> 1.85%         5723          org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#increment()
> 1.81%         5600          jdk.internal.misc.Unsafe#convEndian()
> 1.81%         5588          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#nextPosition()
> 1.75%         5409          java.nio.Buffer#checkIndex()
> 1.72%         5310          org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()
> 1.50%         4631          java.nio.Buffer#scope()
> 1.44%         4437          jdk.internal.misc.ScopedMemoryAccess#getByteInternal()
> 1.43%         4408          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#advance()
> 1.39%         4297          java.nio.DirectByteBuffer#ix()
> 1.39%         4280          org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
> 1.33%         4111          org.apache.lucene.store.ByteBufferGuard#ensureValid()
> 1.31%         4052          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#nextDoc()
> 1.29%         3974          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$4#longValue()
> 1.22%         3761          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#advance()
> 1.13%         3502          org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
> 1.04%         3219          org.apache.lucene.search.ConjunctionDISI#doNext()
> 1.00%         3099          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
> 0.99%         3067          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$3#longValue()
> 0.99%         3052          org.apache.lucene.store.ByteBufferGuard#getInt()
> 0.89%         2762          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsDocsEnum#advance()
> 0.87%         2690          org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
> 0.86%         2663          org.apache.lucene.store.ByteBufferGuard#getByte()
> 0.80%         2476          org.apache.lucene.codecs.lucene90.ForUtil#expand8()
> 0.78%         2420          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#skipPositions()
> {code}
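
The unwrapping idea in the description can be illustrated with a minimal, self-contained sketch. It is not the code from the commit: {{NumericValues}}, {{MultiValues}}, {{SingletonMultiValues}} and {{dense()}} are hypothetical stand-ins for Lucene's {{NumericDocValues}}, {{SortedNumericDocValues}} and {{SingletonSortedNumericDocValues}}, so that the example runs without Lucene on the classpath. In Lucene itself the unwrap step is available as {{DocValues.unwrapSingleton(SortedNumericDocValues)}}, which returns null when the values are not a singleton wrapper.

```java
import java.util.Arrays;

public class SingletonUnwrapSketch {

  /** Hypothetical stand-in for a single-valued doc-values iterator. */
  interface NumericValues {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int nextDoc();
    long longValue();
  }

  /** Hypothetical stand-in for a multi-valued doc-values iterator. */
  interface MultiValues {
    int nextDoc();
    int docValueCount();
    long nextValue();
  }

  /** Wrapper that exposes single-valued data through the multi-valued API. */
  static final class SingletonMultiValues implements MultiValues {
    private final NumericValues in;
    SingletonMultiValues(NumericValues in) { this.in = in; }
    NumericValues unwrap() { return in; }
    @Override public int nextDoc() { return in.nextDoc(); }
    @Override public int docValueCount() { return 1; }
    @Override public long nextValue() { return in.longValue(); }
  }

  /** Dense single-valued source for the demo: document i holds ords[i]. */
  static NumericValues dense(int[] ords) {
    return new NumericValues() {
      private int doc = -1;
      @Override public int nextDoc() {
        if (doc != NO_MORE_DOCS) {
          doc = (doc + 1 < ords.length) ? doc + 1 : NO_MORE_DOCS;
        }
        return doc;
      }
      @Override public long longValue() { return ords[doc]; }
    };
  }

  /** Returns the wrapped single-valued iterator, or null if there is none. */
  static NumericValues unwrapSingleton(MultiValues values) {
    return values instanceof SingletonMultiValues
        ? ((SingletonMultiValues) values).unwrap()
        : null;
  }

  /** Counts every ordinal, taking the unwrapped path when it is available. */
  static int[] countAll(MultiValues values, int ordCount) {
    int[] counts = new int[ordCount];
    NumericValues single = unwrapSingleton(values);
    if (single != null) {
      // Fast path: one value per document, no wrapper calls in the hot loop.
      while (single.nextDoc() != NumericValues.NO_MORE_DOCS) {
        counts[(int) single.longValue()]++;
      }
    } else {
      // Generic path: a document may carry several values.
      while (values.nextDoc() != NumericValues.NO_MORE_DOCS) {
        int n = values.docValueCount();
        for (int i = 0; i < n; i++) {
          counts[(int) values.nextValue()]++;
        }
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    int[] ords = {2, 0, 2, 1, 2};
    int[] counts = countAll(new SingletonMultiValues(dense(ords)), 3);
    System.out.println(Arrays.toString(counts)); // prints [1, 1, 3]
  }
}
```

Both branches produce the same counts. The difference is only in how many virtual calls sit between the counting loop and the underlying reader, which is the overhead the description suspects. Whether this toy version shows any measurable difference depends on the JIT and is not claimed here.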



