[ 
https://issues.apache.org/jira/browse/SOLR-16273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman updated SOLR-16273:
----------------------------------
    Fix Version/s: 9.3
                       (was: 9.4)
                       (was: 9.3.1)

> Prometheus Metric Exporter is very slow when collecting large amounts of 
> sample data
> ------------------------------------------------------------------------------------
>
>                 Key: SOLR-16273
>                 URL: https://issues.apache.org/jira/browse/SOLR-16273
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - prometheus-exporter
>    Affects Versions: 8.6.3, 9.0
>            Reporter: Fa Ming
>            Priority: Critical
>             Fix For: 9.3
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> I have a Solr cluster with 300 collections and use the Prometheus Metric 
> Exporter to collect cluster information, but it takes 2 minutes to get the 
> data each time. The `jstack` output is as follows:
> {code:}
> "solr-exporter-collectors-1-thread-2" #21 prio=5 os_prio=0 tid=0x00007fcef8009000 nid=0x45208 runnable [0x00007fcf16470000]
>    java.lang.Thread.State: RUNNABLE
>     at io.prometheus.client.Collector$MetricFamilySamples$Sample.equals(Collector.java:95)
>     at java.util.ArrayList.indexOf(ArrayList.java:323)
>     at java.util.ArrayList.contains(ArrayList.java:306)
>     at org.apache.solr.prometheus.collector.MetricSamples.addSampleIfMetricExists(MetricSamples.java:50)
>     at org.apache.solr.prometheus.collector.MetricSamples.addAll(MetricSamples.java:60)
>     at org.apache.solr.prometheus.collector.MetricsCollector.lambda$collect$0(MetricsCollector.java:38)
>     at org.apache.solr.prometheus.collector.MetricsCollector$$Lambda$127/68757342.accept(Unknown Source)
>     at java.util.HashMap.forEach(HashMap.java:1291)
>     at org.apache.solr.prometheus.collector.MetricsCollector.collect(MetricsCollector.java:38)
>     at org.apache.solr.prometheus.collector.SchedulerMetricsCollector.lambda$collectMetrics$0(SchedulerMetricsCollector.java:91)
>     at org.apache.solr.prometheus.collector.SchedulerMetricsCollector$$Lambda$75/817493591.get(Unknown Source)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$39/351002168.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> {code}
>  
> {*}The "contains" method takes 90% of the execution time{*}.
>  
> Looking at the MetricSamples.java code, each "sample" is checked for 
> duplicates before it is added to "sampleFamily.samples". Because 
> ArrayList.contains scans the list linearly, "sampleFamily.samples.contains" 
> becomes very inefficient once "sampleFamily.samples" reaches 20,000 entries.
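
To illustrate the cost described above: deduplicating through ArrayList.contains compares each new sample against every sample already stored, so a family of 20,000 distinct samples needs on the order of 200 million equals calls. The sketch below contrasts that pattern with a hash-based check. It is only an illustration and not necessarily the change made for this issue; `DedupSketch`, `Sample`, `dedupWithList`, and `dedupWithSet` are hypothetical names, and `Sample` is a simplified stand-in for the Prometheus client's sample class.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

final class DedupSketch {

    // Hypothetical, simplified stand-in for a Prometheus sample.
    static final class Sample {
        final String name;
        final List<String> labelValues;
        final double value;

        Sample(String name, List<String> labelValues, double value) {
            this.name = name;
            this.labelValues = labelValues;
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Sample)) return false;
            Sample other = (Sample) o;
            return name.equals(other.name)
                    && labelValues.equals(other.labelValues)
                    && Double.compare(value, other.value) == 0;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, labelValues, value);
        }
    }

    // Pattern seen in the stack trace: contains() is a linear scan,
    // so the whole loop is quadratic in the number of samples.
    static List<Sample> dedupWithList(List<Sample> incoming) {
        List<Sample> samples = new ArrayList<>();
        for (Sample s : incoming) {
            if (!samples.contains(s)) {
                samples.add(s);
            }
        }
        return samples;
    }

    // Hash-based alternative: membership checks take expected constant time.
    // LinkedHashSet keeps first-seen order, matching the list version.
    static List<Sample> dedupWithSet(List<Sample> incoming) {
        Set<Sample> seen = new LinkedHashSet<>(incoming);
        return new ArrayList<>(seen);
    }
}
```

Both methods return the same samples in the same order; only the cost of the duplicate check differs. The hash-based version relies on `equals` and `hashCode` being consistent, which is assumed here for the stand-in class.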



