[ 
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865236#comment-17865236
 ] 

Chris M. Hostetter commented on SOLR-10654:
-------------------------------------------

{quote}These tests must be failing depending on environment or type of JVM being 
used as I am unable to reproduce this on my end. Do you have more information 
on how the Jenkins builds are being run?
{quote}
I don't, but even if I (or you) did, "fixing" the test to pass on Jenkins 
shouldn't be the goal: you'd still be leaving a time bomb for some future dev 
whose JVM doesn't exactly match yours.

 

The bottom line is that every test we have should be able to pass on *ANY* 
valid JVM. Tests that are known to be problematic on _specific_ JVM 
implementations (like IBM's OpenJ9) should have specific {{assume()}} calls to 
prevent them from running there.
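
As a rough sketch of what such a guard could look like: the helper below decides from the {{java.vm.name}} system property whether the current JVM looks like OpenJ9. The class and method names are hypothetical, and the substrings matched ("openj9", "ibm j9") are assumptions about how those JVMs identify themselves. In a real JUnit 4 test the boolean would be handed to {{org.junit.Assume.assumeFalse(String, boolean)}} rather than printed.

```java
import java.util.Locale;

public class JvmGuard {
    /**
     * Returns true if the given java.vm.name value looks like an IBM J9 or
     * Eclipse OpenJ9 VM. The matched substrings are an assumption, not an
     * exhaustive list.
     */
    public static boolean looksLikeOpenJ9(String vmName) {
        if (vmName == null) {
            return false;
        }
        String name = vmName.toLowerCase(Locale.ROOT);
        return name.contains("openj9") || name.contains("ibm j9");
    }

    public static void main(String[] args) {
        String vmName = System.getProperty("java.vm.name");
        boolean skip = looksLikeOpenJ9(vmName);
        // In a JUnit 4 test this decision would instead be expressed as:
        //   org.junit.Assume.assumeFalse("known to be problematic on OpenJ9", skip);
        System.out.println(vmName + " -> " + (skip ? "skip test" : "run test"));
    }
}
```

The point of the guard is that the test is reported as skipped on the problematic JVM, while still running (and still having to pass) everywhere else.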

If test assertions are based on what metrics are available, they should _only_ 
look at metrics that are guaranteed to be available on every JVM, and not make 
assumptions about what "extra" metrics may or may not exist on _some_ JVMs (and 
may or may not have values that differ from what you see on *your* JVM).
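
One way to express this kind of assertion, sketched under assumptions: instead of comparing the whole response against a hardcoded string, collect the metric family names from the {{# TYPE}} lines and only check that a required set is present, ignoring anything extra. The class and method names are hypothetical, the choice of "required" families is only an illustration (which families are truly guaranteed on every JVM would need to be verified), and the sample values are placeholders.

```java
import java.util.Set;
import java.util.TreeSet;

public class RequiredMetricsCheck {
    /** Collects metric family names from "# TYPE <name> <type>" lines. */
    public static Set<String> familyNames(String prometheusText) {
        Set<String> names = new TreeSet<>();
        for (String line : prometheusText.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 3 && parts[0].equals("#") && parts[1].equals("TYPE")) {
                names.add(parts[2]);
            }
        }
        return names;
    }

    /** Returns the required families that are absent; an empty set means the check passes. */
    public static Set<String> missing(Set<String> required, String prometheusText) {
        Set<String> missing = new TreeSet<>(required);
        missing.removeAll(familyNames(prometheusText));
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder output: includes jetty families that only exist in some runs.
        String output = String.join("\n",
            "# TYPE solr_metrics_jetty_dispatches_total counter",
            "solr_metrics_jetty_dispatches_total 0.0",
            "# TYPE solr_metrics_jvm_buffers gauge",
            "solr_metrics_jvm_buffers{item=\"Count\",pool=\"direct\"} 1.0",
            "# TYPE solr_metrics_jvm_heap gauge",
            "solr_metrics_jvm_heap{item=\"used\",memory=\"heap\"} 1.0");
        // Illustrative "required" set only; not a verified list of guaranteed families.
        Set<String> required = Set.of("solr_metrics_jvm_buffers", "solr_metrics_jvm_heap");
        System.out.println("missing: " + missing(required, output)); // prints "missing: []"
    }
}
```

Because extra families are ignored, metrics that appear only on some JVMs, or only after other tests have run in the same JVM, cannot make this kind of check fail.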

 
----
 

In the case of {{TestPrometheusResponseWriter}}, it's pretty easy to reproduce 
the problem on _any_ JVM by running multiple "jetty" based tests in the same 
test JVM. The jetty dispatches/requests metrics that aren't accounted for in 
your hardcoded "expected" value seem to get created on the fly once there are 
some requests, and they seem to stick around beyond the lifecycle of the tests 
(I suspect they are gauges, which are known to have problematic lifecycles, 
but I haven't checked) ...

 
{noformat}
$ ./gradlew test -Ptests.jvms=1 -Ptests.verbose=true --tests TestPrometheusResponseWriter --tests \*Jetty\*
...
org.apache.solr.response.TestPrometheusResponseWriter > testPrometheusOutput FAILED
    org.junit.ComparisonFailure: expected:<... TYPE solr_metrics_j[vm_buffers gauge
    solr_metrics_jvm_buffers{item="Count",pool="direct"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_buffers_bytes gauge
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped - 'non-volatile memory'"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_gc gauge
    solr_metrics_jvm_gc{}
    solr_metrics_jvm_gc{}
    # TYPE solr_metrics_jvm_gc_seconds gauge
    solr_metrics_jvm_gc_seconds{}
    solr_metrics_jvm_gc_seconds{}
    # TYPE solr_metrics_jvm_heap gauge
    solr_metrics_jvm_heap{item="committed",memory="heap"}
    solr_metrics_jvm_heap{item="committed",memory="non-heap"}
    solr_metrics_jvm_heap{item="committed",memory="total"}
    solr_metrics_jvm_heap{item="init",memory="heap"}
    solr_metrics_jvm_heap{item="init",memory="non-heap"}
    solr_metrics_jvm_heap{item="init",memory="total"}
    solr_metrics_jvm_heap{item="max",memory="heap"}
    solr_metrics_jvm_heap{item="max",memory="non-heap"}
    solr_metrics_jvm_heap{item="max",memory="total"}
    solr_metrics_jvm_heap{item="usage",memory="heap"}
    solr_metrics_jvm_heap{item="usage",memory="non-heap"}
    solr_metrics_jvm_heap{item="used",memory="heap"}
    solr_metrics_jvm_heap{item="used",memory="non-heap"}
    solr_metrics_jvm_heap{item="used",memory="total"}
    # TYPE solr_metrics_jvm_memory_pools_bytes gauge
    solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeCache]"}
    solr_metrics_jvm_...> but was:<... TYPE solr_metrics_j[etty_dispatches_total counter
    solr_metrics_jetty_dispatches_total 0.0
    # TYPE solr_metrics_jetty_requests_total counter
    solr_metrics_jetty_requests_total{method="active"}
    # TYPE solr_metrics_jetty_response_total counter
    solr_metrics_jetty_response_total{status="2xx"}
    # TYPE solr_metrics_jvm_buffers gauge
    solr_metrics_jvm_buffers{item="Count",pool="direct"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped"}
    solr_metrics_jvm_buffers{item="Count",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_buffers_bytes gauge
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped - 'non-volatile memory'"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="direct"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped"}
    solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped - 'non-volatile memory'"}
    # TYPE solr_metrics_jvm_gc gauge
    solr_metrics_jvm_gc{}
    solr_metrics_jvm_gc{}
    solr_metrics_jvm_gc{}
    # TYPE solr_metrics_jvm_gc_seconds gauge
    solr_metrics_jvm_gc_seconds{}
    solr_metrics_jvm_gc_seconds{}
    solr_metrics_jvm_gc_seconds{}
    # TYPE solr_metrics_jvm_heap gauge
    solr_metrics_jvm_heap{item="committed",memory="heap"}
    solr_metrics_jvm_heap{item="committed",memory="non-heap"}
    solr_metrics_jvm_heap{item="committed",memory="total"}
    solr_metrics_jvm_heap{item="init",memory="heap"}
    solr_metrics_jvm_heap{item="init",memory="non-heap"}
    solr_metrics_jvm_heap{item="init",memory="total"}
    solr_metrics_jvm_heap{item="max",memory="heap"}
    solr_metrics_jvm_heap{item="max",memory="non-heap"}
    solr_metrics_jvm_heap{item="max",memory="total"}
    solr_metrics_jvm_heap{item="usage",memory="heap"}
    solr_metrics_jvm_heap{item="usage",memory="non-heap"}
    solr_metrics_jvm_heap{item="used",memory="heap"}
    solr_metrics_jvm_heap{item="used",memory="non-heap"}
    solr_metrics_jvm_heap{item="used",memory="total"}
    # TYPE solr_metrics_jvm_memory_pools_bytes gauge
    solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeCache"}
    solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeHeap-'non-nmethods']"}
    solr_metrics_jvm_...>
        at __randomizedtesting.SeedInfo.seed([CB2ACA8B48AFD85A:4A3AA7A2310A37AC]:0)
        at org.junit.Assert.assertEquals(Assert.java:117)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.solr.response.TestPrometheusResponseWriter.testPrometheusOutput(TestPrometheusResponseWriter.java:86)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
        at java.base/java.lang.Thread.run(Thread.java:829)
  2> NOTE: reproduce with: gradlew test --tests TestPrometheusResponseWriter.testPrometheusOutput -Dtests.seed=CB2ACA8B48AFD85A -Dtests.locale=sw-UG -Dtests.timezone=America/Barbados -Dtests.asserts=true -Dtests.file.encoding=UTF-8
  2> 17677 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.a.s.u.ErrorLogMuter Closing ErrorLogMuter-regex-5 after mutting 0 log messages
  2> 17677 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.a.s.u.ErrorLogMuter Creating ErrorLogMuter-regex-6 for ERROR logs matching regex: ignore_exception
  2> 17679 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.e.j.s.Server Stopped Server@6a2e51da{STOPPING}[10.0.20,sto=0]
  2> 17679 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.e.j.s.AbstractConnector Stopped ServerConnector@574f4b87{HTTP/1.1, (http/1.1, h2c)}{127.0.0.1:0}
  2> 17683 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.a.s.c.CoreContainer Shutting down CoreContainer instance=2035622883
  2> 17683 INFO  (coreCloseExecutor-139-thread-1) [n: c: s: r: x:collection1 t:] o.a.s.c.SolrCore CLOSING SolrCore org.apache.solr.core.SolrCore@7125b2ee collection1
  2> 17684 INFO  (coreCloseExecutor-139-thread-1) [n: c: s: r: x:collection1 t:] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.core.collection1 tag=SolrCore@7125b2ee
  2> 17706 INFO  (coreCloseExecutor-139-thread-1) [n: c: s: r: x:collection1 t:] o.a.s.u.DirectUpdateHandler2 Committing on IndexWriter.close()  ... SKIPPED (unnecessary).
  2> 17708 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.node tag=null
  2> 17718 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.jvm tag=null
  2> 17724 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.jetty tag=null
  2> 17725 INFO  (SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s: r: x: t:] o.e.j.s.h.ContextHandler Stopped o.e.j.s.ServletContextHandler@6c9140ae{/solr,file:///home/hossman/lucene/solr/solr/core/build/tmp/tests-cwd/,STOPPED}
  2> NOTE: leaving temporary files on disk at: /home/hossman/lucene/solr/solr/core/build/tmp/tests-tmp/org.apache.solr.response.TestPrometheusResponseWriter_CB2ACA8B48AFD85A-001
  2> NOTE: test params are: codec=Asserting(Lucene99): {}, docValues:{}, maxPointsInLeafNode=564, maxMBSortInHeap=7.22533435015882, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=sw-UG, timezone=America/Barbados
  2> NOTE: Linux 5.4.0-150-generic amd64/Eclipse Adoptium 11.0.15 (64-bit)/cpus=1,threads=1,free=197371352,total=324009984
  2> NOTE: All tests run in this JVM: [TestWaitForStateWithJettyShutdowns, MetricsHandlerTest, TestPrometheusResponseWriter]
{noformat}
 

> Expose Metrics in Prometheus format DIRECTLY from Solr
> ------------------------------------------------------
>
>                 Key: SOLR-10654
>                 URL: https://issues.apache.org/jira/browse/SOLR-10654
>             Project: Solr
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Keith Laban
>            Priority: Major
>         Attachments: prometheus_metrics.txt
>
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Expose metrics via a `wt=prometheus` response type.
> Example scape_config in prometheus.yml:
> {code:java}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code}
> [Rationale|https://issues.apache.org/jira/browse/SOLR-11795?focusedCommentId=17261423&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17261423]
>  for having this despite the "Prometheus Exporter".  They have different 
> strengths and weaknesses.


