[
https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865236#comment-17865236
]
Chris M. Hostetter commented on SOLR-10654:
-------------------------------------------
{quote}These tests must be failing depending on environment or type of JVM being
used as I am unable to reproduce this on my end. Do you have more information
on how the Jenkins builds are being run?
{quote}
I don't, but even if I (or you) did, "fixing" the test to pass on Jenkins
shouldn't be the goal – you'd still be leaving a time bomb for some future dev
whose JVM doesn't exactly match yours...
The bottom line is that every test we have should be able to pass on *ANY* valid
JVM – tests that are known to be problematic on _specific_ JVM implementations
(like IBM's OpenJ9) should have specific {{assume()}} calls to prevent them
from running there.
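As a rough illustration of that kind of guard, here is a minimal sketch that detects OpenJ9 from the {{java.vm.name}} system property. The class and method names are hypothetical, and the assumption that OpenJ9 reports "OpenJ9" in that property is mine rather than something confirmed in this issue. In a JUnit 4 test the resulting boolean could be passed to {{Assume.assumeFalse(String, boolean)}}.

```java
import java.util.Locale;

// Hypothetical helper, not part of Solr: decides whether a test should be skipped
// because the current JVM is one it is known to be problematic on.
public class JvmAssumptions {

  /** Returns true if the given java.vm.name value looks like an OpenJ9 VM (assumed naming). */
  public static boolean isOpenJ9(String vmName) {
    return vmName != null && vmName.toLowerCase(Locale.ROOT).contains("openj9");
  }

  /** Applies the check to the JVM that is currently running. */
  public static boolean runningOnOpenJ9() {
    return isOpenJ9(System.getProperty("java.vm.name"));
  }

  public static void main(String[] args) {
    // In a JUnit 4 test this would typically be the first statement of the test:
    //   org.junit.Assume.assumeFalse("known to be problematic on OpenJ9",
    //                                JvmAssumptions.runningOnOpenJ9());
    System.out.println("Running on OpenJ9? " + runningOnOpenJ9());
  }
}
```

Keeping the detection in a plain boolean method means the skip condition is stated once and can be reused by every test that needs it.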
If test assertions are based on what metrics are available, they should _only_
look at metrics that are guaranteed to be available on every JVM, and not make
assumptions about what "extra" metrics may or may not exist on _some_ JVMs (and
may or may not have values that differ from what you see on *your* JVM).
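One way to read that suggestion, sketched below under my own assumptions: instead of comparing the whole response against a hardcoded string, the test checks that a short list of required metric families is declared, and ignores anything extra. The class name, method names, and the choice of which families count as "required" are all hypothetical; the only format assumption is that families are declared on {{# TYPE <name> <kind>}} lines, as in the output quoted in this comment.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Solr: checks that required metric families
// are declared in Prometheus text output, ignoring any additional families.
public class RequiredMetricFamilies {

  /** Collects the family names declared on "# TYPE <name> <kind>" lines. */
  public static Set<String> familyNames(String prometheusText) {
    Set<String> names = new LinkedHashSet<>();
    for (String line : prometheusText.split("\n")) {
      String trimmed = line.trim();
      if (trimmed.startsWith("# TYPE ")) {
        String[] parts = trimmed.split("\\s+"); // "#", "TYPE", name, kind
        if (parts.length >= 3) {
          names.add(parts[2]);
        }
      }
    }
    return names;
  }

  /** Returns the required families that are absent from the output; empty means all present. */
  public static Set<String> missing(String prometheusText, List<String> required) {
    Set<String> result = new LinkedHashSet<>(required);
    result.removeAll(familyNames(prometheusText));
    return result;
  }

  public static void main(String[] args) {
    String output =
        "# TYPE solr_metrics_jetty_dispatches_total counter\n"
            + "solr_metrics_jetty_dispatches_total 0.0\n"
            + "# TYPE solr_metrics_jvm_heap gauge\n"
            + "solr_metrics_jvm_heap{item=\"used\",memory=\"heap\"} 1.0\n";
    // The extra jetty family does not affect the result; only absences are reported.
    System.out.println("missing: " + missing(output, List.of("solr_metrics_jvm_heap")));
  }
}
```

A test built this way would assert that {{missing(...)}} is empty, so families that appear only on some JVMs, or only after other tests have run, cannot break it.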
----
In the case of {{TestPrometheusResponseWriter}} – it's pretty easy to reproduce
the problem on _any_ JVM by running multiple Jetty-based tests in the same
test JVM. The Jetty dispatches/requests metrics that aren't accounted for in
your hardcoded "expected" value seem to get created on the fly once there are
some requests, and they seem to stick around beyond the lifecycle of the tests
(I suspect they are gauges, which are known to have problematic lifecycles,
but I haven't checked) ...
{noformat}
$ ./gradlew test -Ptests.jvms=1 -Ptests.verbose=true --tests TestPrometheusResponseWriter --tests \*Jetty\*
...
org.apache.solr.response.TestPrometheusResponseWriter > testPrometheusOutput
FAILED
org.junit.ComparisonFailure: expected:<... TYPE solr_metrics_j[vm_buffers
gauge
solr_metrics_jvm_buffers{item="Count",pool="direct"}
solr_metrics_jvm_buffers{item="Count",pool="mapped"}
solr_metrics_jvm_buffers{item="Count",pool="mapped - 'non-volatile memory'"}
# TYPE solr_metrics_jvm_buffers_bytes gauge
solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="direct"}
solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped"}
solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped -
'non-volatile memory'"}
solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="direct"}
solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped"}
solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped -
'non-volatile memory'"}
# TYPE solr_metrics_jvm_gc gauge
solr_metrics_jvm_gc{}
solr_metrics_jvm_gc{}
# TYPE solr_metrics_jvm_gc_seconds gauge
solr_metrics_jvm_gc_seconds{}
solr_metrics_jvm_gc_seconds{}
# TYPE solr_metrics_jvm_heap gauge
solr_metrics_jvm_heap{item="committed",memory="heap"}
solr_metrics_jvm_heap{item="committed",memory="non-heap"}
solr_metrics_jvm_heap{item="committed",memory="total"}
solr_metrics_jvm_heap{item="init",memory="heap"}
solr_metrics_jvm_heap{item="init",memory="non-heap"}
solr_metrics_jvm_heap{item="init",memory="total"}
solr_metrics_jvm_heap{item="max",memory="heap"}
solr_metrics_jvm_heap{item="max",memory="non-heap"}
solr_metrics_jvm_heap{item="max",memory="total"}
solr_metrics_jvm_heap{item="usage",memory="heap"}
solr_metrics_jvm_heap{item="usage",memory="non-heap"}
solr_metrics_jvm_heap{item="used",memory="heap"}
solr_metrics_jvm_heap{item="used",memory="non-heap"}
solr_metrics_jvm_heap{item="used",memory="total"}
# TYPE solr_metrics_jvm_memory_pools_bytes gauge
solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeCache]"}
solr_metrics_jvm_...> but was:<... TYPE
solr_metrics_j[etty_dispatches_total counter
solr_metrics_jetty_dispatches_total 0.0
# TYPE solr_metrics_jetty_requests_total counter
solr_metrics_jetty_requests_total{method="active"}
# TYPE solr_metrics_jetty_response_total counter
solr_metrics_jetty_response_total{status="2xx"}
# TYPE solr_metrics_jvm_buffers gauge
solr_metrics_jvm_buffers{item="Count",pool="direct"}
solr_metrics_jvm_buffers{item="Count",pool="mapped"}
solr_metrics_jvm_buffers{item="Count",pool="mapped - 'non-volatile memory'"}
# TYPE solr_metrics_jvm_buffers_bytes gauge
solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="direct"}
solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped"}
solr_metrics_jvm_buffers_bytes{item="MemoryUsed",pool="mapped -
'non-volatile memory'"}
solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="direct"}
solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped"}
solr_metrics_jvm_buffers_bytes{item="TotalCapacity",pool="mapped -
'non-volatile memory'"}
# TYPE solr_metrics_jvm_gc gauge
solr_metrics_jvm_gc{}
solr_metrics_jvm_gc{}
solr_metrics_jvm_gc{}
# TYPE solr_metrics_jvm_gc_seconds gauge
solr_metrics_jvm_gc_seconds{}
solr_metrics_jvm_gc_seconds{}
solr_metrics_jvm_gc_seconds{}
# TYPE solr_metrics_jvm_heap gauge
solr_metrics_jvm_heap{item="committed",memory="heap"}
solr_metrics_jvm_heap{item="committed",memory="non-heap"}
solr_metrics_jvm_heap{item="committed",memory="total"}
solr_metrics_jvm_heap{item="init",memory="heap"}
solr_metrics_jvm_heap{item="init",memory="non-heap"}
solr_metrics_jvm_heap{item="init",memory="total"}
solr_metrics_jvm_heap{item="max",memory="heap"}
solr_metrics_jvm_heap{item="max",memory="non-heap"}
solr_metrics_jvm_heap{item="max",memory="total"}
solr_metrics_jvm_heap{item="usage",memory="heap"}
solr_metrics_jvm_heap{item="usage",memory="non-heap"}
solr_metrics_jvm_heap{item="used",memory="heap"}
solr_metrics_jvm_heap{item="used",memory="non-heap"}
solr_metrics_jvm_heap{item="used",memory="total"}
# TYPE solr_metrics_jvm_memory_pools_bytes gauge
solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeCache"}
solr_metrics_jvm_memory_pools_bytes{item="committed",space="CodeHeap-'non-nmethods']"}
solr_metrics_jvm_...>
at
__randomizedtesting.SeedInfo.seed([CB2ACA8B48AFD85A:4A3AA7A2310A37AC]:0)
at org.junit.Assert.assertEquals(Assert.java:117)
at org.junit.Assert.assertEquals(Assert.java:146)
at
org.apache.solr.response.TestPrometheusResponseWriter.testPrometheusOutput(TestPrometheusResponseWriter.java:86)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
at
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at
org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
at
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at
org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
at
org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at
org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:80)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at
org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at
org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at
org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at
org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
at java.base/java.lang.Thread.run(Thread.java:829)
2> NOTE: reproduce with: gradlew test --tests
TestPrometheusResponseWriter.testPrometheusOutput -Dtests.seed=CB2ACA8B48AFD85A
-Dtests.locale=sw-UG -Dtests.timezone=America/Barbados -Dtests.asserts=true
-Dtests.file.encoding=UTF-8
2> 17677 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.a.s.u.ErrorLogMuter Closing ErrorLogMuter-regex-5 after mutting 0
log messages
2> 17677 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.a.s.u.ErrorLogMuter Creating ErrorLogMuter-regex-6 for ERROR logs
matching regex: ignore_exception
2> 17679 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.e.j.s.Server Stopped Server@6a2e51da{STOPPING}[10.0.20,sto=0]
2> 17679 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.e.j.s.AbstractConnector Stopped ServerConnector@574f4b87{HTTP/1.1,
(http/1.1, h2c)}{127.0.0.1:0}
2> 17683 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.a.s.c.CoreContainer Shutting down CoreContainer instance=2035622883
2> 17683 INFO (coreCloseExecutor-139-thread-1) [n: c: s: r: x:collection1
t:] o.a.s.c.SolrCore CLOSING SolrCore org.apache.solr.core.SolrCore@7125b2ee
collection1
2> 17684 INFO (coreCloseExecutor-139-thread-1) [n: c: s: r: x:collection1
t:] o.a.s.m.SolrMetricManager Closing metric reporters for
registry=solr.core.collection1 tag=SolrCore@7125b2ee
2> 17706 INFO (coreCloseExecutor-139-thread-1) [n: c: s: r: x:collection1
t:] o.a.s.u.DirectUpdateHandler2 Committing on IndexWriter.close() ... SKIPPED
(unnecessary).
2> 17708 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.a.s.m.SolrMetricManager Closing metric reporters for
registry=solr.node tag=null
2> 17718 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.a.s.m.SolrMetricManager Closing metric reporters for
registry=solr.jvm tag=null
2> 17724 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.a.s.m.SolrMetricManager Closing metric reporters for
registry=solr.jetty tag=null
2> 17725 INFO
(SUITE-TestPrometheusResponseWriter-seed#[CB2ACA8B48AFD85A]-worker) [n: c: s:
r: x: t:] o.e.j.s.h.ContextHandler Stopped
o.e.j.s.ServletContextHandler@6c9140ae{/solr,file:///home/hossman/lucene/solr/solr/core/build/tmp/tests-cwd/,STOPPED}
2> NOTE: leaving temporary files on disk at:
/home/hossman/lucene/solr/solr/core/build/tmp/tests-tmp/org.apache.solr.response.TestPrometheusResponseWriter_CB2ACA8B48AFD85A-001
2> NOTE: test params are: codec=Asserting(Lucene99): {}, docValues:{},
maxPointsInLeafNode=564, maxMBSortInHeap=7.22533435015882,
sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=sw-UG,
timezone=America/Barbados
2> NOTE: Linux 5.4.0-150-generic amd64/Eclipse Adoptium 11.0.15
(64-bit)/cpus=1,threads=1,free=197371352,total=324009984
2> NOTE: All tests run in this JVM: [TestWaitForStateWithJettyShutdowns,
MetricsHandlerTest, TestPrometheusResponseWriter]
{noformat}
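To make the difference in a failure like the one above easier to see, a small diagnostic can list the metric families that appear in the actual output but not in the expected one. This is only a sketch with hypothetical class and method names; it assumes families are declared on {{# TYPE <name> <kind>}} lines and it ignores differences in the number of samples within a family (such as the extra {{solr_metrics_jvm_gc}} line).

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic, not part of Solr: reports metric families that are
// declared in the actual Prometheus output but not in the expected output.
public class MetricFamilyDiff {

  // Matches a whole "# TYPE <name> <kind>" line and captures the family name.
  private static final Pattern TYPE_LINE =
      Pattern.compile("(?m)^# TYPE (\\S+) \\S+[ \\t]*$");

  /** Returns the sorted set of family names declared in the given text. */
  public static Set<String> families(String text) {
    Set<String> out = new TreeSet<>();
    Matcher m = TYPE_LINE.matcher(text);
    while (m.find()) {
      out.add(m.group(1));
    }
    return out;
  }

  /** Families present in actual but absent from expected. */
  public static Set<String> unexpected(String expected, String actual) {
    Set<String> extra = families(actual);
    extra.removeAll(families(expected));
    return extra;
  }

  public static void main(String[] args) {
    String expected = "# TYPE solr_metrics_jvm_buffers gauge\n";
    String actual =
        "# TYPE solr_metrics_jetty_dispatches_total counter\n"
            + "solr_metrics_jetty_dispatches_total 0.0\n"
            + "# TYPE solr_metrics_jetty_requests_total counter\n"
            + "# TYPE solr_metrics_jetty_response_total counter\n"
            + "# TYPE solr_metrics_jvm_buffers gauge\n";
    System.out.println("unexpected families: " + unexpected(expected, actual));
  }
}
```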
> Expose Metrics in Prometheus format DIRECTLY from Solr
> ------------------------------------------------------
>
> Key: SOLR-10654
> URL: https://issues.apache.org/jira/browse/SOLR-10654
> Project: Solr
> Issue Type: Improvement
> Components: metrics
> Reporter: Keith Laban
> Priority: Major
> Attachments: prometheus_metrics.txt
>
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> Expose metrics via a `wt=prometheus` response type.
> Example scrape_config in prometheus.yml:
> {code:java}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code}
> [Rationale|https://issues.apache.org/jira/browse/SOLR-11795?focusedCommentId=17261423&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17261423]
> for having this despite the "Prometheus Exporter". They have different
> strengths and weaknesses.