[ https://issues.apache.org/jira/browse/CASSANDRA-19770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-19770:
-----------------------------------------
     Bug Category: Parent values: Correctness(12982)
       Complexity: Normal
    Discovered By: User Report
    Fix Version/s: 4.1.x
                   5.0.x
                   5.x
         Severity: Normal
           Status: Open  (was: Triage Needed)

[~yifanc] can you take a look?

> Incorrect latency metrics reported by metric-reporter
> -----------------------------------------------------
>
>                 Key: CASSANDRA-19770
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19770
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Metrics
>            Reporter: Aswin Karthik
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x, 5.x
>
>
> Cassandra version: 4.1.5
> Since [CASSANDRA-16760|https://issues.apache.org/jira/browse/CASSANDRA-16760]
> and [these changes|https://github.com/apache/cassandra/pull/1091/files#diff-07f330b65d5335967ea96f80674b25415c70994d99b97795ed4db696c92b3ff5L532],
> the metric reporter divides the microsecond metrics by 10^6 and reports the
> result in milliseconds (it should divide by 10^3). The additional division
> by 10^3 makes the reported latencies 1000 times too small.
> Neither the sample configuration nor the documentation explains how to
> configure the metrics reporter to report the values correctly.
> Steps to reproduce:
> Contents of metrics-reporter-config-sample.yaml
> {noformat}
> console:
>   -
>     outfile: '/tmp/metrics.out'
>     period: 10
>     timeunit: 'SECONDS'
>     predicate:
>       color: "white"
>       useQualifiedName: true
>       patterns:
>         - "^org.apache.cassandra.metrics.ClientRequest.+" # includes ClientRequestMetrics
> {noformat}
> Start Cassandra with the flag
> {noformat}
> -Dcassandra.metricsReporterConfigFile=metrics-reporter-config-sample.yaml
> {noformat}
> Run cassandra-stress to generate load
> {noformat}
> tools/bin/cassandra-stress write duration=1m cl=ONE -rate threads=1000
> {noformat}
> After that, check the metric via nodetool
> {noformat}
> bin/nodetool sjk mxdump -q org.apache.cassandra.metrics:type=ClientRequest,scope=Write-ONE,name=Latency
> {
>   "beans" : [ {
>     "name" : "org.apache.cassandra.metrics:type=ClientRequest,scope=Write-ONE,name=Latency",
>     "modelerType" : "org.apache.cassandra.metrics.CassandraMetricsRegistry$JmxTimer",
>     "Max" : 654949.0,
>     "999thPercentile" : 11864.0,
>     "DurationUnit" : "microseconds",
>     ....
>   } ]
> }
> {noformat}
> The max is 654949.0 micros, which is about 655 millis.
> However, the metric reporter emits 0.65 millis because of the additional
> division by 10^3
> {noformat}
> ❯ tail -n100 /tmp/metrics.out | grep -A 20 Latency.Write-ONE
> org.apache.cassandra.metrics.ClientRequest.Latency.Write-ONE
>              count = 17053398
>                max = 0.65 milliseconds
>              99.9% <= 0.01 milliseconds
>              ...
> {noformat}
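
The arithmetic in the report can be checked with a short script. This is only a sketch of the unit conversion, not Cassandra's or the reporter's actual code; `MICROS_PER_UNIT` and `convert_from_micros` are hypothetical names introduced for illustration. The input values are the `Max` and `999thPercentile` figures from the JMX dump above.

```python
# Hypothetical helper: how many microseconds make up one of each unit.
MICROS_PER_UNIT = {
    "microseconds": 1,
    "milliseconds": 10**3,
    "seconds": 10**6,
}


def convert_from_micros(value_micros, target_unit):
    """Convert a duration given in microseconds to target_unit."""
    return value_micros / MICROS_PER_UNIT[target_unit]


# Values taken from the JMX dump in the report (DurationUnit: microseconds).
jmx_max_micros = 654949.0
jmx_p999_micros = 11864.0

# Correct conversion to milliseconds: divide by 10^3.
expected_max_ms = convert_from_micros(jmx_max_micros, "milliseconds")
expected_p999_ms = convert_from_micros(jmx_p999_micros, "milliseconds")

# Dividing by 10^6 instead (a seconds conversion) reproduces the numbers
# that the reporter printed with a "milliseconds" label.
observed_max = round(convert_from_micros(jmx_max_micros, "seconds"), 2)
observed_p999 = round(convert_from_micros(jmx_p999_micros, "seconds"), 2)

print(f"expected max  = {expected_max_ms:.3f} ms, reporter printed {observed_max}")
print(f"expected 99.9% = {expected_p999_ms:.3f} ms, reporter printed {observed_p999}")
```

Both reported figures (0.65 and 0.01) match a division by 10^6, which is consistent with the extra factor of 10^3 described in the issue.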