[ https://issues.apache.org/jira/browse/KAFKA-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias J. Sax reassigned KAFKA-7240:
--------------------------------------

    Assignee: John Roesler

> -total metrics in Streams are incorrect
> ---------------------------------------
>
>                 Key: KAFKA-7240
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7240
>             Project: Kafka
>          Issue Type: Bug
>          Components: metrics, streams
>    Affects Versions: 2.0.0
>            Reporter: Sam Lendle
>            Assignee: John Roesler
>            Priority: Major
>
> I noticed the values of total metrics for streams were decreasing
> periodically when viewed in JMX, for example process-total for each
> processor-node-id under stream-processor-node-metrics.
>
> Edit: For processor node metrics, I should have been looking at
> ProcessorNode, not StreamsMetricsThreadImpl.
>
> -Looking at StreamsMetricsThreadImpl, I believe this behavior is due to
> using Count() as the Stat for the *-total metrics. Count() is a SampledStat,
> so the value it reports is the count in recent time windows, and the value
> decreases whenever a window is purged.-
> ----
> -This explains the behavior I saw, but I think the issue is deeper. For
> example, processTimeSensor attempts to measure process-latency-avg,
> process-latency-max, process-rate, and process-total. For that sensor, record
> is called like-
> -streamsMetrics.processTimeSensor.record(computeLatency() / (double)
> processed, timerStartedMs);-
> -so the value passed to record is the average latency per processed message
> in this batch, if I understand correctly. That gets pushed through to the
> call to Count#record, which increments its count by 1, ignoring the value
> parameter. Whatever stat is recording the total would need to know the
> number of messages processed. Because of that, I don't think it's possible
> for one Sensor to measure both latency and total.-
> -That said, it's not clear to me how all the different Stats work and how
> exactly Sensors work, and I don't actually understand how the process-rate
> metric is working for similar reasons, but that seems to be correct, so I
> may be missing something here.-
>
> cc [~guozhang]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
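
The two effects described in the struck-through analysis (a windowed count that drops when a window is purged, and a count that adds 1 per record call regardless of batch size) can be illustrated with a small standalone sketch. This is not Kafka code: WindowedCount and CumulativeCount are hypothetical, simplified stand-ins written for this example, and the window size, window count, and batch sizes are assumed values. Whether this matches the actual cause in Streams is exactly what the reporter was unsure about.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TotalMetricSketch {

    /** Hypothetical stand-in for a sampled count: only recent windows are retained. */
    static final class WindowedCount {
        private final long windowMs;
        private final int maxWindows;
        // Each entry is {windowStartMs, eventCount}.
        private final Deque<long[]> windows = new ArrayDeque<>();

        WindowedCount(long windowMs, int maxWindows) {
            this.windowMs = windowMs;
            this.maxWindows = maxWindows;
        }

        /** Counts one event per call; the recorded value itself is ignored. */
        void record(double value, long nowMs) {
            long[] last = windows.peekLast();
            if (last == null || nowMs - last[0] >= windowMs) {
                last = new long[] {nowMs, 0};
                windows.addLast(last);
            }
            last[1]++;
        }

        /** Purges windows older than the retention period, then sums the rest. */
        long measure(long nowMs) {
            while (!windows.isEmpty()
                    && nowMs - windows.peekFirst()[0] >= windowMs * maxWindows) {
                windows.removeFirst();
            }
            long sum = 0;
            for (long[] w : windows) {
                sum += w[1];
            }
            return sum;
        }
    }

    /** Hypothetical cumulative counter: never purged, and fed the batch size. */
    static final class CumulativeCount {
        private long total;

        void record(long processed) {
            total += processed;
        }

        long measure() {
            return total;
        }
    }

    public static void main(String[] args) {
        WindowedCount windowed = new WindowedCount(1_000, 2);
        CumulativeCount cumulative = new CumulativeCount();

        // Batch 1 at t=0 ms: 10 messages, recorded as one average-latency value.
        windowed.record(0.5, 0);
        cumulative.record(10);

        // Batch 2 at t=1500 ms: 20 messages, again one average-latency value.
        windowed.record(0.3, 1_500);
        cumulative.record(20);

        System.out.println(windowed.measure(1_600)); // 2: both windows still retained
        System.out.println(windowed.measure(2_500)); // 1: the first window was purged
        System.out.println(cumulative.measure());    // 30: total messages processed
    }
}
```

In this sketch the windowed value goes from 2 to 1 once the first window ages out, and it never reflects the 30 messages processed, because each record call carries a latency value rather than a message count.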