David Arthur created KAFKA-14178:
------------------------------------
Summary: NoOpRecord incorrectly causes high controller queue time metric
Key: KAFKA-14178
URL: https://issues.apache.org/jira/browse/KAFKA-14178
Project: Kafka
Issue Type: Bug
Components: controller, kraft, metrics
Reporter: David Arthur
Fix For: 3.3.0
When a deferred event is added to the queue in QuorumController, we include the
total time it sat in the queue, including the time it was deferred, as part of
the "EventQueueTimeMs" metric in QuorumControllerMetrics.
With the introduction of NoOpRecords, the p99 value for this metric is equal to
the interval at which we schedule the no-op records. E.g., if no-op records are
scheduled every 5 seconds, we will see a p99 EventQueueTimeMs of 5 seconds.
This makes it difficult, if not impossible, to see whether there is an actual
delay in event processing on the controller.
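
To illustrate the skew, here is a minimal, self-contained sketch. It is not the
QuorumController code: the Event class, the 2 ms processing delay, and the mix
of 20 ordinary events per no-op are all made-up assumptions. It compares the
queue time measured from enqueue (what the description above implies) with the
queue time measured from the moment a deferred event becomes eligible to run.
The second measurement is only one possible way to separate the two and is not
necessarily how the issue will be fixed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class QueueTimeSketch {
    /** Hypothetical event: enqueue time, earliest allowed run time, actual start time. */
    static final class Event {
        final long enqueuedMs;
        final long eligibleMs;
        final long startedMs;

        Event(long enqueuedMs, long eligibleMs, long startedMs) {
            this.enqueuedMs = enqueuedMs;
            this.eligibleMs = eligibleMs;
            this.startedMs = startedMs;
        }
    }

    /** Nearest-rank percentile over a copy of the samples. */
    static long percentile(List<Long> samples, double p) {
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int index = (int) Math.ceil(p * sorted.size()) - 1;
        return sorted.get(Math.max(index, 0));
    }

    /** 100 deferred no-ops (every 5 s) plus 20 ordinary events per interval. */
    static List<Event> simulate() {
        long noOpIntervalMs = 5_000;   // assumed no-op schedule from the example
        long processingDelayMs = 2;    // assumed real queueing delay
        List<Event> events = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            long now = i * noOpIntervalMs;
            // Deferred no-op: enqueued now, but not allowed to run for 5 s.
            long eligible = now + noOpIntervalMs;
            events.add(new Event(now, eligible, eligible + processingDelayMs));
            // Ordinary events: eligible immediately.
            for (int j = 0; j < 20; j++) {
                long t = now + j * 100L;
                events.add(new Event(t, t, t + processingDelayMs));
            }
        }
        return events;
    }

    /** p99 of (start - enqueue): counts the deferral as queue time. */
    static long naiveP99Ms() {
        List<Long> samples = new ArrayList<>();
        for (Event e : simulate()) {
            samples.add(e.startedMs - e.enqueuedMs);
        }
        return percentile(samples, 0.99);
    }

    /** p99 of (start - eligible): excludes the intentional deferral. */
    static long adjustedP99Ms() {
        List<Long> samples = new ArrayList<>();
        for (Event e : simulate()) {
            samples.add(e.startedMs - e.eligibleMs);
        }
        return percentile(samples, 0.99);
    }

    public static void main(String[] args) {
        // Prints: p99 naive = 5002 ms, p99 adjusted = 2 ms
        System.out.println("p99 naive = " + naiveP99Ms()
            + " ms, p99 adjusted = " + adjustedP99Ms() + " ms");
    }
}
```

In this sketch the deferred no-ops make up under 5% of all events, yet they
account for more than 1% of samples, so they alone determine the naive p99.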
--
This message was sent by Atlassian Jira
(v8.20.10#820010)