junrao commented on a change in pull request #8935:
URL: https://github.com/apache/kafka/pull/8935#discussion_r454018584



##########
File path: core/src/main/scala/kafka/controller/ControllerEventManager.scala
##########
@@ -139,4 +140,19 @@ class ControllerEventManager(controllerId: Int,
     }
   }
 
+  private def pollFromEventQueue(): QueuedEvent = {
+    val count = eventQueueTimeHist.count()

Review comment:
       Hmm, if eventQueueTimeHist.count() is not 0, it seems that we can block 
on queue.take() forever without resetting the histogram?

##########
File path: core/src/main/scala/kafka/controller/ControllerEventManager.scala
##########
@@ -139,4 +140,19 @@ class ControllerEventManager(controllerId: Int,
     }
   }
 
+  private def pollFromEventQueue(): QueuedEvent = {

Review comment:
       This approach seems fine for this particular usage. There are other 
usage of histogram that all have the same issue. A more complete fix would be 
to fix the issue at the Histogram level. Could we try patching the Yammer code 
upstream or at least file an issue? I am thinking that we could track the last 
time a histogram is updated, if it passes a certain amount of time, we reset on 
next get() or update().

##########
File path: core/src/main/scala/kafka/controller/ControllerEventManager.scala
##########
@@ -69,7 +69,8 @@ class QueuedEvent(val event: ControllerEvent,
 class ControllerEventManager(controllerId: Int,
                              processor: ControllerEventProcessor,
                              time: Time,
-                             rateAndTimeMetrics: Map[ControllerState, 
KafkaTimer]) extends KafkaMetricsGroup {
+                             rateAndTimeMetrics: Map[ControllerState, 
KafkaTimer],
+                             eventQueueTimeTimeoutMs: Long = 60000) extends 
KafkaMetricsGroup {

Review comment:
       The exact value probably depends on the metric collecting interval. 
Perhaps we could be a bit conservative with a 5 min timeout?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to