rondagostino commented on code in PR #13759:
URL: https://github.com/apache/kafka/pull/13759#discussion_r1206028552


##########
metadata/src/main/java/org/apache/kafka/controller/BrokerHeartbeatManager.java:
##########
@@ -223,6 +283,21 @@ public BrokerHeartbeatState next() {
         }
     }
 
+    /**
+     * The maximum number of timed out heartbeats to count.
+     */
+    static final int DEFAULT_TIMED_OUT_HEARTBEAT_COUNT_MAX = 1000;
+
+    /**
+     * The time period over which to track timed out heartbeats.
+     */
+    static final long DEFAULT_TIMED_OUT_HEARTBEAT_COUNT_WINDOW_NS = TimeUnit.MINUTES.toNanos(5);
+
+    /**
+     * The number of heartbeats to notice missing before we go into overload.
+     */
+    static final int DEFAULT_TIMED_OUT_HEARTBEAT_OVERLOAD_THRESHOLD = 3;
+

Review Comment:
   Whatever we set this to (and currently we always take this default value) 
will determine how long it would take for a controller to leave the overloaded 
state, right?  So if we see a series of missed heartbeats that puts us into the 
overloaded state, and then we don't see any more missed heartbeats, it would 
take this amount of time to leave that overloaded state?
   
   Consider the following case.  We are overloaded.  And then some broker crashes.  How long would we want it to take before we move leadership away from that suddenly-dead broker for any partitions it leads?  Ideally the session timeout, of course, but that won't happen if we are overloaded -- will it take 5 minutes as currently coded?  That seems like a long time.
   
   I think the only thing we can say is that we should move leadership away as 
soon as we have a correct view of the cluster.  The question then becomes: how 
do we know when we have good information?
   
   A fixed amount of time like this with no missed heartbeats -- that seems too simple.  We likely have a reasonable view of the cluster once we have seen enough heartbeats, without missing any, that every broker has likely had a chance to communicate with us.  So it seems like maybe the broker heartbeat interval should factor into it somehow?
   
   Let's say the broker heartbeat interval is 2 seconds.  Right now we consider a heartbeat missed if it is more than half the heartbeat interval (i.e. 1 second) old.  Let's say there are 60 brokers in the cluster.  If we see something on the order of 60 heartbeats in a row without having missed any, then we probably have a reasonable view.
   
   So maybe the criterion isn't having missed no more than N heartbeats in some fixed time window.  Maybe we define "not overloaded" as having seen some string of heartbeats uninterrupted by a missed one?  Every time we miss a heartbeat we go into the "overloaded" state and reset a counter of contiguous successfully processed heartbeats to 0.  Whenever we see a heartbeat and process it in time we increment that counter.  We are no longer overloaded once the `contiguousHeartbeatsProcessedInTime` counter equals or exceeds the broker count -- see the sketch below.
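   Roughly something like this, as a minimal sketch -- `HeartbeatOverloadTracker` and its method names are hypothetical, not code in this PR:

```java
/**
 * Hypothetical sketch of the contiguous-heartbeats idea; the class and
 * method names are illustrative, not code from this PR.
 */
public class HeartbeatOverloadTracker {
    private boolean overloaded = false;
    private int contiguousHeartbeatsProcessedInTime = 0;

    /** Called whenever we notice a heartbeat was missed (processed too late). */
    public void onMissedHeartbeat() {
        overloaded = true;
        contiguousHeartbeatsProcessedInTime = 0;
    }

    /**
     * Called whenever a heartbeat is processed in time.  We leave the
     * overloaded state once we have seen at least brokerCount contiguous
     * in-time heartbeats, since by then every broker has likely had a
     * chance to communicate with us.
     */
    public void onTimelyHeartbeat(int brokerCount) {
        contiguousHeartbeatsProcessedInTime++;
        if (overloaded && contiguousHeartbeatsProcessedInTime >= brokerCount) {
            overloaded = false;
        }
    }

    public boolean isOverloaded() {
        return overloaded;
    }
}
```

   One nice property of this approach: recovery time would scale with cluster size and heartbeat rate rather than being a fixed 5 minutes, so in the 60-broker example above we would exit overload after roughly 2 seconds of clean traffic.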
   
   I'm not sure I have it right, but I think this is a discussion worth pursuing.
   
   


