[ https://issues.apache.org/jira/browse/KAFKA-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551534#comment-17551534 ]
dengziming commented on KAFKA-13959:
------------------------------------

I haven't found the root cause yet. I printed the brokerOffset and controllerOffset on each heartbeat and found that every time the brokerOffset is bumped, the controllerOffset is bumped as well.
```
time: 1654679131904 broker 0 brokerOffset:27 controllerOffset:28
time: 1654679132115 broker 0 brokerOffset:27 controllerOffset:28
time: 1654679132381 broker 0 brokerOffset:28 controllerOffset:29
time: 1654679132592 broker 0 brokerOffset:28 controllerOffset:29
time: 1654679132878 broker 0 brokerOffset:29 controllerOffset:30
time: 1654679133089 broker 0 brokerOffset:29 controllerOffset:30
time: 1654679133299 broker 0 brokerOffset:30 controllerOffset:31
time: 1654679133509 broker 0 brokerOffset:30 controllerOffset:31
```
I tried increasing the heartbeat interval but got the same result, and if I set numberControllerNodes to 1, the problem disappears. I think this may be related to the logic of how we compute the leader high watermark and the follower high watermark.

> Controller should unfence Broker with busy metadata log
> -------------------------------------------------------
>
>                 Key: KAFKA-13959
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13959
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft
>    Affects Versions: 3.3.0
>            Reporter: Jose Armando Garcia Sancio
>            Priority: Blocker
>
> https://issues.apache.org/jira/browse/KAFKA-13955 showed that it is possible
> for the controller to not unfence a broker if the committed offset keeps
> increasing.
>
> One solution to this problem is to require the broker to only catch up to the
> last committed offset when they last sent the heartbeat. For example:
> # Broker sends a heartbeat with current offset of {{Y}}. The last commit
> offset is {{X}}. The controller remembers this last commit offset, call it
> {{X'}}
> # Broker sends another heartbeat with current offset of {{Z}}. Unfence
> the broker if {{Z >= X}} or {{Z >= X'}}.
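
The two-step rule in the description can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Kafka's controller code: the class `UnfenceSketch` and the method `onHeartbeat` are hypothetical names, and the sketch assumes the controller can read its last committed offset at the moment each heartbeat arrives.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the proposed unfence rule; not the actual Kafka implementation. */
class UnfenceSketch {
    // Last committed offset the controller saw at each broker's previous heartbeat (X' in the description).
    private final Map<Integer, Long> committedAtPreviousHeartbeat = new HashMap<>();

    /**
     * @param brokerId            id of the broker sending the heartbeat
     * @param brokerOffset        the broker's current metadata offset (Z)
     * @param lastCommittedOffset the controller's last committed offset right now (X)
     * @return true if the broker may be unfenced under the proposed rule
     */
    boolean onHeartbeat(int brokerId, long brokerOffset, long lastCommittedOffset) {
        // Remember the current committed offset for the next heartbeat and fetch the previous one.
        Long previous = committedAtPreviousHeartbeat.put(brokerId, lastCommittedOffset);
        boolean caughtUpNow = brokerOffset >= lastCommittedOffset;              // Z >= X
        boolean caughtUpToPrevious = previous != null && brokerOffset >= previous; // Z >= X'
        return caughtUpNow || caughtUpToPrevious;
    }
}
```

If the controllerOffset values in the log above are taken to be the last committed offset (an assumption, since the comment does not say so), feeding the logged pairs (27, 28), (27, 28), (28, 29) into `onHeartbeat` for broker 0 leaves the broker fenced on the first two heartbeats and unfences it on the third, because 28 is at least the offset remembered from the previous heartbeat.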
>
> This change should also set the default for MetadataMaxIdleIntervalMs back to
> 500.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)