[ https://issues.apache.org/jira/browse/KAFKA-16620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gantigmaa Selenge reassigned KAFKA-16620: ----------------------------------------- Assignee: Gantigmaa Selenge > Kraft quorum cannot be formed if all controllers are restarted at the same > time > ------------------------------------------------------------------------------- > > Key: KAFKA-16620 > URL: https://issues.apache.org/jira/browse/KAFKA-16620 > Project: Kafka > Issue Type: Bug > Reporter: Gantigmaa Selenge > Assignee: Gantigmaa Selenge > Priority: Major > > Controller quorum cannot seem to form at all after accidentally restarting > all controller nodes at the same time in a test environment. This is > reproducible, happens almost everytime when restarting all controller nodes > of the cluster. > Started a cluster with 3 controller nodes and 3 broker nodes. After > restarting the controller nodes, one of them becomes the active controller > but resigns due to fetch timeout. The quorum leadership bounces off like this > between the nodes indefinitely. > The controller.quorum.fetch.timeout.ms was set to the default of 2 seconds. > Logs from an active controller: > {code:java} > 2024-04-17 14:00:48,250 INFO [QuorumController id=0] Becoming the active > controller at epoch 34, next write offset 1116. > (org.apache.kafka.controller.QuorumController) > [quorum-controller-0-event-handler] > 2024-04-17 14:00:48,250 WARN [QuorumController id=0] Performing controller > activation. Loaded ZK migration state of NONE. > (org.apache.kafka.controller.QuorumController) > [quorum-controller-0-event-handler] > 2024-04-17 14:00:48,701 INFO [RaftManager id=0] Node 1 disconnected. > (org.apache.kafka.clients.NetworkClient) > [kafka-0-raft-outbound-request-thread] > 2024-04-17 14:00:48,701 WARN [RaftManager id=0] Connection to node 1 > (my-cluster-controller-1.my-cluster-kafka-brokers.roller.svc.cluster.local/10.244.0.68:9090) > could not be established. Node may not be available. > (org.apache.kafka.clients.NetworkClient) > [kafka-0-raft-outbound-request-thread] > 2024-04-17 14:00:48,776 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log0] Flushing log up to offset 1117 > (exclusive)with recovery point 1117, last flushed: 1713362448239, current > time: 1713362448776,unflushed: 1 (kafka.log.UnifiedLog) > [kafka-0-raft-io-thread] > 2024-04-17 14:00:49,277 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log0] Flushing log up to offset 1118 > (exclusive)with recovery point 1118, last flushed: 1713362448777, current > time: > ... > 2024-04-17 14:01:35,934 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log0] Flushing log up to offset 1200 > (exclusive)with recovery point 1200, last flushed: 1713362489371, current > time: 1713362495934,unflushed: 1 (kafka.log.UnifiedLog) > [kafka-0-raft-io-thread] > 2024-04-17 14:01:36,121 INFO [RaftManager id=0] Did not receive fetch request > from the majority of the voters within 3000ms. Current fetched voters are []. > (org.apache.kafka.raft.LeaderState) [kafka-0-raft-io-thread] > 2024-04-17 14:01:36,223 WARN [QuorumController id=0] Renouncing the > leadership due to a metadata log event. We were the leader at epoch 34, but > in the new epoch 35, the leader is (none). Reverting to last stable offset > 1198. (org.apache.kafka.controller.QuorumController) > [quorum-controller-0-event-handler] > 2024-04-17 14:01:36,223 INFO [QuorumController id=0] > failAll(NotControllerException): failing writeNoOpRecord(152156824). > (org.apache.kafka.deferred.DeferredEventQueue) > [quorum-controller-0-event-handler] > 2024-04-17 14:01:36,223 INFO [QuorumController id=0] writeNoOpRecord: event > failed with NotControllerException in 6291037 microseconds. > (org.apache.kafka.controller.QuorumController) > [quorum-controller-0-event-handler]{code} > Logs from the follower: > {code:java} > 024-04-17 14:00:48,242 INFO [RaftManager id=2] Completed transition to > FollowerState(fetchTimeoutMs=2000, epoch=34, leaderId=0, voters=[0, 1, 2], > highWatermark=Optional[LogOffsetMetadata(offset=1113, > metadata=Optional.empty)], fetchingSnapshot=Optional.empty) from > Voted(epoch=34, votedId=0, voters=[0, 1, 2], electionTimeoutMs=1794) > (org.apache.kafka.raft.QuorumState) [kafka-2-raft-io-thread] > 2024-04-17 14:00:48,242 INFO [QuorumController id=2] In the new epoch 34, the > leader is 0. (org.apache.kafka.controller.QuorumController) > [quorum-controller-2-event-handler] > 2024-04-17 14:00:48,247 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1116 > (exclusive)with recovery point 1116, last flushed: 1713362442238, current > time: 1713362448247,unflushed: 2 (kafka.log.UnifiedLog) > [kafka-2-raft-io-thread] > 2024-04-17 14:00:48,777 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1117 > (exclusive)with recovery point 1117, last flushed: 1713362448249, current > time: 1713362448777,unflushed: 1 (kafka.log.UnifiedLog) > [kafka-2-raft-io-thread] > 2024-04-17 14:00:49,278 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1118 > (exclusive)with recovery point 1118, last flushed: 1713362448811, current > time: 1713362449278,unflushed > ... > 2024-04-17 14:01:29,371 DEBUG [UnifiedLog partition=__cluster_metadata-0, > dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1199 > (exclusive)with recovery point 1199, last flushed: 1713362488919, current > time: 1713362489371,unflushed: 1 (kafka.log.UnifiedLog) > [kafka-2-raft-io-thread] > 2024-04-17 14:01:35,718 INFO [RaftManager id=2] Become candidate due to fetch > timeout (org.apache.kafka.raft.KafkaRaftClient) [kafka-2-raft-io-thread]{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)