Gantigmaa Selenge created KAFKA-16620:
-----------------------------------------

             Summary: Kraft quorum cannot be formed if all controllers are 
restarted at the same time
                 Key: KAFKA-16620
                 URL: https://issues.apache.org/jira/browse/KAFKA-16620
             Project: Kafka
          Issue Type: Bug
            Reporter: Gantigmaa Selenge


Controller quorum cannot seem to form at all after accidentally restarting all 
controller nodes at the same time in a test environment. This is reproducible, 
happens almost everytime when restarting all controller nodes of the cluster. 

Started a cluster with 3 controller nodes and 3 broker nodes. After restarting 
the controller nodes, one of them becomes the active controller but resigns due 
to fetch timeout. The quorum leadership bounces off like this between the nodes 
indefinitely. 
The controller.quorum.fetch.timeout.ms was set to the default of 2 seconds. 
Logs from an active controller:
```
2024-04-17 14:00:48,250 INFO [QuorumController id=0] Becoming the active 
controller at epoch 34, next write offset 1116. 
(org.apache.kafka.controller.QuorumController) 
[quorum-controller-0-event-handler]
2024-04-17 14:00:48,250 WARN [QuorumController id=0] Performing controller 
activation. Loaded ZK migration state of NONE. 
(org.apache.kafka.controller.QuorumController) 
[quorum-controller-0-event-handler]
2024-04-17 14:00:48,701 INFO [RaftManager id=0] Node 1 disconnected. 
(org.apache.kafka.clients.NetworkClient) [kafka-0-raft-outbound-request-thread]
2024-04-17 14:00:48,701 WARN [RaftManager id=0] Connection to node 1 
(my-cluster-controller-1.my-cluster-kafka-brokers.roller.svc.cluster.local/10.244.0.68:9090)
 could not be established. Node may not be available. 
(org.apache.kafka.clients.NetworkClient) [kafka-0-raft-outbound-request-thread]
2024-04-17 14:00:48,776 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log0] Flushing log up to offset 1117 
(exclusive)with recovery point 1117, last flushed: 1713362448239,  current 
time: 1713362448776,unflushed: 1 (kafka.log.UnifiedLog) [kafka-0-raft-io-thread]
2024-04-17 14:00:49,277 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log0] Flushing log up to offset 1118 
(exclusive)with recovery point 1118, last flushed: 1713362448777,  current 
time: 
...
2024-04-17 14:01:35,934 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log0] Flushing log up to offset 1200 
(exclusive)with recovery point 1200, last flushed: 1713362489371,  current 
time: 1713362495934,unflushed: 1 (kafka.log.UnifiedLog) [kafka-0-raft-io-thread]
2024-04-17 14:01:36,121 INFO [RaftManager id=0] Did not receive fetch request 
from the majority of the voters within 3000ms. Current fetched voters are []. 
(org.apache.kafka.raft.LeaderState) [kafka-0-raft-io-thread]
2024-04-17 14:01:36,223 WARN [QuorumController id=0] Renouncing the leadership 
due to a metadata log event. We were the leader at epoch 34, but in the new 
epoch 35, the leader is (none). Reverting to last stable offset 1198. 
(org.apache.kafka.controller.QuorumController) 
[quorum-controller-0-event-handler]
2024-04-17 14:01:36,223 INFO [QuorumController id=0] 
failAll(NotControllerException): failing writeNoOpRecord(152156824). 
(org.apache.kafka.deferred.DeferredEventQueue) 
[quorum-controller-0-event-handler]
2024-04-17 14:01:36,223 INFO [QuorumController id=0] writeNoOpRecord: event 
failed with NotControllerException in 6291037 microseconds. 
(org.apache.kafka.controller.QuorumController) 
[quorum-controller-0-event-handler]
```
Logs from the follower:
```
024-04-17 14:00:48,242 INFO [RaftManager id=2] Completed transition to 
FollowerState(fetchTimeoutMs=2000, epoch=34, leaderId=0, voters=[0, 1, 2], 
highWatermark=Optional[LogOffsetMetadata(offset=1113, 
metadata=Optional.empty)], fetchingSnapshot=Optional.empty) from 
Voted(epoch=34, votedId=0, voters=[0, 1, 2], electionTimeoutMs=1794) 
(org.apache.kafka.raft.QuorumState) [kafka-2-raft-io-thread]
2024-04-17 14:00:48,242 INFO [QuorumController id=2] In the new epoch 34, the 
leader is 0. (org.apache.kafka.controller.QuorumController) 
[quorum-controller-2-event-handler]
2024-04-17 14:00:48,247 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1116 
(exclusive)with recovery point 1116, last flushed: 1713362442238,  current 
time: 1713362448247,unflushed: 2 (kafka.log.UnifiedLog) [kafka-2-raft-io-thread]
2024-04-17 14:00:48,777 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1117 
(exclusive)with recovery point 1117, last flushed: 1713362448249,  current 
time: 1713362448777,unflushed: 1 (kafka.log.UnifiedLog) [kafka-2-raft-io-thread]
2024-04-17 14:00:49,278 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1118 
(exclusive)with recovery point 1118, last flushed: 1713362448811,  current 
time: 1713362449278,unflushed
...
2024-04-17 14:01:29,371 DEBUG [UnifiedLog partition=__cluster_metadata-0, 
dir=/var/lib/kafka/data/kafka-log2] Flushing log up to offset 1199 
(exclusive)with recovery point 1199, last flushed: 1713362488919,  current 
time: 1713362489371,unflushed: 1 (kafka.log.UnifiedLog) [kafka-2-raft-io-thread]
2024-04-17 14:01:35,718 INFO [RaftManager id=2] Become candidate due to fetch 
timeout (org.apache.kafka.raft.KafkaRaftClient) [kafka-2-raft-io-thread]
```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to