[
https://issues.apache.org/jira/browse/KAFKA-18875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
TaiJuWu resolved KAFKA-18875.
-----------------------------
Resolution: Duplicate
> KRaft controller does not retry registration if the first attempt times out
> ---------------------------------------------------------------------------
>
> Key: KAFKA-18875
> URL: https://issues.apache.org/jira/browse/KAFKA-18875
> Project: Kafka
> Issue Type: Bug
> Reporter: Daniel Fonai
> Priority: Minor
>
> KRaft controller registration has a built-in [retry
> mechanism|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerRegistrationManager.scala#L274]
> with exponential backoff. The timeout of the first attempt is 5 s for KRaft
> controllers
> ([code|https://github.com/apache/kafka/blob/3.9.0/core/src/main/scala/kafka/server/ControllerServer.scala#L448])
> and is not configurable.
> If the controller's first registration request times out, the attempt should
> be retried, but in practice this does not happen and the controller cannot
> join the quorum. The faulty controller's log shows the following:
> {noformat}
> 2025-02-21 13:31:46,606 INFO [ControllerRegistrationManager id=3
> incarnation=mEzjHheAQ_eDWejAFquGiw] sendControllerRegistration: attempting to
> send ControllerRegistrationRequestData(controllerId=3,
> incarnationId=mEzjHheAQ_eDWejAFquGiw, zkMigrationReady=true,
> listeners=[Listener(name='CONTROLPLANE-9090',
> host='kraft-rollback-kafka-controller-pool-3.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-631e64ac.svc',
> port=9090, securityProtocol=1)], features=[Feature(name='kraft.version',
> minSupportedVersion=0, maxSupportedVersion=1),
> Feature(name='metadata.version', minSupportedVersion=1,
> maxSupportedVersion=21)]) (kafka.server.ControllerRegistrationManager)
> [controller-3-registration-manager-event-handler]
> ...
> 2025-02-21 13:31:51,627 ERROR [ControllerRegistrationManager id=3
> incarnation=mEzjHheAQ_eDWejAFquGiw] RegistrationResponseHandler: channel
> manager timed out before sending the request.
> (kafka.server.ControllerRegistrationManager)
> [controller-3-to-controller-registration-channel-manager]
> 2025-02-21 13:31:51,726 INFO [ControllerRegistrationManager id=3
> incarnation=mEzjHheAQ_eDWejAFquGiw] maybeSendControllerRegistration: waiting
> for the previous RPC to complete.
> (kafka.server.ControllerRegistrationManager)
> [controller-3-registration-manager-event-handler]
> {noformat}
> After this, we cannot see any controller registration retry in the log.
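
For illustration only: the log excerpt is consistent with (though it does not prove) a state in which the "previous RPC" is still considered in flight after the timeout, so no new attempt is ever sent. The sketch below shows the expected behaviour in isolation: a timeout clears the in-flight marker and yields an exponentially growing, capped delay before the next attempt. This is not the Kafka code. The class RegistrationRetrySketch, its methods, and the 100 ms / 1000 ms backoff values are all hypothetical and chosen only for the example.

```java
// Minimal sketch of "retry with exponential backoff after a timeout".
// All names and values here are hypothetical; this is NOT ControllerRegistrationManager.
class RegistrationRetrySketch {
    private final long initialBackoffMs;
    private final long maxBackoffMs;
    private boolean pendingRpc = false; // true while a request is considered in flight
    private int failedAttempts = 0;     // consecutive timeouts so far

    RegistrationRetrySketch(long initialBackoffMs, long maxBackoffMs) {
        this.initialBackoffMs = initialBackoffMs;
        this.maxBackoffMs = maxBackoffMs;
    }

    /** Returns true if a request is sent now, false if one is still in flight. */
    boolean maybeSend() {
        if (pendingRpc) {
            return false; // corresponds to "waiting for the previous RPC to complete"
        }
        pendingRpc = true;
        return true;
    }

    /** Handles a timeout and returns the delay in ms before the next attempt. */
    long onTimeout() {
        pendingRpc = false; // without this step, maybeSend() would refuse forever
        failedAttempts++;
        return backoffMs();
    }

    /** Handles a successful response and resets the backoff. */
    void onSuccess() {
        pendingRpc = false;
        failedAttempts = 0;
    }

    // initialBackoffMs * 2^(failedAttempts - 1), capped at maxBackoffMs.
    private long backoffMs() {
        int shift = Math.max(0, Math.min(failedAttempts - 1, 30)); // bound the shift to avoid overflow
        return Math.min(initialBackoffMs << shift, maxBackoffMs);
    }

    public static void main(String[] args) {
        RegistrationRetrySketch sketch = new RegistrationRetrySketch(100, 1000);
        for (int attempt = 1; attempt <= 5; attempt++) {
            boolean sent = sketch.maybeSend();
            long delayMs = sketch.onTimeout();
            System.out.println("attempt " + attempt + " sent=" + sent + " nextDelayMs=" + delayMs);
        }
    }
}
```

With these example values, five consecutive timeouts give delays of 100, 200, 400, 800 and 1000 ms, and every attempt is actually sent because the in-flight marker is cleared on each timeout.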