Matthieu Nantern created KAFKA-20025:
----------------------------------------

             Summary: Missing dynamic SSL reconfiguration support for KafkaRaftClient
                 Key: KAFKA-20025
                 URL: https://issues.apache.org/jira/browse/KAFKA-20025
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 4.1.1, 3.9.1
            Reporter: Matthieu Nantern


Hi,

I'd like to discuss a gap in the dynamic SSL reconfiguration support for KRaft 
mode that affects brokers connecting to the controller quorum.

In KRaft mode, when SSL certificates are renewed and dynamically reloaded via 
{{kafka-configs.sh}}, the KafkaRaftClient (used by brokers to fetch cluster 
metadata from controllers as "observers" per KIP-853) does not pick up the new 
certificates.

This causes SSL handshake failures with {{CertificateExpiredException}} errors, 
even though the reload command reports success.

Error observed on broker:

{{org.apache.kafka.common.errors.SslAuthenticationException: Failed to process post-handshake messages}}
{{    Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown}}

Error observed on controller:

{{    Caused by: java.security.cert.CertificateExpiredException: NotAfter: Wed Dec 17 08:28:22 UTC 2025}}

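{{SslChannelBuilder}} implements {{ListenerReconfigurable}} and supports
dynamic SSL reconfiguration. However, in {{KafkaRaftManager}} the channel builder
is never registered with {{config.addReconfigurable()}}, so dynamic SSL updates
presumably never reach the connections opened by the Raft client.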

In contrast, {{NodeToControllerChannelManager}} (and other components) 
correctly registers the channel builder.

NodeToControllerChannelManager.scala (trunk, lines 130-132):

{{    channelBuilder match {}}
{{      case reconfigurable: Reconfigurable => config.addReconfigurable(reconfigurable)}}
{{      case _ =>}}
{{    }}}
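
To illustrate what a fix might look like, here is a minimal, self-contained sketch of the same registration pattern applied to the Raft channel builder. It is not the actual Kafka code: {{FakeConfig}}, {{FakeSslChannelBuilder}}, {{RaftChannelBuilderRegistration}} and the simplified {{Reconfigurable}} and {{ChannelBuilder}} traits are hypothetical stand-ins so the example compiles on its own. Where exactly this would go in {{KafkaRaftManager}} is an assumption I have not verified.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-ins (assumptions). The real org.apache.kafka.common.Reconfigurable
// also declares reconfigurableConfigs() and validateReconfiguration().
trait ChannelBuilder
trait Reconfigurable { def reconfigure(configs: Map[String, String]): Unit }

// Hypothetical stand-in for SslChannelBuilder: remembers the keystore it was last given.
final class FakeSslChannelBuilder(var keystore: String)
    extends ChannelBuilder with Reconfigurable {
  override def reconfigure(configs: Map[String, String]): Unit =
    configs.get("ssl.keystore.location").foreach { path => keystore = path }
}

// Hypothetical stand-in for the broker config's registry of reconfigurable components.
final class FakeConfig {
  private val reconfigurables = ArrayBuffer.empty[Reconfigurable]
  def addReconfigurable(r: Reconfigurable): Unit = reconfigurables += r
  // Pushes a dynamic update to every registered component.
  def update(configs: Map[String, String]): Unit =
    reconfigurables.foreach(_.reconfigure(configs))
}

object RaftChannelBuilderRegistration {
  // Same pattern as the NodeToControllerChannelManager snippet quoted above:
  // register the builder only if it supports reconfiguration.
  def register(config: FakeConfig, channelBuilder: ChannelBuilder): Unit =
    channelBuilder match {
      case reconfigurable: Reconfigurable => config.addReconfigurable(reconfigurable)
      case _ => // not reconfigurable, nothing to register
    }
}
```

In this sketch a builder that was passed to {{register}} sees later updates, while one that was never registered keeps the keystore it started with, which matches the behaviour described in this report.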

I checked that the issue exists in both Kafka 3.9.1 and current trunk (as
of 2025-12-17).

The only reliable workaround I found is to restart the Kafka broker when
certificates are renewed. Has anyone else encountered this, or is there a 
reason this was intentionally left out?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
