[ https://issues.apache.org/jira/browse/KAFKA-15828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax updated KAFKA-15828:
------------------------------------
    Component/s: clients
                 consumer
                 producer

> Protect clients from broker hostname reuse
> ------------------------------------------
>
>                 Key: KAFKA-15828
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15828
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer, producer
>            Reporter: Jason Gustafson
>            Priority: Major
>
> In some environments such as k8s, brokers may be assigned to nodes
> dynamically from an available pool. When a cluster is rolling, it is possible
> for the client to see the same node advertised for different broker IDs in a
> short period of time. For example, kafka-1 might be initially assigned to
> node1. Before the client is able to establish a connection, it could be that
> kafka-3 is now on node1 instead. Currently there is no protection in the
> client or in the protocol for this scenario. If the connection succeeds, the
> client will assume it has a good connection to kafka-1. Until something
> disrupts the connection, it will continue under this assumption even if the
> hostname for kafka-1 changes.
>
> We have observed this scenario in practice. The client connected to the wrong
> broker through stale hostname information. It was unable to produce data
> because of persistent NOT_LEADER errors. The only way to recover in the end
> was by restarting the client to force a reconnection.
>
> We have discussed a couple potential solutions to this problem:
> # Let the client be smarter managing the connection/hostname mapping. When
> it detects that a hostname has changed, it should force a disconnect to
> ensure it connects to the right node.
> # We can modify the protocol to verify that the client has connected to the
> intended broker. For example, we can add a field to ApiVersions to indicate
> the intended broker ID. The broker receiving the request can return an error
> if its ID does not match that in the request.
>
> Are there alternatives?
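
The two options discussed in the description can be sketched roughly as follows. This is only an illustration in Python, not Kafka client or broker code: `ConnectionTracker`, `on_connected`, `on_metadata_update` and `broker_accepts` are hypothetical names, and accepting requests that carry no intended broker ID is an assumption made here for compatibility with older clients.

```python
from typing import Dict, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (hostname, port)


class ConnectionTracker:
    """Hypothetical client-side bookkeeping for option 1 (not a Kafka API)."""

    def __init__(self) -> None:
        # broker ID -> endpoint the current connection was opened against
        self.connected: Dict[int, Endpoint] = {}

    def on_connected(self, broker_id: int, endpoint: Endpoint) -> None:
        self.connected[broker_id] = endpoint

    def on_metadata_update(self, advertised: Dict[int, Endpoint]) -> List[int]:
        """Drop connections whose endpoint no longer matches the metadata.

        A broker that is missing from the metadata is also treated as stale.
        Returns the broker IDs that were disconnected, in ascending order.
        """
        stale = sorted(
            broker_id
            for broker_id, endpoint in self.connected.items()
            if advertised.get(broker_id) != endpoint
        )
        for broker_id in stale:
            del self.connected[broker_id]  # stands in for a forced disconnect
        return stale


def broker_accepts(local_broker_id: int, intended_broker_id: Optional[int]) -> bool:
    """Hypothetical broker-side check for option 2.

    A request without an intended ID (e.g. from an older client) is accepted;
    that choice is an assumption of this sketch, not part of the proposal.
    """
    return intended_broker_id is None or intended_broker_id == local_broker_id


if __name__ == "__main__":
    tracker = ConnectionTracker()
    # The client believes it reached kafka-1 on node1 ...
    tracker.on_connected(1, ("node1", 9092))
    # ... but newer metadata places kafka-1 on node2 and kafka-3 on node1.
    print(tracker.on_metadata_update({1: ("node2", 9092), 3: ("node1", 9092)}))
    # With option 2, kafka-3 would reject a request intended for kafka-1.
    print(broker_accepts(local_broker_id=3, intended_broker_id=1))
```

Option 1 only helps once the client sees updated metadata, whereas option 2 rejects the mismatched connection at handshake time, which is why the two are not mutually exclusive.
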

--
This message was sent by Atlassian Jira
(v8.20.10#820010)