[ https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086554#comment-15086554 ]
Jason Gustafson commented on KAFKA-3068:
----------------------------------------
[~ijuma],[~enothereska] Introducing a notion of cluster identity seems like a
good idea to solve this problem more generally. I'm not sure I understand the
point about not having enough bootstrap brokers though. The cached metadata
always contains the full list of brokers in the cluster (i.e. those registered
in ZooKeeper). The only time we might need to dip into an additional set is
when all of the known brokers simultaneously become unreachable. When that
happens, it seems to make more sense to revert to the list of bootstrap brokers
rather than attempting to reach a broker which we discovered previously but
which has since become unreachable (if it was still reachable on the last
metadata refresh, then it would be contained in the metadata). And yes, the
bootstrap brokers can also move to another cluster, but I would consider this a
user configuration error rather than an application error. You can hardly fault
an application for trying to connect to the endpoints that were explicitly
configured, but you might if it connects to a previously discovered broker
which has since been shut down and moved to another cluster. Does that make
sense?
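
To make the proposed fallback order concrete, here is a rough sketch: try the
brokers in the current metadata first, and only if none of them is reachable
fall back to the configured bootstrap list, never to a separate set of brokers
seen in the past. The class name, the pickNode helper, and the use of plain
"host:port" strings are assumptions made for illustration; this is not the
actual NetworkClient code, and it picks the first reachable node rather than
the least loaded one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class NodeFallbackSketch {

    // Hypothetical helper, not part of Kafka's NetworkClient.
    // Endpoints are plain "host:port" strings to keep the sketch small.
    static Optional<String> pickNode(List<String> metadataNodes,
                                     List<String> bootstrapNodes,
                                     Predicate<String> isReachable) {
        // 1. Prefer brokers from the most recent metadata refresh.
        for (String node : metadataNodes) {
            if (isReachable.test(node)) {
                return Optional.of(node);
            }
        }
        // 2. All known brokers are unreachable: revert to the endpoints
        //    the user explicitly configured.
        for (String node : bootstrapNodes) {
            if (isReachable.test(node)) {
                return Optional.of(node);
            }
        }
        // 3. Nothing reachable; the caller would back off and retry.
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> metadataNodes = Arrays.asList("broker1:9092", "broker2:9092");
        List<String> bootstrapNodes = Collections.singletonList("seed1:9092");
        // Pretend only the bootstrap endpoint currently accepts connections.
        Predicate<String> isReachable = "seed1:9092"::equals;
        // Prints "seed1:9092".
        System.out.println(
            pickNode(metadataNodes, bootstrapNodes, isReachable).orElse("none"));
    }
}
```

With this ordering, a host that was shut down and reused in another cluster is
only contacted if it still appears in the latest metadata or in the user's own
bootstrap configuration.
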
> NetworkClient may connect to a different Kafka cluster than originally
> configured
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 0.9.0.0
> Reporter: Jun Rao
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all
> brokers (id and ip) that the client has ever seen. If we can't find an
> available broker from the current Metadata, we will pick a broker that we
> have seen previously (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that
> we have a broker with id 1 in a Kafka cluster. A producer client remembers
> this broker in nodesEverSeen. At some point, we bring down this broker and
> use the host in a different Kafka cluster. Then, the producer client uses
> this broker from nodesEverSeen to refresh metadata. It will find the metadata
> in a different Kafka cluster and start producing data there.