[jira] [Commented] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

Jason Gustafson (JIRA) Wed, 06 Jan 2016 16:18:08 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086554#comment-15086554
 ]


Jason Gustafson commented on KAFKA-3068:
----------------------------------------

[~ijuma], [~enothereska] Introducing a notion of cluster identity seems like a 
good idea to solve this problem more generally. I'm not sure I understand the 
point about not having enough bootstrap brokers though. The cached metadata 
always contains the full list of brokers in the cluster (i.e. those registered 
in ZooKeeper). The only time we might need to dip into an additional set is 
when all of the known brokers simultaneously become unreachable. When that 
happens, it seems to make more sense to revert to the list of bootstrap brokers 
rather than attempting to reach a broker which we discovered previously but which 
had itself become unreachable at some point (if it was still reachable on the last 
metadata refresh, then it would be contained in the metadata). And yes, the 
bootstrap brokers can also move to another cluster, but I would consider this a 
user configuration error rather than an application error. You can hardly fault 
an application for trying to connect to the endpoints that had been explicitly 
configured, but you might if it connects to a broker it previously discovered 
which had been shut down and moved to another cluster. Does that make sense?
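
To make the fallback order concrete, here is a minimal sketch of what I have in 
mind. It is not the actual NetworkClient code: the class name, the chooseNode 
method, and the isReachable predicate are all hypothetical, and brokers are 
represented as plain "host:port" strings for brevity. The real 
leastLoadedNode() also weighs in-flight requests, which is ignored here.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch, not the real NetworkClient implementation.
public class NodeSelectionSketch {

    /**
     * Picks a node to contact for a metadata refresh.
     *
     * Order of preference:
     *   1. a reachable broker from the current cached metadata;
     *   2. a reachable broker from the explicitly configured bootstrap list.
     *
     * Brokers that were seen in the past but are no longer in the metadata
     * are deliberately never considered.
     */
    public static Optional<String> chooseNode(List<String> metadataNodes,
                                              List<String> bootstrapNodes,
                                              Predicate<String> isReachable) {
        // Prefer brokers the cluster currently advertises.
        for (String node : metadataNodes) {
            if (isReachable.test(node)) {
                return Optional.of(node);
            }
        }
        // All known brokers are unreachable: revert to the configured endpoints.
        for (String node : bootstrapNodes) {
            if (isReachable.test(node)) {
                return Optional.of(node);
            }
        }
        // Nothing is reachable right now; the caller should back off and retry.
        return Optional.empty();
    }
}
```

With this ordering, the only endpoints the client ever dials are ones the 
cluster currently advertises or ones the user configured, so a host that was 
repurposed for another cluster is reached only if it is still in the bootstrap 
list.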

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-3068
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3068
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.9.0.0
>            Reporter: Jun Rao
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick any broker that we 
> have seen before (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
