[ https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112787#comment-15112787 ]
Ewen Cheslack-Postava commented on KAFKA-3068:
----------------------------------------------

I'm thinking more about this in combination with KAFKA-3112. In KAFKA-3112, you may end up with unresolvable nodes over time if your app runs for a long time and your bootstrap servers become stale. The more I think about this, the more these issues seem to ultimately come down to the fact that you have to specify a fixed list of bootstrap servers that must remain valid indefinitely. With the right setup you can work around this via, e.g., a VIP, load balancer, or round-robin DNS. But I'm wondering for how many people this requires extra work because they are already using some other service discovery mechanism. Today, if you want to pull bootstrap data from somewhere else, you need to do it at app startup and commit to using that fixed set of servers.

I'm hesitant to suggest adding more points for extension, but wouldn't this be addressed more generally if there were a hook to get a list of bootstrap servers, and we invoked it at startup, any time we run out of options, or after failing to connect for too long? The default can just be to use the configured bootstrap servers list, but if you want to grab your list of servers from ZK, DNS, Consul, or whatever your system of choice is, you could easily hook those in instead and avoid this entire problem.

> NetworkClient may connect to a different Kafka cluster than originally
> configured
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-3068
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3068
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.9.0.0
>            Reporter: Jun Rao
>            Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all
> brokers (id and ip) that the client has ever seen. If we can't find an
> available broker from the current Metadata, we will pick a broker that we
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that
> we have a broker with id 1 in a Kafka cluster. A producer client remembers
> this broker in nodesEverSeen. At some point, we bring down this broker and
> reuse the host in a different Kafka cluster. Then, the producer client uses
> this broker from nodesEverSeen to refresh metadata. It will find the metadata
> of a different Kafka cluster and start producing data there.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
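The hook Ewen proposes above could look roughly like the sketch below. This is a hypothetical illustration, not an API that exists in Kafka: the `BootstrapResolver` interface, its method name, and `StaticListResolver` are all invented here. The idea is that the default implementation just re-parses the static `bootstrap.servers` string, while users could plug in a resolver backed by ZK, DNS, Consul, etc., and the client would invoke it at startup and whenever it runs out of usable nodes.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Hypothetical extension point: the client would call this at startup,
 * whenever it runs out of candidate nodes, or after failing to connect
 * for too long, instead of relying on a fixed list captured once.
 */
interface BootstrapResolver {
    List<InetSocketAddress> resolveBootstrapServers();
}

/**
 * Default behavior: re-parse the configured "host:port,host:port" list
 * on every invocation. Addresses are left unresolved so that DNS lookup
 * happens at connect time, not at parse time.
 */
class StaticListResolver implements BootstrapResolver {
    private final String bootstrapServers;

    StaticListResolver(String bootstrapServers) {
        this.bootstrapServers = bootstrapServers;
    }

    @Override
    public List<InetSocketAddress> resolveBootstrapServers() {
        return Arrays.stream(bootstrapServers.split(","))
                .map(String::trim)
                .map(s -> {
                    // Split on the last ':' so IPv6-style literals with
                    // embedded colons are not broken at the first one.
                    int colon = s.lastIndexOf(':');
                    return InetSocketAddress.createUnresolved(
                            s.substring(0, colon),
                            Integer.parseInt(s.substring(colon + 1)));
                })
                .collect(Collectors.toList());
    }
}
```

A service-discovery-backed implementation would query its backend inside `resolveBootstrapServers()` and return whatever brokers are currently advertised, which would sidestep both the stale-bootstrap problem of KAFKA-3112 and the reused-host problem described in this issue.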