[
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088200#comment-15088200
]
Ewen Cheslack-Postava commented on KAFKA-3068:
----------------------------------------------
[~junrao] It's definitely a real issue -- see
https://issues.apache.org/jira/browse/KAFKA-1843?focusedCommentId=14517659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14517659
where you can get into unrecoverable situations if a Kafka node is
decommissioned or fails outright. This doesn't necessarily seem like
it would be that rare to me -- sure, it requires there to be failures, but any
app that produces to a single topic could encounter a case like this since
it'll have a very limited set of brokers to work with. Multiple people reported
encountering this issue, although as far as I've seen it may only have been in
testing environments:
http://mail-archives.apache.org/mod_mbox/kafka-users/201505.mbox/%3ccaj7fv7qbfqzrd0eodjmtrq725q+ziirnctwjhyhqldm+5cw...@mail.gmail.com%3E
It sounds like KAFKA-1843 is messy enough that it needs a more complete
discussion to resolve and maybe a KIP. Just waiting on the old list doesn't
help in the situation described above if there's an actual hardware failure
that takes the node out entirely -- eventually you need to try something else.
If we're going to change anything now, I'd suggest either making the entries in
nodesEverSeen expire but at a longer interval than the metadata refresh or
falling back to bootstrap nodes. I think the former is probably the better
solution, though the latter may be easier for people to reason about until we
work out a more complete solution. The former seems nicer because it decouples
the expiration from other failures. Currently a broker failure that triggers a
metadata refresh can force that broker out of the set of nodes to try
immediately, which is why you can get into situations like the one linked above
(and in much less time than the metadata refresh interval since these events
force a metadata refresh).
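To make the first option concrete, here is a minimal sketch of what expiring nodesEverSeen entries could look like. This is a hypothetical illustration, not the actual NetworkClient code; the class name, TTL parameter, and methods are all assumptions. The key property is that the TTL is decoupled from the metadata refresh interval, so a broker that drops out of one metadata response is still retried for a while, but stale entries do eventually go away:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not Kafka's actual NetworkClient internals):
// remember nodes we have ever seen, but expire each entry after a TTL
// that is longer than the metadata refresh interval. A broker that
// vanishes from a single metadata response remains a fallback candidate
// for a while, yet cannot linger in the set forever.
class NodesEverSeenCache {
    private final long ttlMs; // chosen to span several metadata refresh intervals
    private final Map<Integer, Long> lastSeenMs = new HashMap<>(); // nodeId -> last-seen time

    NodesEverSeenCache(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // Record a node on every successful metadata response that mentions it.
    void markSeen(int nodeId, long nowMs) {
        lastSeenMs.put(nodeId, nowMs);
    }

    // A node is a valid fallback (e.g. for leastLoadedNode()) only while fresh.
    boolean isUsable(int nodeId, long nowMs) {
        Long seen = lastSeenMs.get(nodeId);
        return seen != null && nowMs - seen <= ttlMs;
    }

    // Drop expired entries so stale brokers are never selected.
    void expire(long nowMs) {
        lastSeenMs.values().removeIf(seen -> nowMs - seen > ttlMs);
    }
}
```

The point of the TTL being longer than the refresh interval is exactly the failure mode described above: a broker failure forces an immediate metadata refresh, which today can eject that broker from the candidate set at once; with a separate, longer TTL it stays retryable through transient failures.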
Ultimately, the problem boils down to the fact that we're not getting the
information we want from these metadata updates -- we get the info we need
about topic leaders, but we're also using this metadata for basic connectivity
to the cluster. That breaks down if you're only working with one or a small
number of topics and they end up with too few or crashed replicas. I think
caching old entries is not the ideal solution here, but it's a way to work
around the current situation. It would probably be better if metadata responses
could include a random subset of the current set of brokers so that rather than
relying on the original bootstrap servers we could get new, definitely valid
bootstrap brokers with each metadata refresh. (It might even be possible to
implement this just by having the broker return extra brokers in the metadata
response even if there aren't entries in the topic metadata section of the
response referencing those brokers; but I'm not sure if that'd break anything,
either in our clients or in third-party clients. It also doesn't fully fix the
problem since the linked issue can still occur, but it should only arise in a
tiny test cluster like the one in the linked report, not in practice.)
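The client-side half of that idea can be sketched as follows. This is an illustrative assumption, not existing Kafka code: the class and method names are invented, and it simply treats the broker list carried in each metadata response as a refreshed set of bootstrap candidates, independent of which brokers happen to lead partitions of the subscribed topics:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: keep a set of "bootstrap" candidates that is refreshed
// from each metadata response rather than pinned to the originally configured
// bootstrap.servers. Brokers reported by the cluster are known live members of
// the *same* cluster, unlike a stale nodesEverSeen entry whose host may since
// have been reassigned to a different cluster.
class BootstrapCandidates {
    private List<String> candidates; // host:port endpoints to try for metadata

    BootstrapCandidates(List<String> initialBootstrap) {
        this.candidates = new ArrayList<>(initialBootstrap);
    }

    // On each metadata response, replace the candidate set with the brokers the
    // cluster itself reported (a random subset would suffice). An empty list is
    // ignored so we never discard our only known contact points.
    void onMetadataResponse(List<String> brokersInResponse) {
        if (!brokersInResponse.isEmpty()) {
            candidates = new ArrayList<>(brokersInResponse);
        }
    }

    List<String> current() {
        return Collections.unmodifiableList(candidates);
    }
}
```

With this shape, the original bootstrap.servers list only matters for the very first connection; every refresh after that hands the client definitely-valid contact points.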
> NetworkClient may connect to a different Kafka cluster than originally
> configured
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 0.9.0.0
> Reporter: Jun Rao
> Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all
> brokers (id and ip) that the client has ever seen. If we can't find an
> available broker from the current Metadata, we will pick a broker that we
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that
> we have a broker with id 1 in a Kafka cluster. A producer client remembers
> this broker in nodesEverSeen. At some point, we bring down this broker and
> use the host in a different Kafka cluster. Then, the producer client uses
> this broker from nodesEverSeen to refresh metadata. It will find the metadata
> in a different Kafka cluster and start producing data there.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)