Recently, we found a serious bug in ZkClient (actually a bug in the Apache
ZooKeeper client itself) that can bring down brokers and consumers during
a ZooKeeper push.

We're running Kafka and ZooKeeper in an AWS EC2 environment. Each ZooKeeper
instance is bound to an EIP to give it a static hostname, which means that
even if the EC2 instance is terminated and replaced with a new one, it
keeps the same hostname, but the private IP that the hostname resolves to
can change.

The scenario is this: if we do a rolling push of all ZooKeeper server
instances, terminating each one and waiting until its replacement joins
the quorum before moving on to the next, ZkClient will eventually try to
connect to the old IP addresses, which no longer exist, because of DNS
caching on the Apache ZooKeeper client side. Please refer to
https://issues.apache.org/jira/browse/ZOOKEEPER-338
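
To illustrate the caching behavior, here is a minimal Java sketch. It is
not the code from ZkClient or from the ZooKeeper client; the class and
method names are hypothetical, and it only shows the general difference
between resolving a hostname once and re-resolving it before each connect
attempt. Note that the JVM's own DNS cache (the networkaddress.cache.ttl
security property) may also affect what a fresh lookup returns.

```java
import java.net.InetSocketAddress;

// Hypothetical illustration only; not taken from ZkClient or ZooKeeper.
class DnsCachingSketch {

    // Resolves the hostname at construction time. The resulting IP stays
    // fixed inside the object, even if DNS later points somewhere else.
    static InetSocketAddress resolveOnce(String host, int port) {
        return new InetSocketAddress(host, port);
    }

    // Stores only the hostname and port; no DNS lookup is performed here.
    static InetSocketAddress keepHostname(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    // Builds a new address from the stored hostname, which triggers a new
    // lookup. A client could call this before every reconnect attempt.
    static InetSocketAddress reResolve(InetSocketAddress addr) {
        return new InetSocketAddress(addr.getHostString(), addr.getPort());
    }

    public static void main(String[] args) {
        // "localhost" and 2181 are placeholder values for the example.
        InetSocketAddress cached = resolveOnce("localhost", 2181);
        InetSocketAddress byName = keepHostname("localhost", 2181);

        System.out.println("resolved once:  " + cached.getAddress());
        System.out.println("unresolved:     " + byName.isUnresolved());
        System.out.println("fresh lookup:   " + reResolve(byName).getAddress());
    }
}
```

A client that holds on to the result of resolveOnce() for its whole
lifetime keeps reconnecting to the old IP, which matches what we see after
the rolling push.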

As a result, we have to restart the Kafka brokers and consumers to refresh
the DNS cache. To solve this problem, I sent the following pull request to
ZkClient:
https://github.com/sgroschupf/zkclient/pull/26

Please review the above PR. If a new version of ZkClient with this fix is
not released in time for the Kafka 0.8.2 release, I'd like Kafka to ship
an internally built ZkClient with the fix. I would really appreciate it.

Thank you
Best, Jae
