[
https://issues.apache.org/jira/browse/KAFKA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232645#comment-15232645
]
Jun Rao commented on KAFKA-3042:
--------------------------------
[~delbaeth], [~wushujames], a few things.
1. Supposedly after step 5), controller 3 will send the latest ZK version for
the ISR path to broker 2 through LeaderAndIsrRequest. That should stop the
warning on "Cached zkVersion...". It seems somehow that didn't happen. Could
you send the state-change log from broker 2 around that time? You probably want
to include the log from 5 mins before to 5 mins after the very first "Cached
zkVersion...". Could you also do the same for the controller logs on controller 1
and controller 3?
2. The controller log shows that controller 3 stopped at 01:05:23. Was broker 3
still up at that time?
3. We have discovered a few issues due to ZK session expiration, not all of
which have been fixed. So, in the short term, it would be good to avoid ZK
session expiration in the first place. You mentioned this may be due to a
network issue? How long did the network issue last? Another common cause of ZK
session expiration is broker GC. Do you have the GC logs from the brokers whose
sessions expired?
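
To illustrate point 1, here is a minimal Python sketch of a version-checked
(compare-and-set) ISR update. It is not Kafka's actual code: VersionedNode,
Broker, update_isr and on_leader_and_isr are hypothetical names made up for
this example, and the in-memory node only stands in for the ISR path in ZK.
It shows why a broker with a stale cached version keeps logging the warning
until something, such as a LeaderAndIsrRequest from the controller, gives it
the latest version.

```python
class VersionedNode:
    """In-memory stand-in for a ZK znode: data plus a version bumped on each write."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def conditional_set(self, data, expected_version):
        """Write only if expected_version matches; return (ok, current_version)."""
        if expected_version != self.version:
            return False, self.version
        self.data = data
        self.version += 1
        return True, self.version


class Broker:
    """Hypothetical broker that believes it is the leader and caches the znode version."""

    def __init__(self, cached_version):
        self.cached_version = cached_version

    def update_isr(self, node, new_isr):
        # Conditional write using the cached version, which may be stale.
        ok, version = node.conditional_set(new_isr, self.cached_version)
        if ok:
            self.cached_version = version
            return "updated"
        # A stale cache never fixes itself: every retry fails the same way.
        return ("Cached zkVersion %d not equal to that in zookeeper, "
                "skip updating ISR" % self.cached_version)

    def on_leader_and_isr(self, zk_version):
        # Assumption: the controller's request carries the latest version.
        self.cached_version = zk_version


node = VersionedNode({"leader": 2, "isr": [1, 2, 3]})
broker2 = Broker(cached_version=node.version)

# Someone else (e.g. a new controller) rewrites the path, bumping the version.
node.conditional_set({"leader": 2, "isr": [2, 3]}, node.version)

first = broker2.update_isr(node, {"leader": 2, "isr": [2]})   # stale, skipped
second = broker2.update_isr(node, {"leader": 2, "isr": [2]})  # still stale

# The controller pushes the current version; the next update goes through.
broker2.on_leader_and_isr(node.version)
third = broker2.update_isr(node, {"leader": 2, "isr": [2]})

print(first)
print(second)
print(third)
```

In this sketch the retries alone never succeed; only the version refresh
breaks the loop, which is why the state-change log on broker 2 matters here.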
> updateIsr should stop after failed several times due to zkVersion issue
> -----------------------------------------------------------------------
>
> Key: KAFKA-3042
> URL: https://issues.apache.org/jira/browse/KAFKA-3042
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.2.1
> Environment: jdk 1.7
> centos 6.4
> Reporter: Jiahongchao
> Attachments: controller.log, server.log.2016-03-23-01,
> state-change.log
>
>
> Sometimes a broker may repeatedly log
> "Cached zkVersion 54 not equal to that in zookeeper, skip updating ISR"
> I think this is because the broker considers itself the leader when in fact
> it is a follower.
> So after several failed tries, it needs to find out who the leader is.