[ https://issues.apache.org/jira/browse/KAFKA-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310955#comment-14310955 ]
Jay Kreps commented on KAFKA-1155:
----------------------------------
[~nehanarkhede] is this still a problem? Seems like a serious issue...
> Kafka server can miss zookeeper watches during long zkclient callbacks
> ----------------------------------------------------------------------
>
> Key: KAFKA-1155
> URL: https://issues.apache.org/jira/browse/KAFKA-1155
> Project: Kafka
> Issue Type: Bug
> Components: controller
> Affects Versions: 0.8.0, 0.8.1
> Reporter: Neha Narkhede
> Assignee: Neha Narkhede
> Priority: Critical
>
> When a ZooKeeper watch fires, zkclient invokes the blocking user callback and
> re-registers the watch only after the callback returns. This leaves a
> potentially large window during which Kafka has no watch registered on
> the desired ZooKeeper paths and can therefore miss important state changes (on
> the controller). Note that even though ZooKeeper has a read-and-set-watch
> API, there is always a window between the watch firing, the callback, and
> the read-and-set-watch call. Because of the zkclient wrapper, this is
> difficult to handle properly in the Kafka code unless we use the ZooKeeper
> client directly. One workaround is to keep timestamps on the paths and, when
> a watch fires, check whether the timestamp in ZooKeeper differs from the one
> held by the callback handler.
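
The timestamp workaround described in the issue could look roughly like the sketch below. It is only an illustration under stated assumptions, not Kafka or zkclient code: FakeZk is a hypothetical in-memory stand-in for ZooKeeper, the "timestamp" is a logical counter bumped on every write, and the names TimestampCheckingHandler and on_watch_fired are invented for the example. The handler remembers the timestamp it last processed and, when a watch fires, re-reads the path and compares.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FakeZk:
    """Hypothetical in-memory stand-in for ZooKeeper; not a real client API."""
    nodes: Dict[str, Tuple[str, int]] = field(default_factory=dict)
    clock: int = 0  # logical clock used as the per-path timestamp

    def write(self, path: str, value: str) -> None:
        # Every write bumps the clock, so each update gets a new timestamp.
        self.clock += 1
        self.nodes[path] = (value, self.clock)

    def read(self, path: str) -> Tuple[str, int]:
        # Returns (value, timestamp) for the path.
        return self.nodes[path]


class TimestampCheckingHandler:
    """Remembers the last timestamp it processed for a single path."""

    def __init__(self, zk: FakeZk, path: str) -> None:
        self.zk = zk
        self.path = path
        value, stamp = zk.read(path)
        self.last_seen_stamp = stamp
        self.applied: List[str] = [value]

    def on_watch_fired(self) -> bool:
        # Re-read the path and compare timestamps instead of trusting
        # that exactly one change happened since the last callback.
        value, stamp = self.zk.read(self.path)
        if stamp == self.last_seen_stamp:
            return False  # nothing new since the last processed state
        self.applied.append(value)
        self.last_seen_stamp = stamp
        return True


zk = FakeZk()
zk.write("/controller", "broker-1")
handler = TimestampCheckingHandler(zk, "/controller")

zk.write("/controller", "broker-2")  # this write fires the watch
zk.write("/controller", "broker-3")  # lands while no watch is registered

first = handler.on_watch_fired()   # sees a newer timestamp, applies latest state
second = handler.on_watch_fired()  # timestamp unchanged, nothing to do
```

In this sketch the handler never observes the intermediate value "broker-2"; the comparison only tells it that the state moved on and lets it pick up the latest value. Whether that is sufficient for the controller depends on whether it needs every transition or only the current state.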
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)