[ 
https://issues.apache.org/jira/browse/KAFKA-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310955#comment-14310955
 ] 

Jay Kreps commented on KAFKA-1155:
----------------------------------

[~nehanarkhede] is this still a problem? Seems like a serious issue...

> Kafka server can miss zookeeper watches during long zkclient callbacks
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-1155
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1155
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.0, 0.8.1
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Critical
>
> When a ZooKeeper watch fires, zkclient invokes the blocking user callback and
> only re-registers the watch after the callback returns. This leaves a
> possibly large window of time during which Kafka has no watches registered on
> the desired ZooKeeper paths and hence can miss important state changes (on
> the controller). Note that even though ZooKeeper has a read-and-set-watch
> API, there is always a window of time between the watch firing, the callback,
> and the read-and-set-watch API call. Because of the zkclient wrapper, it is
> difficult to handle this properly in the Kafka code unless we use the
> ZooKeeper client directly. One way around this issue is to keep timestamps on
> the paths and, when a watch fires, check whether the timestamp in ZooKeeper
> differs from the one held by the callback handler.
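
The timestamp check described above can be sketched roughly as follows. This is only an in-memory simulation under stated assumptions, not Kafka or zkclient code: `FakeZNode`, `TimestampCheckingHandler`, and every method name here are hypothetical, and the node's `mtime` counter stands in for whatever timestamp would be stored on the path. The slow callback writes to the node itself to imitate a change that arrives while no watch is registered.

```python
class FakeZNode:
    """In-memory stand-in for a ZooKeeper path (hypothetical, not a real client)."""

    def __init__(self, data):
        self.data = data
        self.mtime = 0          # plays the role of the timestamp on the path
        self._watch = None      # at most one one-shot watch, as in ZooKeeper

    def read(self):
        """Plain read: returns (data, mtime) without arming a watch."""
        return self.data, self.mtime

    def read_and_set_watch(self, watch):
        """Read the node and arm a one-shot watch in the same call."""
        self._watch = watch
        return self.data, self.mtime

    def write(self, data):
        """Update the node; fire and clear the watch if one is armed."""
        self.data = data
        self.mtime += 1
        watch, self._watch = self._watch, None
        if watch is not None:
            watch()


class TimestampCheckingHandler:
    """Re-registers the watch only after the callback, then compares timestamps."""

    def __init__(self, node, callback):
        self.node = node
        self.callback = callback
        self.handled = []           # data values the handler has processed
        self.missed_detected = 0    # changes found only via the timestamp check

    def start(self):
        self.node.read_and_set_watch(self.on_watch)

    def _process(self, data):
        self.callback(data)         # blocking user callback
        self.handled.append(data)

    def on_watch(self):
        # The watch has fired and is no longer armed (the window from the issue).
        data, seen = self.node.read()
        self._process(data)
        # Re-register, then compare the timestamp in the node with the one
        # the callback worked from.
        data, current = self.node.read_and_set_watch(self.on_watch)
        if current != seen:
            # The node changed while no watch was registered: catch up.
            self.missed_detected += 1
            self._process(data)


node = FakeZNode("v0")


def slow_callback(data):
    # Imitates a state change arriving during a long callback.
    if data == "v1":
        node.write("v2")


handler = TimestampCheckingHandler(node, slow_callback)
handler.start()
node.write("v1")
print(handler.handled, handler.missed_detected)
```

In this run the write of `v2` happens while no watch is armed, so no watch fires for it; the handler still processes it because the timestamp read on re-registration differs from the one it saw before the callback. A real implementation would also need to handle changes that arrive during the catch-up step, which this sketch leaves to the re-armed watch.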



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
