shanthoosh edited a comment on issue #1137: SAMZA-2301: Improve zookeeper metadata-store implementation
URL: https://github.com/apache/samza/pull/1137#issuecomment-521446465

> If it is latter, when the new server catches up, it should trigger another data change notification which will be eventually picked up by the change listener.

Merely logging and ignoring the data-change event will lead to data inconsistency and missed ZooKeeper events. Consider the following sequence:

1. Follower (P1) receives the JobModelVersion change notification from server S1 in the ensemble, which had received the majority-quorum write.
2. Before follower (P1) reads the JobModel, it is disconnected from ZooKeeper server S1 and successfully reconnects to ZooKeeper server S2.
3. On reconnection, follower (P1) re-creates all the data-node and child-node watches.
4. If P1 creates the watch after the JobModelVersion write has propagated to ZooKeeper server S2, but before the JobModel write has, then no further notification will be propagated to the follower.

So logging alone wouldn't solve the problem. I preferred the retry since this can happen for any ZooKeeper write. Alternatively, we can always read the `JobModel` for the agreed barrier version before starting the SamzaContainer (and remove the retry).
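
To make the retry argument concrete, here is a minimal sketch of a bounded retry around a read that may hit a lagging server. It is not the code in this PR: `JobModelReadRetry`, `LaggingServer`, `readWithRetry`, and the key `jobModels/2` are hypothetical names used only for illustration, and the in-memory map stands in for a ZooKeeper server that has not yet received the JobModel write.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative only: none of these names are Samza or ZooKeeper APIs. */
final class JobModelReadRetry {

    /**
     * Calls reader up to maxAttempts times, sleeping backoffMillis between
     * attempts, and returns the first non-empty result. Returns empty if the
     * value never shows up or the thread is interrupted while waiting.
     */
    static Optional<String> readWithRetry(Function<String, Optional<String>> reader,
                                          String key, int maxAttempts, long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> value = reader.apply(key);
            if (value.isPresent()) {
                return value;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    /**
     * Hypothetical stand-in for server S2: the first lagReads reads return
     * empty, as if the JobModel write had not yet propagated to it.
     */
    static final class LaggingServer {
        private final Map<String, String> data = new ConcurrentHashMap<>();
        private int remainingLag;

        LaggingServer(int lagReads) {
            this.remainingLag = lagReads;
        }

        void write(String key, String value) {
            data.put(key, value);
        }

        synchronized Optional<String> read(String key) {
            if (remainingLag > 0) {
                remainingLag--;
                return Optional.empty();
            }
            return Optional.ofNullable(data.get(key));
        }
    }

    public static void main(String[] args) {
        LaggingServer s2 = new LaggingServer(2);      // behind for two reads
        s2.write("jobModels/2", "job-model-v2");      // hypothetical key and payload
        System.out.println(readWithRetry(s2::read, "jobModels/2", 5, 10L));
    }
}
```

A single read in step 4 would come back empty and, with logging alone, nothing would prompt a second read, since no further watch event arrives. The bounded retry covers that window; the alternative mentioned above (always reading the `JobModel` for the agreed barrier version before starting the container) would move the same read to a point where the write is expected to be visible.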
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
