Kezhu Wang created ZOOKEEPER-4698:
-------------------------------------

             Summary: Persistent watch events lost after reconnection
                 Key: ZOOKEEPER-4698
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4698
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.8.1, 3.7.1
            Reporter: Kezhu Wang


I found this in reply to [apache#1950 
(comment)|https://github.com/apache/zookeeper/pull/1950#issuecomment-1553742525],
 but it turns out to be a known issue [apache#1106 
(comment)|https://github.com/apache/zookeeper/pull/1106#issuecomment-543860329].

I think it is worth noting separately in Jira for potential future discussion 
and a fix. I have pushed a [test 
case|https://github.com/kezhuw/zookeeper/commit/31d89e9829380559066fc2b83e3d38462380c5d4]
 for this. It fails as expected.

{noformat}
[ERROR] Failures: 
[ERROR]   WatchEventWhenAutoResetTest.testPersistentRecursiveWatch:237 do not receive a NodeDataChanged ==> expected: not <null>
[ERROR]   WatchEventWhenAutoResetTest.testPersistentWatch:211 do not receive a NodeDataChanged ==> expected: not <null>
{noformat}

It is hard to fix this with {{DataTree}} alone. Two independent comments 
[pointed|https://github.com/apache/zookeeper/pull/1106#issuecomment-1366449561] 
[out|https://github.com/kezhuw/zookeeper/commit/31d89e9829380559066fc2b83e3d38462380c5d4#diff-cfd09b7021c88da6631872e8a4a271f830162f7c5a63a140839ba029048493fdR227-R230]
 this. I guess we have to walk through the txn log to deliver a correct fix. 
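
To illustrate why walking a log can recover what the current tree cannot, 
here is a toy sketch. It is not ZooKeeper code: {{TxnReplaySketch}}, {{Txn}} 
and {{missedEvents}} are hypothetical names, and the log is just an in-memory 
list. It assumes the server knows the last zxid the client saw.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; these are NOT ZooKeeper's real txn log or DataTree classes. */
class TxnReplaySketch {
    /** One committed change: its zxid, the event it would fire, and the affected path. */
    static final class Txn {
        final long zxid;
        final String eventType;
        final String path;

        Txn(long zxid, String eventType, String path) {
            this.zxid = zxid;
            this.eventType = eventType;
            this.path = path;
        }
    }

    /** Returns the events a persistent watch on watchedPath missed after lastSeenZxid. */
    static List<String> missedEvents(List<Txn> log, long lastSeenZxid,
                                     String watchedPath, boolean recursive) {
        // Prefix that identifies descendants of the watched path.
        String childPrefix = watchedPath.equals("/") ? "/" : watchedPath + "/";
        List<String> missed = new ArrayList<>();
        for (Txn txn : log) {
            if (txn.zxid <= lastSeenZxid) {
                continue; // the client already saw this change before disconnecting
            }
            boolean exact = txn.path.equals(watchedPath);
            boolean below = recursive && txn.path.startsWith(childPrefix);
            if (exact || below) {
                missed.add(txn.eventType + " " + txn.path);
            }
        }
        return missed;
    }
}
```

In this model a child that is created and then deleted while the client is 
away still yields two events from the log, although the final tree shows no 
trace of it.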

{quote}
Watches will not be received while disconnected from a server. When a client 
reconnects, any previously registered watches will be reregistered and 
triggered if needed. In general this all occurs transparently. There is one 
case where a watch may be missed: a watch for the existence of a znode not yet 
created will be missed if the znode is created and deleted while disconnected.
{quote}

This is [what our programmer's guide 
says|https://zookeeper.apache.org/doc/r3.8.1/zookeeperProgrammers.html#ch_zkWatches].
 It is well known, at least to me, that we can lose some transient 
intermediate events across a reconnection. But with persistent watches we can 
lose more. This forces clients to rebuild their knowledge on reconnection.
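
As a rough sketch of what "rebuild their knowledge" could look like on the 
client side, the helper below diffs two snapshots of a watched subtree and 
synthesizes the events that may have been lost. {{WatchReconcile}} and 
{{diff}} are hypothetical names, not part of the ZooKeeper API. The maps are 
assumed to hold path -> mzxid (for example taken from {{Stat.getMzxid()}}) 
before the disconnect and after re-reading the subtree on reconnection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical client-side helper; not part of the ZooKeeper API. */
class WatchReconcile {
    /**
     * Compares two snapshots (path -> mzxid) and returns synthesized
     * events, ordered by path.
     */
    static List<String> diff(Map<String, Long> before, Map<String, Long> after) {
        // Sorted union of all paths seen in either snapshot.
        TreeSet<String> paths = new TreeSet<>(before.keySet());
        paths.addAll(after.keySet());

        List<String> events = new ArrayList<>();
        for (String path : paths) {
            Long oldZxid = before.get(path);
            Long newZxid = after.get(path);
            if (oldZxid == null) {
                events.add("NodeCreated " + path);
            } else if (newZxid == null) {
                events.add("NodeDeleted " + path);
            } else if (!oldZxid.equals(newZxid)) {
                events.add("NodeDataChanged " + path);
            }
        }
        return events;
    }
}
```

This only recovers the net difference: a node created and deleted while 
disconnected stays invisible, and a node deleted and recreated shows up as a 
single data change.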



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
