[
https://issues.apache.org/jira/browse/ZOOKEEPER-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695436#comment-16695436
]
Michael K. Edwards commented on ZOOKEEPER-3145:
-----------------------------------------------
Fix needed for 3.5.5?
> Potential watch missing issue due to stale pzxid when replaying CloseSession
> txn with fuzzy snapshot
> ----------------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3145
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3145
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.4, 3.6.0, 3.4.13
> Reporter: Fangmin Lv
> Assignee: Fangmin Lv
> Priority: Critical
> Labels: pull-request-available
> Fix For: 3.6.0
>
> Time Spent: 3h 40m
> Remaining Estimate: 0h
>
> This is another issue I found recently. We haven't seen this problem in prod
> (or maybe we just haven't noticed it).
>
> Currently, CloseSession is not idempotent: executing CloseSession twice
> won't produce the same result.
>
> The problem is that closeSession only checks which ephemeral nodes are
> associated with that session based on the current state. Nodes deleted while
> the fuzzy snapshot was being taken won't be deleted again when the txn is
> replayed.
>
> This looks fine, since the nodes are already gone, but there is a problem
> with the pzxid of the parent node. The snapshot is taken fuzzily, so it's
> possible that the parent had already been serialized while the nodes were
> being deleted by the closeSession txn. The pzxid restored from the snapshot
> will not be updated when replaying the closeSession txn, because the replay
> doesn't know which paths were deleted, so it won't patch the pzxid the way
> we did for deleteNode in ZOOKEEPER-3125.
>
> The inconsistent pzxid can lead to missed watch notifications when a client
> reconnects with setWatches, because the pzxid is stale.
>
> This JIRA is going to fix those issues by adding the CloseSessionTxn, which
> records all the nodes deleted in that CloseSession txn, so that we know
> which nodes to update when replaying the txn.
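
The scenario described above can be sketched with a toy model. This is not
ZooKeeper's DataTree code: the class, its methods, the paths ("/app",
"/app/lock") and the zxid values are all hypothetical, chosen only to show why
a replay that carries the deleted paths can patch the parent's pzxid while a
replay that rediscovers ephemerals from the current state cannot. The
reconnect check is a simplified stand-in for the pzxid comparison done when a
client re-registers child watches with setWatches.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not ZooKeeper's actual implementation.
class CloseSessionReplaySketch {

    /** Toy tree: maps each existing path to its pzxid (zxid of its last child change). */
    static final class Tree {
        final Map<String, Long> pzxidByPath = new HashMap<>();

        static String parentOf(String path) {
            int slash = path.lastIndexOf('/');
            return slash <= 0 ? "/" : path.substring(0, slash);
        }

        /** Delete a path; patch the parent's pzxid even if the node is already gone. */
        void deleteNode(String path, long zxid) {
            pzxidByPath.remove(path);
            pzxidByPath.computeIfPresent(parentOf(path), (p, old) -> Math.max(old, zxid));
        }

        /**
         * Replay a close-session txn over the given paths. Without recorded paths the
         * caller can only pass ephemerals still present in the tree; with a txn that
         * records them, it can pass every path the original execution deleted.
         */
        void replayCloseSession(List<String> paths, long zxid) {
            for (String path : paths) {
                deleteNode(path, zxid);
            }
        }

        /** Simplified reconnect check: fire if the node is gone or its children changed later. */
        boolean childWatchFiresOnReconnect(String path, long clientLastZxid) {
            Long pzxid = pzxidByPath.get(path);
            return pzxid == null || pzxid > clientLastZxid;
        }
    }

    /** State after loading a fuzzy snapshot: parent serialized early, child already deleted. */
    static Tree restoredFromFuzzySnapshot() {
        Tree t = new Tree();
        t.pzxidByPath.put("/", 1L);
        t.pzxidByPath.put("/app", 5L); // stale: the real delete happened at zxid 10
        return t;
    }

    public static void main(String[] args) {
        long closeSessionZxid = 10L;
        long clientLastZxid = 7L; // client disconnected before the session was closed

        // Replay without recorded paths: "/app/lock" is no longer in the tree,
        // so nothing is found and the parent's pzxid is never patched.
        Tree legacy = restoredFromFuzzySnapshot();
        legacy.replayCloseSession(Collections.<String>emptyList(), closeSessionZxid);
        System.out.println("without paths, watch fires: "
                + legacy.childWatchFiresOnReconnect("/app", clientLastZxid));

        // Replay with the paths recorded in the txn: the parent is patched.
        Tree fixed = restoredFromFuzzySnapshot();
        fixed.replayCloseSession(Collections.singletonList("/app/lock"), closeSessionZxid);
        System.out.println("with paths, watch fires: "
                + fixed.childWatchFiresOnReconnect("/app", clientLastZxid));
    }
}
```

In this sketch the replay itself is unchanged between the two cases; the only
difference is whether the list of deleted paths is available, which is the
information the proposed CloseSessionTxn is meant to carry.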