[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695436#comment-16695436
 ] 

Michael K. Edwards commented on ZOOKEEPER-3145:
-----------------------------------------------

Is this fix needed for 3.5.5?

> Potential watch missing issue due to stale pzxid when replaying CloseSession 
> txn with fuzzy snapshot
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3145
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3145
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.4, 3.6.0, 3.4.13
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.6.0
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> This is another issue I found recently. We haven't seen this problem in prod 
> (or maybe we just haven't noticed it).
>  
> Currently, CloseSession is not idempotent: executing CloseSession twice 
> won't produce the same result.
>  
> The problem is that closeSession only checks which ephemeral nodes are 
> associated with that session based on the current state. Nodes deleted while 
> a fuzzy snapshot was being taken won't be deleted again when the txn is 
> replayed.
>  
> This looks fine, since the nodes are already gone, but there is a problem 
> with the pzxid of the parent node. The snapshot is taken fuzzily, so it's 
> possible that the parent had already been serialized while the nodes were 
> being deleted by the closeSession txn. The pzxid loaded from the snapshot 
> will not be updated when replaying the closeSession txn, because the replay 
> doesn't know which paths were deleted, so it won't patch the pzxid the way 
> we did for deleteNode in ZOOKEEPER-3125.
>  
> The stale pzxid can cause watch notifications to be missed when a client 
> reconnects with setWatches.
>  
> This JIRA is going to fix those issues by adding a CloseSessionTxn, which 
> records all the nodes deleted in that CloseSession txn, so that we know 
> which nodes to update when replaying the txn.
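
The scenario in the description can be sketched with a toy model. This is not ZooKeeper's DataTree or the actual patch: the class ToyTree, its methods, the path "/locks/a", the session id and the zxids are all hypothetical, and the child-watch check is reduced to a bare "pzxid > client's last zxid" comparison as a simplifying assumption. It only illustrates why a replay that looks up ephemerals in the current state leaves the parent's pzxid stale, while a replay that is handed the deleted paths can patch it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CloseSessionReplayDemo {

    /** Toy stand-in for a data tree (hypothetical; NOT ZooKeeper's DataTree). */
    static final class ToyTree {
        final Map<String, Long> ephemeralOwner = new HashMap<>(); // path -> session id
        final Map<String, Long> pzxid = new HashMap<>();          // parent path -> pzxid

        /** Replay that only consults current state, as the description says closeSession does. */
        void replayCloseSessionByLookup(long sessionId, long zxid) {
            List<String> paths = new ArrayList<>();
            for (Map.Entry<String, Long> e : ephemeralOwner.entrySet()) {
                if (e.getValue() == sessionId) {
                    paths.add(e.getKey());
                }
            }
            // If the children are already gone, paths is empty and no pzxid is touched.
            replayCloseSessionWithPaths(paths, zxid);
        }

        /** Replay that is told which paths the txn deleted (the idea behind CloseSessionTxn). */
        void replayCloseSessionWithPaths(List<String> deletedPaths, long zxid) {
            for (String path : deletedPaths) {
                ephemeralOwner.remove(path); // no-op if the node is already gone
                // Patch the parent's pzxid even when the child is absent; never move it backwards.
                pzxid.merge(parentOf(path), zxid, Long::max);
            }
        }

        /** Simplified assumption: a child watch fires on reconnect if pzxid is newer. */
        boolean childWatchFires(String parent, long clientLastZxid) {
            return pzxid.getOrDefault(parent, 0L) > clientLastZxid;
        }
    }

    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    /**
     * State loaded from a fuzzy snapshot: "/locks" was serialized with pzxid 1,
     * then the closeSession txn (zxid 2) deleted "/locks/a" before the child
     * was serialized, so the snapshot has a stale parent and no child.
     */
    static ToyTree fuzzySnapshot() {
        ToyTree tree = new ToyTree();
        tree.pzxid.put("/locks", 1L);
        return tree;
    }

    public static void main(String[] args) {
        long session = 7L;         // hypothetical session id
        long closeZxid = 2L;       // zxid of the closeSession txn
        long clientLastZxid = 1L;  // client saw zxid 1 before disconnecting

        ToyTree byLookup = fuzzySnapshot();
        byLookup.replayCloseSessionByLookup(session, closeZxid);
        System.out.println("lookup replay fires watch: "
                + byLookup.childWatchFires("/locks", clientLastZxid));

        ToyTree withPaths = fuzzySnapshot();
        withPaths.replayCloseSessionWithPaths(List.of("/locks/a"), closeZxid);
        System.out.println("path-carrying replay fires watch: "
                + withPaths.childWatchFires("/locks", clientLastZxid));
    }
}
```

In this sketch the lookup-based replay reports false (the notification would be missed) and the path-carrying replay reports true. Replaying the path-carrying version a second time leaves the state unchanged, which is the idempotence the description asks for.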



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
