[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Lou updated ZOOKEEPER-4837:
---------------------------------
    Priority: Critical  (was: Major)

> Network issue prevents ephemeral node removal after session expiration
> ----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4837
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4837
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum, server
>            Reporter: Dimas Shidqi Parikesit
>            Priority: Critical
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In our testing cluster with the latest ZooKeeper version (66202cb), we 
> observed that sometimes an ephemeral node never gets deleted if there is a 
> network issue during the PROPOSAL request, even after the session expires. 
> This bug is closely related to ZOOKEEPER-2355, but that issue was not 
> entirely fixed by the previous patch. We also tested the related open PRs 
> (e.g., [https://github.com/apache/zookeeper/pull/2152] and 
> [https://github.com/apache/zookeeper/pull/1925] ) and confirmed that the 
> issue still exists with those proposed fixes applied.
>  
> Steps to reproduce this bug:
>  # Start a cluster with 3 servers: follower A, leader B, follower C
>  # Open a ZooKeeper client on server A
>  # Create an ephemeral node with the client
>  # Inject a network issue into all servers that causes a 
> SocketTimeoutException during readPacket if the packet is a PROPOSAL (a 
> minimal fault-injection sketch follows this list)
>  # Close the client
>  # Wait until the cluster is stable (the leader will change between B and C 
> several times)
>  # Remove the network issue from all servers
>  # Check every server for the ephemeral node. The node will still exist on 
> server A, but servers B and C will no longer have it.
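>  
> To clarify step 4, here is a minimal fault-injection sketch. It is 
> illustrative only; the names ProposalFaultInjector and maybeInjectFault are 
> not part of ZooKeeper or of our actual harness. The idea is to throw a 
> SocketTimeoutException from readPacket whenever the deserialized quorum 
> packet is a PROPOSAL:
>  
> import java.net.SocketTimeoutException;
> import org.apache.zookeeper.server.quorum.Leader;
> import org.apache.zookeeper.server.quorum.QuorumPacket;
>  
> public final class ProposalFaultInjector {
>     private volatile boolean faultEnabled = true;   // steps 4 and 7 toggle this
>  
>     // Call right after a packet has been deserialized in readPacket().
>     public void maybeInjectFault(QuorumPacket qp) throws SocketTimeoutException {
>         if (faultEnabled && qp.getType() == Leader.PROPOSAL) {
>             // Behave as if the PROPOSAL never arrived within the read timeout.
>             throw new SocketTimeoutException("injected timeout on PROPOSAL");
>         }
>     }
>  
>     public void disable() {                         // step 7: remove the network issue
>         faultEnabled = false;
>     }
> }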
>  
> Essentially, the bug is caused by loadDatabase loading a snapshot with a 
> higher zxid than the truncated log, causing fastForwardFromEdits to end up 
> at the wrong point when replaying the transactions. For example, suppose 
> one of the followers has a lastProcessedZxid of 0x200000001 and a latest 
> snapshot snapshot.200000001, and the leader sends a TRUNC with a zxid of 
> 0x100000002. truncateLog will truncate the follower's log to 0x100000002, 
> but loadDatabase will still load snapshot.200000001, so when 
> fastForwardFromEdits runs it sets the data tree to 0x200000001 instead of 
> 0x100000002.
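>  
> As a quick worked example of why this goes wrong: ZooKeeper packs the epoch 
> into the high 32 bits of a zxid and the counter into the low 32 bits, so 
> snapshot.200000001 (epoch 2) is strictly ahead of the TRUNC target 
> 0x100000002 (epoch 1). The self-contained snippet below just prints that 
> decomposition:
>  
> public final class ZxidExample {
>     static long epoch(long zxid)   { return zxid >>> 32; }
>     static long counter(long zxid) { return zxid & 0xffffffffL; }
>  
>     public static void main(String[] args) {
>         long lastProcessed = 0x200000001L;  // follower's lastProcessedZxid / snapshot.200000001
>         long truncZxid     = 0x100000002L;  // zxid carried by the leader's TRUNC
>  
>         System.out.printf("snapshot: epoch=%d counter=%d%n",
>                 epoch(lastProcessed), counter(lastProcessed));  // epoch=2 counter=1
>         System.out.printf("TRUNC   : epoch=%d counter=%d%n",
>                 epoch(truncZxid), counter(truncZxid));          // epoch=1 counter=2
>  
>         // The snapshot is newer than the truncation point, so loadDatabase()
>         // followed by fastForwardFromEdits() ends at 0x200000001, not 0x100000002.
>         System.out.println("snapshot ahead of TRUNC: " + (lastProcessed > truncZxid)); // true
>     }
> }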
>  
> We also attached a test case to reproduce this issue. Note that this test 
> case is still fairly flaky at the moment.
>  
> We propose to fix this by having truncateLog load the database from the 
> last snapshot taken before the last entry of the truncated log; see our 
> attached PR (a simplified sketch of the idea follows below). Of course, 
> this may not be the ideal solution and we welcome suggestions. Some other 
> potential solutions include: 
> (1) Disable fastForwardDatabase in shutdown
> (2) Run setLastProcessedZxid at the end of Learner's syncWithLeader 
> function if the packet is Leader.DIFF
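>  
> To illustrate the first proposal, here is a minimal sketch of the snapshot 
> selection we have in mind (illustrative names only, not the actual patch; 
> it also assumes an older snapshot such as snapshot.100000000 exists on 
> disk): when truncating to truncZxid, load the most recent snapshot whose 
> zxid is not past the truncation point rather than the newest snapshot.
>  
> import java.util.List;
> import java.util.OptionalLong;
>  
> public final class TruncateSnapshotChooser {
>     // snapshotZxids: zxids parsed from the snapshot file names, in any order.
>     static OptionalLong snapshotToLoadAfterTrunc(List<Long> snapshotZxids, long truncZxid) {
>         return snapshotZxids.stream()
>                 .mapToLong(Long::longValue)
>                 .filter(z -> z <= truncZxid)
>                 .max();
>     }
>  
>     public static void main(String[] args) {
>         // Scenario from this report: snapshot.200000001 exists and TRUNC asks for 0x100000002.
>         List<Long> snapshots = List.of(0x200000001L, 0x100000000L);
>         long chosen = snapshotToLoadAfterTrunc(snapshots, 0x100000002L).getAsLong();
>         System.out.println("load snapshot." + Long.toHexString(chosen));  // load snapshot.100000000
>     }
> }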
>  
> Your insights are very much appreciated. We will continue to follow up on 
> this issue until it is resolved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
