[
https://issues.apache.org/jira/browse/ZOOKEEPER-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chang Lou updated ZOOKEEPER-4837:
---------------------------------
Priority: Critical (was: Major)
> Network issue causes ephemeral node unremoved after the session expiration
> --------------------------------------------------------------------------
>
> Key: ZOOKEEPER-4837
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4837
> Project: ZooKeeper
> Issue Type: Bug
> Components: quorum, server
> Reporter: Dimas Shidqi Parikesit
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In our testing cluster with the latest ZooKeeper version (66202cb), we
> observed that sometimes an ephemeral node never gets deleted if there is a
> network issue during the PROPOSAL request, even after the session expires.
> This bug is essentially related to ZOOKEEPER-2355, but the issue was not
> entirely fixed in the previous patch. We also tested some related open PRs
> (e.g., [https://github.com/apache/zookeeper/pull/2152] and
> [https://github.com/apache/zookeeper/pull/1925]), and confirmed that the issue
> still exists with the proposed fixes applied.
>
> Steps to reproduce this bug:
> # Start a cluster with 3 servers, follower A, leader B, follower C
> # Open a zk client in server A
> # Create an ephemeral node in the client
> # Inject a network issue in all servers that causes a SocketTimeoutException
> during readPacket if the packet is a PROPOSAL
> # Close the client
> # Wait until the cluster is stable (the leader will change between B and C
> several times)
> # Remove the network issue from all servers
> # Check every server for the ephemeral node. The ephemeral node will still
> exist on server A, but servers B and C will no longer have it.
>
> Essentially, the bug is caused by loadDatabase loading a snapshot with a
> higher zxid than the truncated log, causing fastForwardFromEdits to fail when
> trying to replay the transactions. For example, if one of the followers has a
> lastProcessedZxid of 0x200000001 and a last snapshot of snapshot.200000001,
> and the leader sends a TRUNC with a zxid of 0x100000002, truncateLog will
> truncate the follower's log to 0x100000002. However, loadDatabase will load
> snapshot.200000001, so when fastForwardFromEdits runs, it will set the
> data tree to 0x200000001 instead of 0x100000002.
>
> We also attached a test case to reproduce this issue. Note that this test
> case is still pretty flaky at this moment.
>
> We propose fixing this case by loading the database, during truncateLog, from
> the last snapshot taken before the last entry of the truncated log. See the
> attached PR. Of course, this may not be the ideal solution, and we welcome
> suggestions. Some other potential solutions include:
> (1) Disable fastForwardDatabase in shutdown
> (2) Run setLastProcessedZxid at the end of Learner's syncWithLeader function
> if the packet is Leader.DIFF
>
> Your insights are very much appreciated. We will keep following up on this
> issue until it is resolved.
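
The snapshot/TRUNC mismatch in the quoted description can be illustrated with
a small standalone sketch. It is not ZooKeeper code: the class and method
names below are hypothetical, the two older snapshot zxids are invented for
the example, and treating "before" as "at or before" the TRUNC zxid is an
assumption. The only ZooKeeper fact relied on is that a zxid carries the epoch
in its high 32 bits and a counter in its low 32 bits.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names are ZooKeeper APIs.
final class SnapshotSelectionSketch {

    // A zxid packs the epoch into the high 32 bits ...
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // ... and a per-epoch counter into the low 32 bits.
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Behaviour described in the report: the newest snapshot is picked,
    // regardless of where the log was truncated.
    static OptionalLong newestSnapshot(List<Long> snapshotZxids) {
        return snapshotZxids.stream().mapToLong(Long::longValue).max();
    }

    // Direction of the proposed fix (assumed "at or before" semantics):
    // pick the newest snapshot that does not lie beyond the TRUNC zxid.
    static OptionalLong newestSnapshotAtOrBefore(List<Long> snapshotZxids, long truncZxid) {
        return snapshotZxids.stream()
                .mapToLong(Long::longValue)
                .filter(zxid -> zxid <= truncZxid)
                .max();
    }

    public static void main(String[] args) {
        // Only snapshot.200000001 comes from the report; the others are made up.
        List<Long> snapshots = List.of(0x0L, 0x100000001L, 0x200000001L);
        long truncZxid = 0x100000002L; // zxid sent with TRUNC in the report's example

        System.out.println("TRUNC epoch=" + epochOf(truncZxid)
                + " counter=" + counterOf(truncZxid));
        System.out.println("newest snapshot:          snapshot."
                + Long.toHexString(newestSnapshot(snapshots).getAsLong()));
        System.out.println("newest at or before TRUNC: snapshot."
                + Long.toHexString(newestSnapshotAtOrBefore(snapshots, truncZxid).getAsLong()));
    }
}
```

With these example values, the unrestricted choice lands on a snapshot from
epoch 2 even though the log now ends in epoch 1, which is the inconsistency
the report describes. The restricted choice stays within the truncated
history.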
--
This message was sent by Atlassian Jira
(v8.20.10#820010)