[
https://issues.apache.org/jira/browse/CURATOR-653?focusedWorklogId=815197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-815197
]
ASF GitHub Bot logged work on CURATOR-653:
------------------------------------------
Author: ASF GitHub Bot
Created on: 10/Oct/22 13:01
Start Date: 10/Oct/22 13:01
Worklog Time Spent: 10m
Work Description: XComp commented on PR #398:
URL: https://github.com/apache/curator/pull/398#issuecomment-1273281727
> @XComp Thank you very much for the comments. Could you send a pull request
> based on the current patch? I don't know whether I can directly merge on the
> forked repository but at least I can submit the patch when you made it :)
I created PR #436 but couldn't base it onto this PR. I handled my review
comments individually to make it easier to select relevant changes. The last
commit is a bigger refactoring of the test. You might want to leave that out if
you think that it's too much of a change.
Issue Time Tracking
-------------------
Worklog Id: (was: 815197)
Time Spent: 1h 10m (was: 1h)
> Double leader for LeaderLatch
> -----------------------------
>
> Key: CURATOR-653
> URL: https://issues.apache.org/jira/browse/CURATOR-653
> Project: Apache Curator
> Issue Type: Task
> Components: Recipes
> Reporter: Zili Chen
> Assignee: Zili Chen
> Priority: Major
> Fix For: 5.4.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Reported by @woaishixiaoxiao:
> When I use LeaderLatch to elect a leader, I observe a double-leader
> situation.
> The timeline is as follows:
> 1. The ZooKeeper cluster switches its leader node because of zxid overflow.
> The cluster is unavailable to clients while this happens.
> 2. Client A (not the leader before the zxid overflow) and client B (the
> leader before the zxid overflow) enter the suspended state. Client B sets
> its leader status to false.
> 3. The ZooKeeper cluster completes its leader election and returns to
> normal.
> 4. Client A enters the reconnected state and calls the reset function,
> setting its leader status to false.
> 5. Client B enters the reconnected state and calls the reset function,
> setting its leader status to false and deleting its old path.
> 6. Client A receives the preNodeDeleteEvent and calls getChildren on the
> ZooKeeper server. It finds that its node has the smallest sequence number
> and sets itself as leader.
> 7. Client B creates a new ephemeral node and calls getChildren on the
> ZooKeeper server. It finds that its node does not have the smallest
> sequence number and watches for the deletion of the preceding node.
> 8. Client A deletes its old path.
> 9. Client B receives the preNodeDeleteEvent and calls getChildren on the
> ZooKeeper server. It finds that its node has the smallest sequence number
> and sets itself as leader.
> 10. Client A creates a new ephemeral node and calls getChildren on the
> ZooKeeper server. It finds that its node does not have the smallest
> sequence number and watches for the deletion of the preceding node, but it
> does not set itself to the non-leader state. Because of step 6, A is still
> in the leader state.
> 11. Clients A and B are now both leader at the same time.
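The timeline above can be replayed in a small single-threaded model to make the interleaving easier to follow. This is only a sketch under stated assumptions: it does not use or reproduce Curator's code, and every name in it (`DoubleLeaderTimeline`, `Client`, `checkLeadership`, `replay`, the `clearWhenNotFirst` flag) is hypothetical. The parent znode is modeled as a sorted map of sequence numbers, and each numbered comment refers to the corresponding step of the report. The `clearWhenNotFirst` flag illustrates one possible guard within this model; it is not claimed to be what the actual patch does.

```java
import java.util.TreeMap;

// Toy replay of the reported timeline. Not Curator code; all names are hypothetical.
class DoubleLeaderTimeline {
    // Models the latch parent znode: sequence number -> owning client name.
    private final TreeMap<Integer, String> children = new TreeMap<>();
    private int nextSeq = 0;

    static final class Client {
        final String name;
        Integer ourSeq;         // sequence number of the node this client currently owns
        boolean hasLeadership;  // local leader flag
        Client(String name) { this.name = name; }
    }

    // Models creating a new ephemeral sequential node for the client.
    void createNode(Client c) {
        c.ourSeq = nextSeq++;
        children.put(c.ourSeq, c.name);
    }

    void deleteNode(int seq) {
        children.remove(seq);
    }

    // Models "getChildren, then decide": lowest sequence number wins.
    void checkLeadership(Client c, boolean clearWhenNotFirst) {
        if (children.firstKey().equals(c.ourSeq)) {
            c.hasLeadership = true;
        } else if (clearWhenNotFirst) {
            c.hasLeadership = false; // hypothetical guard, see lead-in
        }
        // Otherwise the client only watches its predecessor; the flag is untouched.
    }

    // Returns { A.hasLeadership, B.hasLeadership } after step 10.
    static boolean[] replay(boolean clearWhenNotFirst) {
        DoubleLeaderTimeline zk = new DoubleLeaderTimeline();
        Client a = new Client("A");
        Client b = new Client("B");

        // Before the outage: B owns the lowest node and is leader, A follows.
        zk.createNode(b);
        zk.createNode(a);
        b.hasLeadership = true;
        int oldB = b.ourSeq;
        int oldA = a.ourSeq;

        b.hasLeadership = false;                  // 2. suspended: B drops leadership
        a.hasLeadership = false;                  // 4. A reconnects, reset clears the flag
        b.hasLeadership = false;                  // 5. B reconnects, reset clears the flag
        zk.deleteNode(oldB);                      //    and deletes B's old node
        zk.checkLeadership(a, clearWhenNotFirst); // 6. A's old node is now the lowest
        zk.createNode(b);                         // 7. B creates a new node,
        zk.checkLeadership(b, clearWhenNotFirst); //    is not lowest, watches A's old node
        zk.deleteNode(oldA);                      // 8. A deletes its old node
        zk.checkLeadership(b, clearWhenNotFirst); // 9. B's new node is now the lowest
        zk.createNode(a);                         // 10. A creates a new node,
        zk.checkLeadership(a, clearWhenNotFirst); //     is not lowest, watches B's node
        return new boolean[] { a.hasLeadership, b.hasLeadership };
    }

    public static void main(String[] args) {
        boolean[] reported = replay(false);
        System.out.println("as reported: A=" + reported[0] + " B=" + reported[1]);
        boolean[] guarded = replay(true);
        System.out.println("with guard:  A=" + guarded[0] + " B=" + guarded[1]);
    }
}
```

In this model, `replay(false)` ends with both flags set, matching step 11, because the flag A set in step 6 is never cleared when A later finds it is not first. With the hypothetical guard enabled, only B remains leader.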