[
https://issues.apache.org/jira/browse/ZOOKEEPER-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mutu updated ZOOKEEPER-4816:
----------------------------
Summary: A follower can not join the cluster for 20s seconds (was: A
follower can not join the cluster for 30s seconds)
> A follower can not join the cluster for 20s seconds
> ---------------------------------------------------
>
> Key: ZOOKEEPER-4816
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4816
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.10.0
> Reporter: mutu
> Priority: Critical
> Attachments: node1.log, node2.log, node3.log
>
>
> We encountered a strange scenario. When we set up a ZooKeeper cluster (3
> nodes in total), the third node got stuck serializing the snapshot to the
> local disk. Leader election nevertheless ran normally, and the third node
> was elected leader. The other two nodes then failed to connect to that
> leader, so the first and second nodes restarted leader election, and the
> second node was finally elected leader. At that point the third node was
> still acting as leader, so the cluster had two leaders. The first node
> could not join the cluster for 30s. During this period, clients could not
> connect to any node in the cluster.
> Runtime logs are attached.
> Any comments that help figure out this issue would be much appreciated.