[jira] [Commented] (ZOOKEEPER-4040) java.io.IOException: Leaders epoch, 1 is less than accepted epoch, 2

Damien Diederen (Jira) Tue, 29 Dec 2020 04:04:05 -0800


    [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255959#comment-17255959
 ]


Damien Diederen commented on ZOOKEEPER-4040:
--------------------------------------------

Hi [~pf],

As far as I can tell, this is a translation of ZOOKEEPER-4039, which indeed is 
a bit more understandable from my point of view :)  Should I just copy the 
automated translation below (courtesy of Google) in the description of this 
ticket, and close ZOOKEEPER-4039 as "duplicate," or is there something else I 
am missing?

{quote}
An acceptedEpoch that is too large prevents the corresponding node from 
joining the cluster

After the leader receives the acceptedEpoch of more than half of the nodes, it 
sets its own acceptedEpoch to the maximum of these values plus 1. If the 
leader goes down at this point, its acceptedEpoch is 1 larger than that of the 
other nodes. If this node then restarts, is elected leader again, and goes 
down again, the remaining nodes elect a new leader. The epoch of this new 
leader is smaller than the acceptedEpoch of the original leader, so the 
original node keeps alternating between the looking and follower states.

Steps to reproduce:

3 nodes: server1, server2, server3

Start server1 and server2, then stop both at the red dot below. At this 
point, the acceptedEpoch of server2 is 1.

Restart server1 and server2, then stop both at the red dot below again. At 
this point, the acceptedEpoch of server2 is 2.

Restart server1 and server3, wait for them to elect server3 as leader, and 
then start server2. The following exception is then thrown repeatedly.
{quote}
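
To make the sequence above easier to follow, here is a minimal sketch of the epoch arithmetic as I read it from the description. It is a simplified model, not ZooKeeper's actual code: the class name EpochSketch and the helper proposeEpoch are hypothetical, registerWithLeader only borrows its name from the stack trace, and it assumes server2 is the leader in the first two rounds.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of the epoch handshake; not ZooKeeper's actual code.
public class EpochSketch {

    // The leader proposes the largest acceptedEpoch seen in its quorum, plus one.
    static long proposeEpoch(List<Long> quorumAcceptedEpochs) {
        long max = 0L;
        for (long epoch : quorumAcceptedEpochs) {
            max = Math.max(max, epoch);
        }
        return max + 1;
    }

    // A follower refuses a leader whose epoch is behind its own acceptedEpoch;
    // otherwise it adopts the leader's epoch and returns it.
    static long registerWithLeader(long leaderEpoch, long acceptedEpoch) throws IOException {
        if (leaderEpoch < acceptedEpoch) {
            throw new IOException("Leaders epoch, " + leaderEpoch
                    + " is less than accepted epoch, " + acceptedEpoch);
        }
        return leaderEpoch;
    }

    public static void main(String[] args) {
        long server1 = 0L, server2 = 0L, server3 = 0L;

        // Rounds 1 and 2 (assumption: server2 leads the quorum {server1, server2}).
        // It records the new epoch and is stopped before server1 records anything.
        server2 = proposeEpoch(Arrays.asList(server1, server2));
        server2 = proposeEpoch(Arrays.asList(server1, server2));

        // Round 3: server3 leads {server1, server3}; server2 is not in the quorum.
        long leaderEpoch = proposeEpoch(Arrays.asList(server1, server3));

        try {
            // server2 rejoins with an acceptedEpoch ahead of the leader's epoch.
            registerWithLeader(leaderEpoch, server2);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model server2 ends up with an acceptedEpoch that no quorum ever adopted, which is why the later leader's epoch falls behind it.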

 

> java.io.IOException: Leaders epoch, 1 is less than accepted epoch, 2
> --------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4040
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4040
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.5
>            Reporter: pengfei
>            Priority: Major
>         Attachments: image-2020-12-28-18-20-07-842.png, 
> image-2020-12-28-18-23-14-073.png, image-2020-12-28-18-25-31-960.png, 
> image-2020-12-28-18-28-07-015.png
>
>
> h4. errorlog:
> java.io.IOException: Leaders epoch, 1 is less than accepted epoch, 2
>  at org.apache.zookeeper.server.quorum.Learner.registerWithLeader(Learner.java:353)
>  at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:78)
>  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
> 2020-12-28 18:09:25,176 [myid:2] - INFO  [QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2182)(secure=disabled):Follower@201] - shutdown called
> java.lang.Exception: shutdown Follower
>  at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:201)
>  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1275)
>  
> h4. sample:
> The cluster consists of server1, server2, and server3.
>  * Start server1 and server2, then shut them down when they reach the point 
> shown below. Now the acceptedEpoch of server2 is 1, server1 is 0, and 
> server3 is 0. !image-2020-12-28-18-23-14-073.png!
>  * Repeat step 1. Now the acceptedEpoch of server1 is 0, server2 is 2, and 
> server3 is 0. !image-2020-12-28-18-25-31-960.png!
>  * Start server1 and server3, wait until server3 is the leader of the 
> cluster, then start server2. This produces the error shown below. 
> !image-2020-12-28-18-28-07-015.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
