[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807102#comment-13807102
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1732:
---------------------------------------------------

[~fpj], [~abranzyck]: did you test this patch when joining a cluster whose 
other servers run without it (i.e., trunk, only without this patch)?

After rolling the first 2 followers in a 5-member ensemble, the 3rd follower 
fails to join with this:

{noformat}
2013-10-28 18:43:18,134 - INFO  [WorkerReceiver[myid=4]] - Notification: 4 
(n.leader), 0x8900000415 (n.zxid), 0x6 (n.round), LOOKING (n.state), 4 (n.sid), 
0x89 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,134 - INFO  [WorkerReceiver[myid=4]] - Notification: 2 
(n.leader), 0x880000002c (n.zxid), 0xffffffffffffffff (n.round), FOLLOWING 
(n.state), 0 (n.sid), 0x89 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,135 - INFO  [WorkerReceiver[myid=4]] - Notification: 2 
(n.leader), 0x880000002c (n.zxid), 0x6 (n.round), LEADING (n.state), 2 (n.sid), 
0x88 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,135 - INFO  [WorkerReceiver[myid=4]] - Notification: 2 
(n.leader), 0x880000002c (n.zxid), 0x6 (n.round), FOLLOWING (n.state), 3 
(n.sid), 0x88 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2013-10-28 18:43:18,136 - INFO  [WorkerReceiver[myid=4]] - Notification: 2 
(n.leader), 0x880000002c (n.zxid), 0xffffffffffffffff (n.round), FOLLOWING 
(n.state), 1 (n.sid), 0x89 (n.peerEPoch), LOOKING (my state)0 (n.config version)
{noformat}

I am guessing that IGNOREVALUE (0xffffffffffffffff) as the round value is 
causing the problem. What was the expected behavior here, i.e., when dealing 
with cluster members without this patch during an upgrade?
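
For anyone triaging similar logs, here is a minimal Python sketch (not part of 
the patch) that pulls the n.* fields out of a Notification entry and flags the 
ones carrying the all-ones round value. The helper names are hypothetical. It 
assumes the round is logged as an unsigned 64-bit hex number, as in the output 
above, and that a zxid keeps the epoch in its high 32 bits and the counter in 
its low 32 bits.

```python
import re

# Round value seen in the log above. The name IGNOREVALUE comes from the
# comment; treating it as all-ones 64 bits is read off the log, not the patch.
IGNORE_ROUND = 0xFFFFFFFFFFFFFFFF

# Matches "<value> (n.<field>)" pairs such as "0x6 (n.round)".
FIELD = re.compile(r"(\S+)\s+\((n\.[A-Za-z]+)\)")


def parse_notification(entry):
    """Return the n.* fields of one Notification log entry as a dict."""
    text = " ".join(entry.split())  # undo the mail client's line wrapping
    fields = {}
    for value, name in FIELD.findall(text):
        if value.startswith("0x"):
            fields[name] = int(value, 16)
        elif value.isdigit():
            fields[name] = int(value)
        else:
            fields[name] = value  # e.g. LOOKING, FOLLOWING, LEADING
    return fields


def split_zxid(zxid):
    """Split a zxid into (epoch, counter): high and low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return value - (1 << 64) if value >= (1 << 63) else value


def uses_ignore_round(fields):
    """True if the entry carries the all-ones round value."""
    return fields.get("n.round") == IGNORE_ROUND
```

Feeding each entry from the log through parse_notification and filtering with 
uses_ignore_round picks out the entries whose round is 0xffffffffffffffff. 
as_signed64 shows that this value is -1 when read as a signed 64-bit long, 
which would compare lower than the 0x6 round sent by the other servers.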

> ZooKeeper server unable to join established ensemble
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-1732
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1732
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.5
>         Environment: Windows 7, Java 1.7
>            Reporter: Germán Blanco
>            Assignee: Germán Blanco
>            Priority: Blocker
>             Fix For: 3.4.6, 3.5.0
>
>         Attachments: CREATE_INCONSISTENCIES_patch.txt, zklog.tar.gz, 
> ZOOKEEPER-1732-3.4.patch, ZOOKEEPER-1732-3.4.patch, ZOOKEEPER-1732-3.4.patch, 
> ZOOKEEPER-1732-3.4.patch, ZOOKEEPER-1732-b3.4.patch, 
> ZOOKEEPER-1732-b3.4.patch, ZOOKEEPER-1732.patch, ZOOKEEPER-1732.patch, 
> ZOOKEEPER-1732.patch, ZOOKEEPER-1732.patch, ZOOKEEPER-1732.patch
>
>
> I have a test in which I do a rolling restart of three ZooKeeper servers and 
> it was failing from time to time.
> I ran the tests in a loop until the failure came out and it seems that at 
> some point one of the servers is unable to join the ensemble formed by the 
> other two.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
