[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927833#action_12927833
 ] 

Alexandre Hardy commented on ZOOKEEPER-917:
-------------------------------------------

Excerpt from logs on 192.168.130.10:
{noformat}
2010-11-02 09:36:28,060 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: New election: 4294967742
2010-11-02 09:36:28,061 WARN org.apache.zookeeper.server.NIOServerCnxn: 
Exception causing close of session 0x0 due to java.io.IOException: 
ZooKeeperServer not running
2010-11-02 09:36:28,061 INFO org.apache.zookeeper.server.NIOServerCnxn: closing 
session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected 
local=/192.168.130.10:2181 remote=/192.168.130.10:37781]
2010-11-02 09:36:28,061 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 0, 
4294967742, 2, 0, LOOKING, LOOKING, 0
2010-11-02 09:36:28,063 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Adding vote
2010-11-02 09:36:28,064 WARN org.apache.zookeeper.server.NIOServerCnxn: 
Exception causing close of session 0x0 due to java.io.IOException: 
ZooKeeperServer not running
2010-11-02 09:36:28,064 INFO org.apache.zookeeper.server.NIOServerCnxn: closing 
session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected 
local=/192.168.130.10:2181 remote=/192.168.130.14:50222]
2010-11-02 09:36:28,064 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 2, -1, 1, 
0, LOOKING, FOLLOWING, 1
2010-11-02 09:36:28,065 WARN org.apache.zookeeper.server.NIOServerCnxn: 
Exception causing close of session 0x0 due to java.io.IOException: 
ZooKeeperServer not running
2010-11-02 09:36:28,065 INFO org.apache.zookeeper.server.NIOServerCnxn: closing 
session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected 
local=/192.168.130.10:2181 remote=/192.168.130.14:50223]
2010-11-02 09:36:28,068 WARN org.apache.zookeeper.server.NIOServerCnxn: 
Exception causing close of session 0x0 due to java.io.IOException: 
ZooKeeperServer not running
2010-11-02 09:36:28,068 INFO org.apache.zookeeper.server.NIOServerCnxn: closing 
session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected 
local=/192.168.130.10:2181 remote=/192.168.130.12:59044]
2010-11-02 09:36:28,073 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 2, -1, 1, 
0, LOOKING, LEADING, 2
2010-11-02 09:36:28,073 WARN org.apache.zookeeper.server.NIOServerCnxn: 
Exception causing close of session 0x0 due to java.io.IOException: 
ZooKeeperServer not running
2010-11-02 09:36:28,073 INFO org.apache.zookeeper.server.NIOServerCnxn: closing 
session:0x0 NIOServerCnxn: java.nio.channels.SocketChannel[connected 
local=/192.168.130.10:2181 remote=/192.168.130.10:37786]
2010-11-02 09:36:28,073 INFO org.apache.zookeeper.server.quorum.QuorumPeer: 
FOLLOWING
2010-11-02 09:36:28,073 INFO org.apache.zookeeper.server.ZooKeeperServer: 
Created server 
2010-11-02 09:36:28,074 INFO org.apache.zookeeper.server.quorum.Follower: 
Following zookeeper3/192.168.130.13:2888
{noformat}

Excerpt from logs on 192.168.130.11:
{noformat}
2010-11-02 09:36:14,065 INFO 
org.apache.zookeeper.server.quorum.QuorumPeerConfig: Defaulting to majority 
quorums
2010-11-02 09:36:14,120 INFO org.apache.zookeeper.server.quorum.QuorumPeerMain: 
Starting quorum peer
2010-11-02 09:36:14,172 INFO 
org.apache.zookeeper.server.quorum.QuorumCnxManager: My election bind port: 3888
2010-11-02 09:36:14,182 INFO org.apache.zookeeper.server.quorum.QuorumPeer: 
LOOKING
2010-11-02 09:36:14,183 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: New election: -1
2010-11-02 09:36:14,191 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 2, -1, 1, 
2, LOOKING, LOOKING, 2
2010-11-02 09:36:14,191 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Adding vote
2010-11-02 09:36:14,193 WARN 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Interrupted while waiting 
for message on queue
java.lang.InterruptedException
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1952)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:345)
    at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:532)
2010-11-02 09:36:14,194 WARN 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Send worker leaving thread
2010-11-02 09:36:14,194 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Notification: 2, -1, 1, 
2, LOOKING, LOOKING, 1
2010-11-02 09:36:14,194 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: Adding vote
2010-11-02 09:36:14,195 WARN 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Interrupted while waiting 
for message on queue
java.lang.InterruptedException
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1952)
    at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:345)
    at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:532)
2010-11-02 09:36:14,195 WARN 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Send worker leaving thread
2010-11-02 09:36:14,202 WARN 
org.apache.zookeeper.server.quorum.QuorumCnxManager: Connection broken:
java.nio.channels.AsynchronousCloseException
    at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:281)
    at 
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:593)
2010-11-02 09:36:14,401 INFO 
org.apache.zookeeper.server.quorum.FastLeaderElection: About to leave 
instance:2, -1, 2, LEADING
2010-11-02 09:36:14,402 INFO org.apache.zookeeper.server.quorum.QuorumPeer: 
LEADING
{noformat}

I'm not sure why 192.168.130.13 ended up as the leader when it did not have the 
most up to date transaction ID. Also, I don't see the notification messages of 
the other nodes in the logs of 192.168.130.13.

Is there any reason why other nodes would accept 192.168.130.13 as the leader?

> Leader election selected incorrect leader
> -----------------------------------------
>
>                 Key: ZOOKEEPER-917
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-917
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: leaderElection, server
>    Affects Versions: 3.2.2
>         Environment: Cloudera distribution of zookeeper (patched to never 
> cache DNS entries)
> Debian lenny
>            Reporter: Alexandre Hardy
>            Priority: Critical
>         Attachments: zklogs-20101102144159SAST.tar.gz
>
>
> We had three nodes running zookeeper:
>   * 192.168.130.10
>   * 192.168.130.11
>   * 192.168.130.14
> 192.168.130.11 failed, and was replaced by a new node 192.168.130.13 
> (automated startup). The new node had not participated in any zookeeper 
> quorum previously. The node 192.148.130.11 was permanently removed from 
> service and could not contribute to the quorum any further (powered off).
> DNS entries were updated for the new node to allow all the zookeeper servers 
> to find the new node.
> The new node 192.168.130.13 was selected as the LEADER, despite the fact that 
> it had not seen the latest zxid.
> This particular problem has not been verified with later versions of 
> zookeeper, and no attempt has been made to reproduce this problem as yet.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to