[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12632111#action_12632111
 ] 

austin edited comment on ZOOKEEPER-127 at 9/17/08 11:36 PM:
----------------------------------------------------------------------

After several more runs of our unit test using the patched algorithm 3, the 
test hangs as the service repeatedly tries to re-elect the killed leader. This 
behavior is similar to ZOOKEEPER-131, which we experienced using algorithms 
0 and 1.

Server 10 is 10.50.65.40 and has been explicitly killed. The following log is 
from server 5; the logs on all the other servers show the same sequence.

Any idea what's happening here?

2008-09-18 00:28:20,029 - INFO  [QuorumPeer:[EMAIL PROTECTED] - LOOKING
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:[EMAIL PROTECTED] - unable to parse zxid string into long: txt
2008-09-18 00:28:20,029 - WARN  [QuorumPeer:[EMAIL PROTECTED] - New election: 8589935405
2008-09-18 00:28:20,031 - WARN  [WorkerSender Thread:[EMAIL PROTECTED] - Cannot open channel to 10( java.net.ConnectException: Connection refused)
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:[EMAIL PROTECTED] - FOLLOWING
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:[EMAIL PROTECTED] - Created server with dataDir:/zookeeper_data/5_data dataLogDir:/zookeeper_data/5_data tickTime:2000
2008-09-18 00:28:20,031 - INFO  [QuorumPeer:[EMAIL PROTECTED] - Following /10.50.65.40:2888

[[[ exception below repeats 5 times ]]]

2008-09-18 00:28:20,032 - WARN  [QuorumPeer:[EMAIL PROTECTED] - Unexpected exception
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:519)
        at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:137)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:405)

[[[ then the follower is restarted ]]]

2008-09-18 00:28:24,049 - ERROR [QuorumPeer:[EMAIL PROTECTED] - FIXMSG
java.lang.Exception: shutdown Follower
        at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:370)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:409)

[[[ at this point the log repeats from the beginning ]]]
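
For anyone reading the log above: assuming the number printed after "New 
election:" is a zxid (my reading of the log, not confirmed here), it can be 
split into an epoch and a counter, since ZooKeeper keeps the epoch in the high 
32 bits of a zxid and the counter in the low 32 bits. The class name ZxidParts 
is made up for this sketch.

```java
// Hypothetical helper, not part of ZooKeeper: splits a 64-bit zxid into
// its epoch (high 32 bits) and counter (low 32 bits).
class ZxidParts {
    public static long epoch(long zxid) {
        return zxid >>> 32;
    }

    public static long counter(long zxid) {
        return zxid & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // Value copied from the "New election:" line in the log above;
        // treating it as a zxid is an assumption.
        long zxid = 8589935405L;
        // prints "epoch=2 counter=813"
        System.out.println("epoch=" + epoch(zxid) + " counter=" + counter(zxid));
    }
}
```

If that reading is right, server 5 is proposing epoch 2, counter 813 each time 
the loop restarts.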


      was (Author: austin):
    After about 6 runs of our unit test the test hangs as the service 
repeatedly tries to reelect the killed leader (similar to ZOOKEEPER-131 with 
algorithms 0 and 1). 



  
> Use of non-standard election ports in config breaks services
> ------------------------------------------------------------
>
>                 Key: ZOOKEEPER-127
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-127
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.0.0
>            Reporter: Mark Harwood
>            Assignee: Flavio Paiva Junqueira
>            Priority: Minor
>             Fix For: 3.0.0
>
>         Attachments: mhPortChanges.patch, ZOOKEEPER-127.patch, 
> ZOOKEEPER-127.patch, ZOOKEEPER-127.patch
>
>
> In QuorumCnxManager.toSend there is a call to create a connection as follows:
>     channel = SocketChannel.open(new InetSocketAddress(addr, port));
> Unfortunately, "addr" is the IP address of a remote server while "port" is the 
> electionPort of *this* server.
> As an example, given this configuration (taken from my zoo.cfg):
>   server.1=10.20.9.254:2881
>   server.2=10.20.9.9:2882
>   server.3=10.20.9.254:2883
> Server 3 was observed trying to make a connection to host 10.20.9.9 on port 
> 2883 and obviously failing.
> In tests where all machines use the same electionPort this bug would not 
> manifest itself.
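
The mismatch described in the issue can be illustrated with a minimal sketch 
in which the election address is looked up per remote server id, so the port 
always belongs to the same server as the host. This is not the attached patch 
and not the QuorumCnxManager code; the class ElectionAddressLookup and its 
methods are hypothetical, and only the three server entries come from the 
zoo.cfg quoted above.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: keep each server's host and election port
// together, keyed by server id, instead of pairing a remote host with
// the local election port.
class ElectionAddressLookup {
    private final Map<Long, InetSocketAddress> electionAddrs = new HashMap<>();

    public void addServer(long id, String host, int electionPort) {
        // An IP literal is used as the host, so no DNS lookup is needed.
        electionAddrs.put(id, new InetSocketAddress(host, electionPort));
    }

    public InetSocketAddress electionAddressOf(long remoteId) {
        InetSocketAddress addr = electionAddrs.get(remoteId);
        if (addr == null) {
            throw new IllegalArgumentException("unknown server id " + remoteId);
        }
        return addr;
    }

    public static void main(String[] args) {
        ElectionAddressLookup lookup = new ElectionAddressLookup();
        // Entries from the zoo.cfg in the issue description.
        lookup.addServer(1L, "10.20.9.254", 2881);
        lookup.addServer(2L, "10.20.9.9", 2882);
        lookup.addServer(3L, "10.20.9.254", 2883);

        // Server 3 contacting server 2 should use server 2's port (2882),
        // not its own (2883).
        InetSocketAddress target = lookup.electionAddressOf(2L);
        System.out.println(target.getAddress().getHostAddress() + ":" + target.getPort());
    }
}
```

The returned address could then be passed to SocketChannel.open in place of 
the (addr, port) pair built from mixed sources.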

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
