I think this is the problem and it looks like a bug
Exception when following the leader
java.lang.IllegalArgumentException
at
java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1314)
at
java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1202)
at java.util.concurrent.Executors.newFixedThreadPool(Executors.
java:89)
at
org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:275)
at
org.apache.zookeeper.server.qu
I can't check the code right know.
But I am sure this is something to address with a bug fix
One question: are nodes reachable with PING each to each other?
In 3.6.0 we have added reachability checks that may interfere with your
setup
We are addressing such kind of issues in 3.6.1 that hopefully will be
released at the beginning of April
Enrico
Enrico
Il Lun 23 Mar 2020, 20:49 rld244 <[email protected]> ha scritto:
> [email protected]
> Hi,
>
> I'm trying to get a 3 node Zookeeper cluster up and running. I am doing an
> initial install with Ansible from
> https://galaxy.ansible.com/lean_delivery/zookeeper which installs version
> 3.6.0. This gets launched with systemd which is just calling
> bin/zkServer.sh
> start-foreground.
>
> The Ansible role sets the myid's and has a zoo.cfg.dynamic file that is
> called in zoo.cfg. In my efforts to get this to work I've removed the
> reference to the dynamic file.
>
> My zoo.cfg looks like this now.
>
> tickTime=2000
> dataDir=/var/lib/zookeeper
> dataLogDir=/var/log/zookeeper
> initLimit=30
> syncLimit=15
> clientPort=2181
> server.1=0.0.0.0:2888:3888
> server.2=< fqdn of server 2 >:2888:3888
> server.3=< fqdn of server 3 >:2888:3888
> 4lw.commands.whitelist=*
>
> For each server 0.0.0.0 is swapped out in the appropriate place and the
> myid's are correct.
> I upped the initLimit and syncLimit at some point to see if that would make
> a difference.
>
> I've made sure to start the Zookeeper services on each server in the
> correct
> order.
>
> But with all that when I try bin/zkServer.sh status I always get the
> following error
>
> /bin/java
> ZooKeeper JMX enabled by default
> Using config: /opt/zookeeper-3.6.0/bin/../conf/zoo.cfg
> Client port found: 2181. Client address: localhost.
> Error contacting service. It is probably not running.
>
> I think Zookeeper is supposed to be listening on 2888 but am not seeing
> that
> when I try
> netstat -tulpn | grep LISTEN
>
> tcp 0 0 0.0.0.0:22 0.0.0.0:*
> LISTEN
> 1260/sshd
> tcp 0 0 127.0.0.1:25 0.0.0.0:*
> LISTEN
> 1222/master
> tcp 0 0 0.0.0.0:111 0.0.0.0:*
> LISTEN
> 1/systemd
> tcp6 0 0 :::22 :::*
> LISTEN
> 1260/sshd
> tcp6 0 0 :::8983 :::*
> LISTEN
> 1129/java
> tcp6 0 0 ::1:25 :::*
> LISTEN
> 1222/master
> tcp6 0 0 :::443 :::*
> LISTEN
> 1028/httpd
> tcp6 0 0 :::40093 :::*
> LISTEN
> 3354/java
> tcp6 0 0 :::2181 :::*
> LISTEN
> 3354/java
> tcp6 0 0 127.0.0.1:7983 :::*
> LISTEN
> 1129/java
> tcp6 0 0 :::111 :::*
> LISTEN
> 1/systemd
> tcp6 0 0 :::3888 :::*
> LISTEN
> 3354/java
> tcp6 0 0 :::8080 :::*
> LISTEN
> 3354/java
> tcp6 0 0 :::80 :::*
> LISTEN
> 1028/httpd
>
> I'm seeing this in my logs.
>
> 2020-03-23 14:38:38,738 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):ZooKeeperServer@1470
> ]
> - The max bytes for all large requests are set to 104857600
> 2020-03-23 14:38:38,738 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):ZooKeeperServer@1484
> ]
> - The large request threshold is set to -1
> 2020-03-23 14:38:38,738 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):ZooKeeperServer@329
> ]
> - Created server with tickTime 2000 minSessionTimeout 4000
> maxSessionTimeout
> 40000 clientPortListenBacklog -1 datadir /var/log/zookeeper/version-2
> snapdir /var/lib/zookeeper/version-2
> 2020-03-23 14:38:38,738 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Follower@75
> ]
> - FOLLOWING - LEADER ELECTION TOOK - 59223 MS
> 2020-03-23 14:38:38,738 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@863
> ]
> - Peer state changed: following - discovery
> 2020-03-23 14:38:39,740 [myid:1] - WARN
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Follower@129
> ]
> - Exception when following the leader
> java.lang.IllegalArgumentException
> at
>
> java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1314)
> at
>
> java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1202)
> at
> java.util.concurrent.Executors.newFixedThreadPool(Executors.java:89)
> at
>
> org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:275)
> at
> org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:87)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1455)
> 2020-03-23 14:38:39,740 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Follower@292
> ]
> - shutdown Follower
> 2020-03-23 14:38:39,740 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@863
> ]
> - Peer state changed: looking
> 2020-03-23 14:38:39,740 [myid:1] - WARN
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1501
> ]
> - PeerState set to LOOKING
> 2020-03-23 14:38:39,741 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1371
> ]
> - LOOKING
> 2020-03-23 14:38:39,741 [myid:1] - INFO
>
> [QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@931
> ]
> - New election. My id = 1, proposed zxid=0x0
> 2020-03-23 14:38:39,741 [myid:1] - INFO
> [WorkerSender[myid=1]:QuorumCnxManager@462] - Have smaller server
> identifier, so dropping the connection: (2, 1)
> 2020-03-23 14:38:39,742 [myid:1] - INFO
> [WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:1,
> n.round:0x27, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
>
> 2020-03-23 15:08:48,614 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:3, n.state:LEADING, n.leader:3,
> n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:49,406 [myid:2] - INFO
> [ListenerHandler-/0.0.0.0:3888
> :QuorumCnxManager$Listener$ListenerHandler@1052]
> - Received connection request /10.92.108.38:51428
> 2020-03-23 15:08:49,407 [myid:2] - WARN
> [RecvWorker:1:QuorumCnxManager$RecvWorker@1374] - Connection broken for id
> 1, my id = 2
> java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at
> java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1362)
> 2020-03-23 15:08:49,407 [myid:2] - WARN
> [RecvWorker:1:QuorumCnxManager$RecvWorker@1380] - Interrupting SendWorker
> 2020-03-23 15:08:49,407 [myid:2] - WARN
> [SendWorker:1:QuorumCnxManager$SendWorker@1262] - Interrupted while
> waiting
> for message on queue
> java.lang.InterruptedException
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> at
>
> org.apache.zookeeper.util.CircularBlockingQueue.poll(CircularBlockingQueue.java:105)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1425)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$900(QuorumCnxManager.java:97)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1251)
> 2020-03-23 15:08:49,407 [myid:2] - WARN
> [SendWorker:1:QuorumCnxManager$SendWorker@1274] - Send worker leaving
> thread
> id 1 my id = 2
> 2020-03-23 15:08:49,407 [myid:2] - WARN
> [RecvWorker:2:QuorumCnxManager$RecvWorker@1374] - Connection broken for id
> 2, my id = 2
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1362)
> 2020-03-23 15:08:49,407 [myid:2] - WARN
> [RecvWorker:2:QuorumCnxManager$RecvWorker@1380] - Interrupting SendWorker
> 2020-03-23 15:08:49,410 [myid:2] - INFO
> [ListenerHandler-/0.0.0.0:3888
> :QuorumCnxManager$Listener$ListenerHandler@1052]
> - Received connection request /10.92.108.99:41828
> 2020-03-23 15:08:49,420 [myid:2] - WARN
> [SendWorker:2:QuorumCnxManager$SendWorker@1262] - Interrupted while
> waiting
> for message on queue
> java.lang.InterruptedException
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> at
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
> at
>
> org.apache.zookeeper.util.CircularBlockingQueue.poll(CircularBlockingQueue.java:105)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1425)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$900(QuorumCnxManager.java:97)
> at
>
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1251)
> 2020-03-23 15:08:49,420 [myid:2] - WARN
> [SendWorker:2:QuorumCnxManager$SendWorker@1274] - Send worker leaving
> thread
> id 2 my id = 2
> 2020-03-23 15:08:49,421 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2,
> n.round:0x45, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:51,021 [myid:2] - INFO
>
> [QuorumPeer[myid=2](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@966
> ]
> - Notification time out: 3200
> 2020-03-23 15:08:51,021 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2,
> n.round:0x45, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:51,021 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2,
> n.round:0x45, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:51,022 [myid:2] - INFO
> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:3, n.state:LEADING, n.leader:3,
> n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
>
> 2020-03-23 15:08:37,187 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LEADING; n.sid:1, n.state:LOOKING, n.leader:1,
> n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:45,183 [myid:3] - WARN
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1471
> ]
> - Unexpected exception
> java.lang.InterruptedException: Timeout while waiting for epoch from quorum
> at
>
> org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1425)
> at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:599)
> at
> org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1468)
> 2020-03-23 15:08:45,183 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Leader@792
> ]
> - Shutting down
> 2020-03-23 15:08:45,183 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Leader@798
> ]
> - Shutdown called. For the reason Forcing shutdown
> 2020-03-23 15:08:45,183 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@863
> ]
> - Peer state changed: looking
> 2020-03-23 15:08:45,183 [myid:3] - WARN
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1501
> ]
> - PeerState set to LOOKING
> 2020-03-23 15:08:45,183 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1371
> ]
> - LOOKING
> 2020-03-23 15:08:45,183 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@931
> ]
> - New election. My id = 3, proposed zxid=0x0
> 2020-03-23 15:08:45,184 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:3, n.state:LOOKING, n.leader:3,
> n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:45,187 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:3,
> n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:45,188 [myid:3] - INFO
> [WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
> Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:3,
> n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
> n.config version:0x0
> 2020-03-23 15:08:45,189 [myid:3] - WARN
> [LearnerCnxAcceptorHandler-/0.0.0.0:2888
> :Leader$LearnerCnxAcceptor$LearnerCnxAcceptorHandler@523]
> - Exception while shutting down acceptor.
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketAccept(Native Method)
> at
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
> at java.net.ServerSocket.implAccept(ServerSocket.java:560)
> at java.net.ServerSocket.accept(ServerSocket.java:528)
> at
>
> org.apache.zookeeper.server.quorum.Leader$LearnerCnxAcceptor$LearnerCnxAcceptorHandler.acceptConnections(Leader.java:510)
> at
>
> org.apache.zookeeper.server.quorum.Leader$LearnerCnxAcceptor$LearnerCnxAcceptorHandler.run(Leader.java:493)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2020-03-23 15:08:45,389 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@857
> ]
> - Peer state changed: leading
> 2020-03-23 15:08:45,389 [myid:3] - INFO
>
> [QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1465
> ]
> - LEADING
>
>
> Don't know what to do here. Any help would be greatly appreciated.
>
>
>
>
> --
> Sent from: http://zookeeper-user.578899.n2.nabble.com/
>