[email protected]
Hi,
I'm trying to get a 3 node Zookeeper cluster up and running. I am doing an
initial install with Ansible from
https://galaxy.ansible.com/lean_delivery/zookeeper which installs version
3.6.0. This gets launched with systemd which is just calling bin/zkServer.sh
start-foreground.
The Ansible role sets the myid's and has a zoo.cfg.dynamic file that is
called in zoo.cfg. In my efforts to get this to work I've removed the
reference to the dynamic file.
My zoo.cfg looks like this now.
tickTime=2000
dataDir=/var/lib/zookeeper
dataLogDir=/var/log/zookeeper
initLimit=30
syncLimit=15
clientPort=2181
server.1=0.0.0.0:2888:3888
server.2=< fqdn of server 2 >:2888:3888
server.3=< fqdn of server 3 >:2888:3888
4lw.commands.whitelist=*
For each server 0.0.0.0 is swapped out in the appropriate place and the
myid's are correct.
I upped the initLimit and syncLimit at some point to see if that would make
a difference.
I've made sure to start the Zookeeper services on each server in the correct
order.
But with all that when I try bin/zkServer.sh status I always get the
following error
/bin/java
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.6.0/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Error contacting service. It is probably not running.
I think Zookeeper is supposed to be listening on 2888 but am not seeing that
when I try
netstat -tulpn | grep LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
1260/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
1222/master
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
1/systemd
tcp6 0 0 :::22 :::* LISTEN
1260/sshd
tcp6 0 0 :::8983 :::* LISTEN
1129/java
tcp6 0 0 ::1:25 :::* LISTEN
1222/master
tcp6 0 0 :::443 :::* LISTEN
1028/httpd
tcp6 0 0 :::40093 :::* LISTEN
3354/java
tcp6 0 0 :::2181 :::* LISTEN
3354/java
tcp6 0 0 127.0.0.1:7983 :::* LISTEN
1129/java
tcp6 0 0 :::111 :::* LISTEN
1/systemd
tcp6 0 0 :::3888 :::* LISTEN
3354/java
tcp6 0 0 :::8080 :::* LISTEN
3354/java
tcp6 0 0 :::80 :::* LISTEN
1028/httpd
I'm seeing this in my logs.
2020-03-23 14:38:38,738 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):ZooKeeperServer@1470]
- The max bytes for all large requests are set to 104857600
2020-03-23 14:38:38,738 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):ZooKeeperServer@1484]
- The large request threshold is set to -1
2020-03-23 14:38:38,738 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):ZooKeeperServer@329]
- Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout
40000 clientPortListenBacklog -1 datadir /var/log/zookeeper/version-2
snapdir /var/lib/zookeeper/version-2
2020-03-23 14:38:38,738 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Follower@75]
- FOLLOWING - LEADER ELECTION TOOK - 59223 MS
2020-03-23 14:38:38,738 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@863]
- Peer state changed: following - discovery
2020-03-23 14:38:39,740 [myid:1] - WARN
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Follower@129]
- Exception when following the leader
java.lang.IllegalArgumentException
at
java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1314)
at
java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1202)
at java.util.concurrent.Executors.newFixedThreadPool(Executors.java:89)
at
org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:275)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:87)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1455)
2020-03-23 14:38:39,740 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Follower@292]
- shutdown Follower
2020-03-23 14:38:39,740 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@863]
- Peer state changed: looking
2020-03-23 14:38:39,740 [myid:1] - WARN
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1501]
- PeerState set to LOOKING
2020-03-23 14:38:39,741 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1371]
- LOOKING
2020-03-23 14:38:39,741 [myid:1] - INFO
[QuorumPeer[myid=1](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@931]
- New election. My id = 1, proposed zxid=0x0
2020-03-23 14:38:39,741 [myid:1] - INFO
[WorkerSender[myid=1]:QuorumCnxManager@462] - Have smaller server
identifier, so dropping the connection: (2, 1)
2020-03-23 14:38:39,742 [myid:1] - INFO
[WorkerReceiver[myid=1]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:1,
n.round:0x27, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:48,614 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:3, n.state:LEADING, n.leader:3,
n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:49,406 [myid:2] - INFO
[ListenerHandler-/0.0.0.0:3888:QuorumCnxManager$Listener$ListenerHandler@1052]
- Received connection request /10.92.108.38:51428
2020-03-23 15:08:49,407 [myid:2] - WARN
[RecvWorker:1:QuorumCnxManager$RecvWorker@1374] - Connection broken for id
1, my id = 2
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1362)
2020-03-23 15:08:49,407 [myid:2] - WARN
[RecvWorker:1:QuorumCnxManager$RecvWorker@1380] - Interrupting SendWorker
2020-03-23 15:08:49,407 [myid:2] - WARN
[SendWorker:1:QuorumCnxManager$SendWorker@1262] - Interrupted while waiting
for message on queue
java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at
org.apache.zookeeper.util.CircularBlockingQueue.poll(CircularBlockingQueue.java:105)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1425)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager.access$900(QuorumCnxManager.java:97)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1251)
2020-03-23 15:08:49,407 [myid:2] - WARN
[SendWorker:1:QuorumCnxManager$SendWorker@1274] - Send worker leaving thread
id 1 my id = 2
2020-03-23 15:08:49,407 [myid:2] - WARN
[RecvWorker:2:QuorumCnxManager$RecvWorker@1374] - Connection broken for id
2, my id = 2
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1362)
2020-03-23 15:08:49,407 [myid:2] - WARN
[RecvWorker:2:QuorumCnxManager$RecvWorker@1380] - Interrupting SendWorker
2020-03-23 15:08:49,410 [myid:2] - INFO
[ListenerHandler-/0.0.0.0:3888:QuorumCnxManager$Listener$ListenerHandler@1052]
- Received connection request /10.92.108.99:41828
2020-03-23 15:08:49,420 [myid:2] - WARN
[SendWorker:2:QuorumCnxManager$SendWorker@1262] - Interrupted while waiting
for message on queue
java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
at
org.apache.zookeeper.util.CircularBlockingQueue.poll(CircularBlockingQueue.java:105)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1425)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager.access$900(QuorumCnxManager.java:97)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1251)
2020-03-23 15:08:49,420 [myid:2] - WARN
[SendWorker:2:QuorumCnxManager$SendWorker@1274] - Send worker leaving thread
id 2 my id = 2
2020-03-23 15:08:49,421 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2,
n.round:0x45, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:51,021 [myid:2] - INFO
[QuorumPeer[myid=2](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@966]
- Notification time out: 3200
2020-03-23 15:08:51,021 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2,
n.round:0x45, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:51,021 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2,
n.round:0x45, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:51,022 [myid:2] - INFO
[WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:3, n.state:LEADING, n.leader:3,
n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:37,187 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LEADING; n.sid:1, n.state:LOOKING, n.leader:1,
n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:45,183 [myid:3] - WARN
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1471]
- Unexpected exception
java.lang.InterruptedException: Timeout while waiting for epoch from quorum
at
org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1425)
at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:599)
at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1468)
2020-03-23 15:08:45,183 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Leader@792]
- Shutting down
2020-03-23 15:08:45,183 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):Leader@798]
- Shutdown called. For the reason Forcing shutdown
2020-03-23 15:08:45,183 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@863]
- Peer state changed: looking
2020-03-23 15:08:45,183 [myid:3] - WARN
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1501]
- PeerState set to LOOKING
2020-03-23 15:08:45,183 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1371]
- LOOKING
2020-03-23 15:08:45,183 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):FastLeaderElection@931]
- New election. My id = 3, proposed zxid=0x0
2020-03-23 15:08:45,184 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:3, n.state:LOOKING, n.leader:3,
n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:45,187 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:3,
n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:45,188 [myid:3] - INFO
[WorkerReceiver[myid=3]:FastLeaderElection$Messenger$WorkerReceiver@376] -
Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:3,
n.round:0x44, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2,
n.config version:0x0
2020-03-23 15:08:45,189 [myid:3] - WARN
[LearnerCnxAcceptorHandler-/0.0.0.0:2888:Leader$LearnerCnxAcceptor$LearnerCnxAcceptorHandler@523]
- Exception while shutting down acceptor.
java.net.SocketException: Socket closed
at java.net.PlainSocketImpl.socketAccept(Native Method)
at
java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:560)
at java.net.ServerSocket.accept(ServerSocket.java:528)
at
org.apache.zookeeper.server.quorum.Leader$LearnerCnxAcceptor$LearnerCnxAcceptorHandler.acceptConnections(Leader.java:510)
at
org.apache.zookeeper.server.quorum.Leader$LearnerCnxAcceptor$LearnerCnxAcceptorHandler.run(Leader.java:493)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-03-23 15:08:45,389 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@857]
- Peer state changed: leading
2020-03-23 15:08:45,389 [myid:3] - INFO
[QuorumPeer[myid=3](plain=[0:0:0:0:0:0:0:0]:2181)(secure=disabled):QuorumPeer@1465]
- LEADING
Don't know what to do here. Any help would be greatly appreciated.
--
Sent from: http://zookeeper-user.578899.n2.nabble.com/