Paolo Patierno created ZOOKEEPER-4708:
-----------------------------------------
Summary: ZooKeeper 3.6.4 quorum failing due to <unresolved> address
Key: ZOOKEEPER-4708
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4708
Project: ZooKeeper
Issue Type: Bug
Affects Versions: 3.8.1, 3.6.4
Reporter: Paolo Patierno
We work on the Strimzi project, which deploys an Apache Kafka cluster
on Kubernetes together with a ZooKeeper ensemble.
Up to ZooKeeper version 3.6.3 (brought in by Kafka 3.4.0), there were no issues
when running on minikube for development purposes.
With ZooKeeper version 3.6.4 (brought in by Kafka 3.4.1), we started to have
issues during quorum formation and leader election.
The first one was about ZooKeeper pods not being able to bind the quorum port
3888 to the cluster IP: during DNS resolution they get the loopback address
instead.
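To make the symptom concrete, here is a small, hypothetical Java diagnostic (not Strimzi or ZooKeeper code) that checks what the pod's quorum FQDN resolves to from inside the container; the hostname is the one taken from the log below.
{code:java}
import java.net.InetAddress;

// Hypothetical diagnostic sketch: resolve the pod's own quorum FQDN and check
// whether DNS (or /etc/hosts) hands back a loopback address.
public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        String fqdn = "my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc";
        InetAddress addr = InetAddress.getByName(fqdn);
        // If this prints a loopback address, the election port (3888) ends up
        // bound to 127.0.0.1 instead of the pod IP, as in the log below.
        System.out.println(fqdn + " -> " + addr.getHostAddress()
                + " (loopback=" + addr.isLoopbackAddress() + ")");
    }
}
{code}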
Below is a sample log from ZooKeeper startup where you can see the binding at
127.0.0.1:3888 instead of something like 172.17.0.4:3888 (i.e. a valid,
non-loopback IP address).
INFO 3 is accepting connections now, my election bind port:
my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888
(org.apache.zookeeper.server.quorum.QuorumCnxManager)
[ListenerHandler-my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888]
This specific issue had two solutions: setting quorumListenOnAllIPs=true in the
ZooKeeper configuration, or binding to the 0.0.0.0 address.
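For reference, a minimal zoo.cfg sketch of the two workarounds (the peer hostnames for servers 1 and 2 are assumed here, following the naming pattern from the logs; ports are the defaults used in this report):
{code}
# Workaround 1: listen on all local interfaces for quorum/election traffic
quorumListenOnAllIPs=true

# Workaround 2: in the config of server 3 itself, bind its own quorum/election
# ports to 0.0.0.0; the entries for the other peers keep their real hostnames.
server.1=my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.default.svc:2888:3888
server.2=my-cluster-zookeeper-1.my-cluster-zookeeper-nodes.default.svc:2888:3888
server.3=0.0.0.0:2888:3888
{code}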
Anyway, it is not clear why this wasn't needed up to 3.6.3 but is needed to get
3.6.4 working. What changed from this perspective?
That said, while binding to 0.0.0.0 seems to work fine, using
quorumListenOnAllIPs=true doesn't.
Assuming a ZooKeeper ensemble with 3 nodes, in the log of the current
ZooKeeper leader (ID=3) we see the following.
(The lines starting with ** are additional logs we added to
{{org.apache.zookeeper.server.quorum.Leader#getDesignatedLeader}} in order to
get more information.)
2023-06-19 12:32:51,990 INFO Have quorum of supporters, sids: [[1, 3],[1, 3]];
starting up and setting last processed zxid: 0x100000000
(org.apache.zookeeper.server.quorum.Leader)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-06-19 12:32:51,990 INFO **
newQVAcksetPair.getQuorumVerifier().getVotingMembers().get(self.getId()).addr =
my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888
(org.apache.zookeeper.server.quorum.Leader)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-06-19 12:32:51,990 INFO ** self.getQuorumAddress() =
my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/<unresolved>:2888
(org.apache.zookeeper.server.quorum.Leader)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-06-19 12:32:51,992 INFO ** qs.addr
my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888,
qs.electionAddr
my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:3888,
qs.clientAddr/127.0.0.1:12181 (org.apache.zookeeper.server.quorum.QuorumPeer)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-06-19 12:32:51,992 DEBUG zookeeper (org.apache.zookeeper.common.PathTrie)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-06-19 12:32:51,993 WARN Restarting Leader Election
(org.apache.zookeeper.server.quorum.QuorumPeer)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
So the leader is the ZooKeeper node with ID=3, and it was ACKed by the ZooKeeper
node with ID=1.
As you can see, we are in the {{Leader#startZkServer}} method, and because
reconfiguration is enabled, the designatedLeader is processed. The problem is
that {{Leader#getDesignatedLeader}} is not returning “self” as leader but
another node (ID=1), because of the difference in the quorum address.
From the above log, it’s not an actual difference in terms of addresses, but
{{self.getQuorumAddress()}} is returning an <unresolved> address (even if it’s
still the same hostname related to the ZooKeeper-2 instance). This difference
causes allowedToCommit=false; meanwhile ZooKeeper-2 is still reported as leader
but it’s not able to commit, so it prevents any requests and the ZooKeeper
ensemble gets stuck.
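As a plain-Java illustration of why this kind of mismatch breaks the comparison (a sketch assuming the check ultimately relies on java.net.InetSocketAddress semantics; it is not the actual ZooKeeper code path): a resolved and an unresolved InetSocketAddress never compare equal, even with the same hostname and port.
{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Sketch: hostname and IP taken from the log above.
public class UnresolvedCompare {
    public static void main(String[] args) throws Exception {
        String host = "my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc";

        // Resolved form, as held for the voting member (host/172.17.0.6:2888);
        // built without a DNS lookup by pairing the hostname with the raw IP.
        InetAddress ip = InetAddress.getByAddress(host, new byte[]{(byte) 172, 17, 0, 6});
        InetSocketAddress resolved = new InetSocketAddress(ip, 2888);

        // Unresolved form, as printed by self.getQuorumAddress() (host/<unresolved>:2888).
        InetSocketAddress unresolved = InetSocketAddress.createUnresolved(host, 2888);

        System.out.println(resolved);                    // .../172.17.0.6:2888
        System.out.println(unresolved);                  // .../<unresolved>:2888
        System.out.println(resolved.equals(unresolved)); // false -> "self" is not matched
    }
}
{code}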
2023-06-19 12:32:51,996 WARN Suggested leader: 1
(org.apache.zookeeper.server.quorum.QuorumPeer)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
2023-06-19 12:32:51,996 WARN This leader is not the designated leader, it will
be initialized with allowedToCommit = false
(org.apache.zookeeper.server.quorum.Leader)
[QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
The overall issue could be related to DNS problems, with DNS records not yet
registered during pod initialization (ZooKeeper is running on Kubernetes). But
we don’t understand why it’s not able to recover somehow.
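On the recovery point, a minimal plain-Java sketch (not the actual ZooKeeper code path) of what recovering would require: an InetSocketAddress created while the DNS record was still missing stays unresolved forever, and only a freshly constructed instance triggers a new lookup.
{code:java}
import java.net.InetSocketAddress;

// Sketch only: re-resolving means building a new InetSocketAddress, because an
// existing unresolved instance never retries the DNS lookup on its own.
public class ReResolve {
    static InetSocketAddress reResolve(InetSocketAddress addr) {
        if (!addr.isUnresolved()) {
            return addr; // already carries an IP, nothing to do
        }
        // Constructing with (host, port) performs a fresh DNS lookup; the result
        // is resolved only if the record exists by now.
        return new InetSocketAddress(addr.getHostString(), addr.getPort());
    }

    public static void main(String[] args) {
        InetSocketAddress stale = InetSocketAddress.createUnresolved(
                "my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc", 2888);
        System.out.println(reResolve(stale)); // still <unresolved> unless DNS now has the record
    }
}
{code}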
What we can't find a reason for is why ZooKeeper 3.6.3 didn't need any
binding-specific configuration and was working just fine, while the new 3.6.4
needs it. Other than that, why do we see the issue described above even after
the binding works fine?
We also tried the latest ZooKeeper 3.8.1 and faced the same issue.
It's worth mentioning that the issue wasn't reproducible in environments like
AKS or OpenShift, but if it's somehow related to slow DNS, it could happen there
anyway (not only on minikube as in our use case).
FYI, here is the discussion we had in the upstream Strimzi community:
https://github.com/strimzi/strimzi-kafka-operator/issues/8675