[
https://issues.apache.org/jira/browse/ZOOKEEPER-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kezhu Wang reassigned ZOOKEEPER-4708:
-------------------------------------
Assignee: Luke Chen
> ZooKeeper 3.6.4 quorum failing due to <unresolved> address
> ----------------------------------------------------------
>
> Key: ZOOKEEPER-4708
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4708
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.6.4, 3.8.1
> Reporter: Paolo Patierno
> Assignee: Luke Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.10.0
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> We work on the Strimzi project which is about deploying an Apache Kafka
> cluster on Kubernetes together with a ZooKeeper ensamble.
> Until ZooKeeper version 3.6.3 (brought by Kafka 3.4.0), there were no issues
> when running on minikube for development purposes.
> With using ZooKeeper version 3.6.4 (brought by Kafka 3.4.1), we started to
> have issues during the quorum formation and leader election.
> The first one was about ZooKeeper pods not able to bind the quorum port 3888
> to the Cluster IP but during the DNS resolution they get the loopback address
> instead.
> Following a possible log at ZooKeeper startup where you can see the binding
> at 127.0.0.1:3888 instead of something like 172.17.0.4:3888 (so getting a
> valid not loopback IP address).
>
> {code:java}
> INFO 3 is accepting connections now, my election bind port:
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888
> (org.apache.zookeeper.server.quorum.QuorumCnxManager)
> [ListenerHandler-my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/127.0.0.1:3888]{code}
> This specific issue had two solutions: using quorumListenOnAllIPs=true on
> ZooKeeper configuration or binding to 0.0.0.0 address.
> Anyway it is actually not clear why it wasn't needed until 3.6.3, but needed
> for getting 3.6.4 working. What is changed from this perspective?
> Said that, While binding to 0.0.0.0 seems to work fine, using the
> quorumListenOnAllIPs=true doesn't.
> Assuming a ZooKeeper ensamble with 3 nodes, Getting the log of the current
> ZooKeeper leader (ID=3) we see the following.
> (Starting with ** you can see some additional logs added to
> {{org.apache.zookeeper.server.quorum.Leader#getDesignatedLeader}} in order to
> get more information.)
> {code:java}
> 2023-06-19 12:32:51,990 INFO Have quorum of supporters, sids: [[1, 3],[1,
> 3]]; starting up and setting last processed zxid: 0x100000000
> (org.apache.zookeeper.server.quorum.Leader)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,990 INFO **
> newQVAcksetPair.getQuorumVerifier().getVotingMembers().get(self.getId()).addr
> =
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888
> (org.apache.zookeeper.server.quorum.Leader)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,990 INFO ** self.getQuorumAddress() =
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/<unresolved>:2888
> (org.apache.zookeeper.server.quorum.Leader)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,992 INFO ** qs.addr
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:2888,
> qs.electionAddr
> my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.default.svc/172.17.0.6:3888,
> qs.clientAddr/127.0.0.1:12181
> (org.apache.zookeeper.server.quorum.QuorumPeer)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,992 DEBUG zookeeper
> (org.apache.zookeeper.common.PathTrie)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,993 WARN Restarting Leader Election
> (org.apache.zookeeper.server.quorum.QuorumPeer)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)] {code}
> So the leader is ZooKeeper with ID=3 and it was ACKed by the ZooKeeper node
> ID=1.
> As you can see we are in the {{Leader#startZkServer}} method, and because of
> the reconfiguration enabled, the designatedLeader is processed. The problem
> is that the {{Leader#getDesignatedLeader}} is not returning “self” as leader
> but another one (ID=1), because of the difference in the quorum address.
> From the above log, it’s not an actual difference in terms of addresses but
> the {{self.getQuorumAddress()}} is returning an <unresolved> (even if it’s
> still the same hostname related to ZooKeeper-2 instance). This difference
> causes the allowedToCommit=false, meanwhile the ZooKeeper-2 is still reported
> as leader but it’s not able to commit, so prevents any requests and the
> ZooKeeper ensemble gets stuck.
> {code:java}
> 2023-06-19 12:32:51,996 WARN Suggested leader: 1
> (org.apache.zookeeper.server.quorum.QuorumPeer)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)]
> 2023-06-19 12:32:51,996 WARN This leader is not the designated leader, it
> will be initialized with allowedToCommit = false
> (org.apache.zookeeper.server.quorum.Leader)
> [QuorumPeer[myid=3](plain=127.0.0.1:12181)(secure=0.0.0.0:2181)] {code}
> The overall issue could be related to DNS problems, with DNS records not
> registered yet during pod initialization (where ZooKeeper is running on
> Kubernetes). But we don’t understand why it’s not able to recover somehow.
> What we don't get a reason is why ZooKeeper 3.6.3 didn't need any binding
> specific configuration and was working just fine, while the new 3.6.4 needs
> that. Other than that, why we see the above described issue after the binding
> works fine.
> We also tried by using latest ZooKeeper 3.8.1 facing the same issue.
> It's worth mentioning that the issue wasn't reproducible in environment like
> AKS or OpenShift, but if it's somehow slow DNS related, it could happen
> anyway (even not on minikube as in our use case).
> FYI, the discussion we had in the upstream Strimzi community
> [https://github.com/strimzi/strimzi-kafka-operator/issues/8675]
> There is also a thread on the ZooKeeper users mailing list
> [https://lists.apache.org/thread/vsff39wpsrxb431hdw9db4np12vytdpm]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)