[
https://issues.apache.org/jira/browse/ZOOKEEPER-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369441#comment-16369441
]
Eron Wright commented on ZOOKEEPER-2982:
-----------------------------------------
Looking at `Learner` in 3.4 versus 3.5, a necessary call within `findLeader` to
`recreateSocketAddresses` was added in 3.4 but not ported to 3.5. The
suggested fix is to add the call.
Note that other elements of ZOOKEEPER-1506 were already ported:
https://github.com/apache/zookeeper/commit/d2a49163b7bc7c9589140dbba7f60e591028f908
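
For illustration, here is a minimal sketch of the re-resolution idea, assuming the peer address is held as an `InetSocketAddress`. The class `ResolveRetrySketch` and the helper `reResolve` are hypothetical names and are not ZooKeeper's actual `recreateSocketAddresses`; the point is only that an unresolved `InetSocketAddress` never resolves itself later, so a fresh one has to be built from the hostname before each connection attempt.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical sketch; not the ZooKeeper implementation.
final class ResolveRetrySketch {

    // Build a new socket address from the host string and port of the old one,
    // performing a fresh lookup. If the name still does not resolve, return an
    // unresolved address so the caller can try again on its next attempt.
    static InetSocketAddress reResolve(InetSocketAddress addr) {
        String host = addr.getHostString(); // no reverse lookup is triggered
        int port = addr.getPort();
        try {
            InetAddress resolved = InetAddress.getByName(host);
            return new InetSocketAddress(resolved, port);
        } catch (UnknownHostException e) {
            return InetSocketAddress.createUnresolved(host, port);
        }
    }

    public static void main(String[] args) {
        // An IP literal stands in for a peer hostname so no DNS is required.
        InetSocketAddress stale = InetSocketAddress.createUnresolved("127.0.0.1", 2888);
        InetSocketAddress fresh = reResolve(stale);
        System.out.println("stale unresolved: " + stale.isUnresolved());
        System.out.println("fresh unresolved: " + fresh.isUnresolved());
    }
}
```

A caller would invoke something like `reResolve` on the leader's address right before connecting, which is roughly where `findLeader` sits in the flow described above. The JVM's own negative lookup cache (the `networkaddress.cache.negative.ttl` security property) still applies on top of this, so a retry may need to wait for that entry to expire.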
> Re-try DNS hostname -> IP resolution
> ------------------------------------
>
> Key: ZOOKEEPER-2982
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2982
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.0, 3.5.1, 3.5.3
> Reporter: Eron Wright
> Priority: Blocker
> Fix For: 3.5.4
>
>
> ZOOKEEPER-1506 fixed a DNS resolution issue in 3.4. Some portions of the fix
> haven't yet been ported to 3.5.
> To recap the outstanding problem in 3.5, if a given ZK server is started
> before all peer addresses are resolvable, that server may cache a negative
> lookup result and forever fail to resolve the address. For example,
> deploying ZK 3.5 to Kubernetes using a StatefulSet plus a headless Service
> may fail because the DNS records are created lazily.
> {code}
> 2018-02-18 09:11:22,583 [myid:0] - WARN [QuorumPeer[myid=0](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):Follower@95] - Exception when following the leader
> java.net.UnknownHostException: zk-2.zk.default.svc.cluster.local
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:227)
> at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:256)
> at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:76)
> at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133)
> {code}