Github user revans2 commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/648#discussion_r221376870 --- Diff: src/java/main/org/apache/zookeeper/ClientCnxn.java --- @@ -990,6 +992,27 @@ private void sendPing() { private boolean saslLoginFailed = false; private void startConnect(InetSocketAddress addr) throws IOException { + boolean canonicalize = true; + try { + canonicalize = Boolean.parseBoolean(System.getProperty(ZK_SASL_CLIENT_CANONICALIZE_HOSTNAME, "true")); + } catch (IllegalArgumentException ea) { + //ignored ... + } + + if (canonicalize) { + try { + InetAddress ia = addr.getAddress(); + LOG.warn("ia {}", ia); + if (ia == null) { + ia = InetAddress.getByName(addr.getHostName()); + } + String host = (ia != null) ? ia.getCanonicalHostName() : addr.getHostName(); + addr = new InetSocketAddress(InetAddress.getByAddress(host, ia.getAddress()), addr.getPort()); --- End diff -- I put it here because there is a possible race. The entire reason we have the setup we do is so that we can change the nodes in the cluster without changing the config on the client side. If the code that establishes the connection uses a different address from the code that creates the principal there is the possibility, because of the magic of DNS caching, that the connection would be made to a different box from the one the SASL client is expecting. If that happens, and we have the default Client section in the jaas conf, it will not result in an error. Instead the ZK client logs something about SASL failed and still connects but is not authorized to do anything. If it failed and retried a different node I would be happy to move it, but it does not and that failure would not be transparent to the end user.
---