[ https://issues.apache.org/jira/browse/ZOOKEEPER-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559776#comment-16559776 ]
Andor Molnar commented on ZOOKEEPER-3100:
-----------------------------------------

[~rsivaram]

I've run a few tests with the current 3.4 and 3.5 versions of ZooKeeper and got the same results. I spoke a little too soon regarding the wildcard address, because ZooKeeper opens a unified socket this way. Although netstat shows that ZooKeeper is listening only on a v6 socket, clients are able to connect with both protocols:

{noformat}
[andor@andor-centos zkconf]$ sudo netstat -plnt | grep 2181
tcp6       0      0 :::2181                 :::*                    LISTEN      9249/java
[andor@andor-centos zkconf]$ echo "stat" | nc -4 -v localhost 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:2181.
stat is not executed because it is not in the whitelist.
Ncat: 5 bytes sent, 57 bytes received in 0.01 seconds.
[andor@andor-centos zkconf]$ echo "stat" | nc -6 -v localhost 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to ::1:2181.
stat is not executed because it is not in the whitelist.
Ncat: 5 bytes sent, 57 bytes received in 0.01 seconds.
{noformat}

So, back to your original issue: I'm not able to repro it. The CLI also works perfectly for me. I need to look into the Kafka ticket; it must be something specific to that client.

> ZooKeeper client times out due to random choice of resolved addresses
> ---------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3100
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3100
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.13
>            Reporter: Rajini Sivaram
>            Assignee: Andor Molnar
>            Priority: Major
>
> The changes to ZooKeeper clients to re-resolve hosts made under
> ZOOKEEPER-2184 result in delays when only a subset of the addresses that a
> host resolves to are actually reachable. This can result in connection
> timeouts on the client.
> For example, when running tests with a single ZooKeeper server accepting
> connections on 127.0.0.1 on a host that has both IPv4 and IPv6, we have seen
> connection timeouts in tests if client connects using `localhost` rather than
> `127.0.0.1`. ZooKeeper client resolves `localhost` to both the IPv4 and IPv6
> addresses and chooses a random one. If IPv6 was chosen, a fixed one second
> backoff is applied before retry since there is only one hostname specified.
> After backoff, `localhost` is resolved again and a random address chosen,
> which could also be the unconnectable IPv6 address.
>
> For the list of host names specified for connection, the clients do
> round-robin without backoffs until connections to all hostnames are
> attempted. Can we also do the same for addresses that each of the hosts
> resolves to, so that backoffs are only applied after connection to each
> address is attempted once and every address is connected to once using
> round-robin rather than random selection? This will avoid delays in cases
> where at least one address can be connected to.
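
The behaviour requested in the description can be sketched roughly as follows. This is only an illustration of the idea, not the ZooKeeper client code: `AddressCycle`, `next()` and `passComplete()` are hypothetical names invented for this sketch. In a real client the address list would presumably come from `InetAddress.getAllByName(host)`; the demo uses literal IPs so that it runs without DNS.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the ZooKeeper client API. */
final class AddressCycle {
    private final List<InetAddress> addresses;
    private long handedOut = 0; // total number of addresses returned so far

    AddressCycle(List<InetAddress> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one address is required");
        }
        this.addresses = new ArrayList<>(addresses); // defensive copy
    }

    /** Returns the next address in round-robin order instead of a random one. */
    InetAddress next() {
        InetAddress address = addresses.get((int) (handedOut % addresses.size()));
        handedOut++;
        return address;
    }

    /**
     * True when the address most recently returned by next() was the last one
     * of a full pass. A caller would back off only at this point.
     */
    boolean passComplete() {
        return handedOut > 0 && handedOut % addresses.size() == 0;
    }

    public static void main(String[] args) {
        // Literal IPs are parsed without a DNS lookup.
        List<InetAddress> resolved = new ArrayList<>();
        resolved.add(new InetSocketAddress("::1", 2181).getAddress());
        resolved.add(new InetSocketAddress("127.0.0.1", 2181).getAddress());

        AddressCycle cycle = new AddressCycle(resolved);
        for (int attempt = 0; attempt < 4; attempt++) {
            InetAddress candidate = cycle.next();
            System.out.println("try " + candidate.getHostAddress()
                    + (cycle.passComplete() ? ", then back off" : ", then retry immediately"));
        }
    }
}
```

With two resolved addresses, a failed attempt on the first is followed immediately by an attempt on the second, and the backoff is applied only after both have been tried once. This is the case the description says would avoid delays whenever at least one address is reachable.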