[ https://issues.apache.org/jira/browse/ZOOKEEPER-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553071#comment-16553071 ]
Andor Molnar commented on ZOOKEEPER-3100:
-----------------------------------------

Btw. Regarding the embedded ZooKeeper: this might not be a bug exactly, because you should follow the same configuration convention on both sides: e.g. if you configure your server to listen on 127.0.0.1, you should set up your client exactly the same way.

> ZooKeeper client times out due to random choice of resolved addresses
> ---------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3100
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3100
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.4.13
>            Reporter: Rajini Sivaram
>            Assignee: Andor Molnar
>            Priority: Major
>
> The changes to ZooKeeper clients to re-resolve hosts made under
> ZOOKEEPER-2184 result in delays when only a subset of the addresses that a
> host resolves to are actually reachable. This can result in connection
> timeouts on the client.
> For example, when running tests with a single ZooKeeper server accepting
> connections on 127.0.0.1 on a host that has both IPv4 and IPv6, we have seen
> connection timeouts in tests if the client connects using `localhost` rather
> than `127.0.0.1`. The ZooKeeper client resolves `localhost` to both the IPv4
> and IPv6 addresses and chooses a random one. If IPv6 was chosen, a fixed
> one-second backoff is applied before the retry since there is only one
> hostname specified. After the backoff, `localhost` is resolved again and a
> random address is chosen, which could also be the unconnectable IPv6 address.
> For the list of host names specified for connection, the clients do
> round-robin without backoffs until connections to all hostnames are
> attempted. Can we also do the same for the addresses that each of the hosts
> resolves to, so that backoffs are only applied after connection to each
> address is attempted once and every address is connected to once using
> round-robin rather than random selection? This will avoid delays in cases
> where at least one address can be connected to.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
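
For illustration only, the round-robin behaviour requested in the description could look roughly like the sketch below. It is not the actual patch and not ZooKeeper client code: `RoundRobinAddresses`, `resolve`, `next` and `lastCallCompletedPass` are hypothetical names made up for this example. The only real API it relies on is the JDK's `InetAddress.getAllByName`, which returns every address a host name resolves to.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of the ZooKeeper client): hands out the
 * resolved addresses of one host in round-robin order and reports when a
 * full pass over them has just finished, so that a caller can apply its
 * backoff only after every address has been tried once.
 */
final class RoundRobinAddresses {
    private final List<InetAddress> addresses;
    private int cursor = 0;
    private boolean completedPass = false;

    RoundRobinAddresses(List<InetAddress> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one address is required");
        }
        this.addresses = new ArrayList<>(addresses);
    }

    /** Resolves the host name to all of its addresses (e.g. IPv4 and IPv6). */
    static RoundRobinAddresses resolve(String host) throws UnknownHostException {
        return new RoundRobinAddresses(Arrays.asList(InetAddress.getAllByName(host)));
    }

    /** Returns the next address to try, cycling through the list in order. */
    InetAddress next() {
        InetAddress address = addresses.get(cursor);
        cursor = (cursor + 1) % addresses.size();
        // The cursor wraps to 0 exactly when the last address was handed out.
        completedPass = (cursor == 0);
        return address;
    }

    /** True if the most recent next() call returned the last address of a pass. */
    boolean lastCallCompletedPass() {
        return completedPass;
    }
}
```

A connect loop built on this sketch would call `next()`, attempt the connection, and sleep for the backoff interval only when the attempt failed and `lastCallCompletedPass()` is true. With `localhost` resolving to both `::1` and `127.0.0.1`, the reachable IPv4 address would then be tried at the latest on the second attempt, without an intervening backoff.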