Rhys Yarranton created ZOOKEEPER-3825:
-----------------------------------------
Summary: StaticHostProvider.updateServerList address matching
fails when connectString uses IP addresses
Key: ZOOKEEPER-3825
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3825
Project: ZooKeeper
Issue Type: Bug
Components: java client
Affects Versions: 3.5.5
Reporter: Rhys Yarranton
StaticHostProvider.updateServerList contains address matching like this:
{code:java}
for (InetSocketAddress addr : shuffledList) {
    if (addr.getPort() == myServer.getPort()
            && ((addr.getAddress() != null
                    && myServer.getAddress() != null
                    && addr.getAddress().equals(myServer.getAddress()))
                || addr.getHostString().equals(myServer.getHostString()))) {
        myServerInNewConfig = true;
        break;
    }
}
{code}
The addresses in shuffledList are unresolved, while the current server address
in myServer is a resolved address (coming from a socket). If the connect
string is expressed in terms of IP addresses instead of host names, the two
won't match even when they represent the same server.
On the unresolved addresses, getAddress() is null, and getHostString() is
something like 1.2.3.4. On the resolved address, getAddress() is not null, and
getHostString() is (normally) the canonical host name corresponding to the IP
address.
As a result, this method tends to return true (signalling a reconfig) when it
should not. The calling method, ZooKeeper.updateServerList, then closes the
connection.
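The mismatch described above can be reproduced outside ZooKeeper with plain java.net classes. The following is only a sketch: the class name AddressMatchDemo and the host name zk1.example.com are made up for illustration, and the resolved address is built by hand to stand in for a socket's remote address rather than taken from a real connection.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class AddressMatchDemo {
    // Same condition as the loop body quoted from StaticHostProvider.updateServerList.
    static boolean matches(InetSocketAddress addr, InetSocketAddress myServer) {
        return addr.getPort() == myServer.getPort()
                && ((addr.getAddress() != null
                        && myServer.getAddress() != null
                        && addr.getAddress().equals(myServer.getAddress()))
                    || addr.getHostString().equals(myServer.getHostString()));
    }

    // Unresolved address, as built from a connect string that uses an IP literal.
    static InetSocketAddress fromConnectString() {
        return InetSocketAddress.createUnresolved("1.2.3.4", 2181);
    }

    // Resolved address that carries a host name, standing in for the socket's
    // remote address. The name is hypothetical; getByAddress does no DNS lookup.
    static InetSocketAddress fromSocket() {
        try {
            InetAddress ip = InetAddress.getByAddress(
                    "zk1.example.com", new byte[] {1, 2, 3, 4});
            return new InetSocketAddress(ip, 2181);
        } catch (UnknownHostException e) {
            // Only thrown for an illegal byte-array length, which cannot happen here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress listed = fromConnectString();
        InetSocketAddress current = fromSocket();
        System.out.println("listed  getAddress=" + listed.getAddress()
                + " getHostString=" + listed.getHostString());
        System.out.println("current getAddress=" + current.getAddress()
                + " getHostString=" + current.getHostString());
        // Same server and port, yet neither branch of the condition is satisfied.
        System.out.println("matches=" + matches(listed, current));
    }
}
```

The first branch fails because the unresolved entry has no InetAddress, and the second fails because the two host strings differ (IP literal versus host name), so the current server is treated as missing from the new configuration.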
This might be written off as not too serious, except that Curator calls this
method when there is a connection state change (sometimes many times). What
we observe is that when the client has to reconnect, _e.g._, after a server
failure, the new socket gets closed right away. The client goes into a cycle
of death until the session dies and a new one is created. (This doesn't seem
like very nice behaviour on Curator's part, but that's what's out there.)
As a workaround, we implemented a custom HostProvider to filter out calls to
updateServerList which don't actually change the list.
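The filtering part of such a workaround might look roughly like the sketch below. This is not the code we deployed: ServerListChangeFilter and its methods are hypothetical names, and the class is deliberately independent of ZooKeeper so it can run on its own. In practice a custom HostProvider would presumably consult something like this in its updateServerList and only delegate to the wrapped provider when the list really changed.

```java
import java.net.InetSocketAddress;
import java.util.Collection;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: remembers the last server list and reports whether a
// newly proposed list actually differs from it.
final class ServerListChangeFilter {
    private Set<String> current;

    ServerListChangeFilter(Collection<InetSocketAddress> initial) {
        this.current = normalize(initial);
    }

    // Reduce each address to a lower-cased "host:port" key so that ordering
    // and resolution state do not affect the comparison. getHostString()
    // never triggers a reverse DNS lookup.
    static Set<String> normalize(Collection<InetSocketAddress> servers) {
        Set<String> keys = new TreeSet<>();
        for (InetSocketAddress s : servers) {
            keys.add(s.getHostString().toLowerCase(Locale.ROOT) + ":" + s.getPort());
        }
        return keys;
    }

    // Returns true (and remembers the new list) only when the proposed list
    // differs from the one currently in use; returns false for a no-op update.
    synchronized boolean acceptIfChanged(Collection<InetSocketAddress> proposed) {
        Set<String> next = normalize(proposed);
        if (next.equals(current)) {
            return false;
        }
        current = next;
        return true;
    }
}
```

With this in place, a repeated call carrying the same connect string is dropped before it reaches the address-matching logic, so the connection is not closed.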
As a permanent fix, instead of passing the current host based on the socket
remote address, the client may need to remember the unresolved address that
was used to connect. (Or use the original strings.)
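To illustrate the idea, here is a rough sketch of what matching on the remembered unresolved address could look like. It is an assumption about one possible shape of the fix, not a patch: RememberedHostMatch and inNewConfig are hypothetical names, and the real change would have to be made where the client tracks the server it is connected to.

```java
import java.net.InetSocketAddress;
import java.util.Collection;

// Hypothetical sketch: compare the address the client originally connected
// with (still unresolved) against the unresolved entries of the new list.
final class RememberedHostMatch {
    static boolean inNewConfig(InetSocketAddress connectedVia,
                               Collection<InetSocketAddress> newServers) {
        for (InetSocketAddress addr : newServers) {
            // Both sides come from connect strings, so the host strings are
            // directly comparable whether they are IP literals or host names.
            if (addr.getPort() == connectedVia.getPort()
                    && addr.getHostString().equalsIgnoreCase(connectedVia.getHostString())) {
                return true;
            }
        }
        return false;
    }
}
```

Because neither side has passed through resolution, an IP-based connect string compares equal to itself and no spurious reconfig is signalled.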
Filed this against 3.5.5. Based on source control, it looks like this still
exists on master at the time of writing.