[ https://issues.apache.org/jira/browse/ZOOKEEPER-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046399#comment-14046399 ]
Hadoop QA commented on ZOOKEEPER-1576:
--------------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12645651/ZOOKEEPER-1576.patch
  against trunk revision 1605517.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2159//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2159//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2159//console

This message is automatically generated.

> Zookeeper cluster - failed to connect to cluster if one of the provided IPs
> causes java.net.UnknownHostException
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1576
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1576
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.0
>         Environment: Three 3.4.3 zookeeper servers in cluster, linux.
>            Reporter: Tally Tsabary
>            Assignee: Edward Ribeiro
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1576-3.4.patch, ZOOKEEPER-1576.3.patch,
> ZOOKEEPER-1576.4.patch, ZOOKEEPER-1576.5.patch, ZOOKEEPER-1576.patch
>
>
> Using a cluster of three 3.4.3 zookeeper servers.
> All the servers are up, but on the client machine, the firewall is blocking
> one of the servers.
> The following exception is happening, and the client is not connected to any
> of the other cluster members.
> The exception:
> Nov 02, 2012 9:54:32 PM com.netflix.curator.framework.imps.CuratorFrameworkImpl logError
> SEVERE: Background exception was not retry-able or retry gave up
> java.net.UnknownHostException: scnrmq003.myworkday.com
> 	at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
> 	at java.net.InetAddress$1.lookupAllHostAddr(Unknown Source)
> 	at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
> 	at java.net.InetAddress.getAllByName0(Unknown Source)
> 	at java.net.InetAddress.getAllByName(Unknown Source)
> 	at java.net.InetAddress.getAllByName(Unknown Source)
> 	at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
> 	at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:440)
> 	at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:375)
> The code at the
> org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
> is :
>     public StaticHostProvider(Collection<InetSocketAddress> serverAddresses)
>             throws UnknownHostException {
>         for (InetSocketAddress address : serverAddresses) {
>             InetAddress resolvedAddresses[] = InetAddress.getAllByName(address
>                     .getHostName());
>             for (InetAddress resolvedAddress : resolvedAddresses) {
>                 this.serverAddresses.add(new InetSocketAddress(resolvedAddress
>                         .getHostAddress(), address.getPort()));
>             }
>         }
>         ......
> The for-loop is not trying to resolve the rest of the servers on the list if
> there is an UnknownHostException at the
> InetAddress.getAllByName(address.getHostName());
> and it fails the client connection creation.
> I was expecting the connection will be created for the other members of the
> cluster.
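
For illustration only, the behaviour the reporter expected could be sketched roughly as below. This is not the attached patch and not ZooKeeper code: the class name TolerantHostResolver, the Resolver hook and the resolveAll method are hypothetical names made up for this example. The idea is simply to catch UnknownHostException per host, keep whatever did resolve, and fail only when nothing resolved at all.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper.
public class TolerantHostResolver {

    /** Hypothetical hook so the DNS lookup can be replaced, for example in tests. */
    public interface Resolver {
        InetAddress[] resolve(String hostName) throws UnknownHostException;
    }

    /** Resolves with the JDK's standard blocking lookup. */
    public static List<InetSocketAddress> resolveAll(
            Collection<InetSocketAddress> serverAddresses) throws UnknownHostException {
        return resolveAll(serverAddresses, InetAddress::getAllByName);
    }

    /**
     * Resolves every host it can and skips the ones that fail.
     * Throws only if no host in the list could be resolved.
     */
    public static List<InetSocketAddress> resolveAll(
            Collection<InetSocketAddress> serverAddresses, Resolver resolver)
            throws UnknownHostException {
        List<InetSocketAddress> resolved = new ArrayList<>();
        List<String> failed = new ArrayList<>();
        for (InetSocketAddress address : serverAddresses) {
            try {
                for (InetAddress ip : resolver.resolve(address.getHostName())) {
                    resolved.add(new InetSocketAddress(ip, address.getPort()));
                }
            } catch (UnknownHostException e) {
                // Remember the failure, but keep going with the remaining servers.
                failed.add(address.getHostName());
            }
        }
        if (resolved.isEmpty()) {
            throw new UnknownHostException(
                    "None of the server addresses could be resolved: " + failed);
        }
        return resolved;
    }
}
```

With a list of three servers where one name does not resolve, resolveAll would return the addresses of the other two instead of aborting on the first failure. Hosts that were skipped are only collected in a local list here; retrying them later is a separate concern that this sketch does not cover.
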
> Also, the InetAddress lookup is a blocking call, and if it takes a very long
> time (longer than the defined timeout), that should also allow us to continue
> to try and connect to the other servers on the list.
> Assuming this will be fixed, and we will get a connection to the currently
> available servers, I think zookeeper should continue to retry connecting
> to the not-connected server of the cluster, so it will be able to use it
> later when it is back.
> If one of the servers on the list is not available during the connection
> creation, then it should be retried every x time despite the fact that we

--
This message was sent by Atlassian JIRA
(v6.2#6252)