[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838516#comment-13838516
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1576:
---------------------------------------------------

Sure [~fournc] - but tbh requesting minimum formatting diligence (i.e., 
spelling, indentation and consistency) on every + line (not the context ones) 
isn't trivial formatting, just basic sanity. Sadly, many patches lack it, 
which makes things very hard for people testing patches that haven't been 
committed yet, testing trunk, etc. Being able to read through those (and 
related) patches fast and without losing focus because of inconsistent style 
really helps when testing stuff.

It would certainly have helped me when finding bugs like ZOOKEEPER-1805 and 
ZOOKEEPER-1807 for which I had to read through a lot of code with mixed styles 
or no style at all :(

> Zookeeper cluster - failed to connect to cluster if one of the provided IPs 
> causes java.net.UnknownHostException
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1576
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1576
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.0
>         Environment: Three 3.4.3 zookeeper servers in cluster, linux.
>            Reporter: Tally Tsabary
>            Assignee: Edward Ribeiro
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1576.3.patch, ZOOKEEPER-1576.4.patch, 
> ZOOKEEPER-1576.5.patch
>
>
> Using a cluster of three 3.4.3 zookeeper servers.
> All the servers are up, but on the client machine, the firewall is blocking 
> one of the servers.
> The following exception is thrown, and the client does not connect to any 
> of the other cluster members.
> The exception:
> Nov 02, 2012 9:54:32 PM com.netflix.curator.framework.imps.CuratorFrameworkImpl logError
> SEVERE: Background exception was not retry-able or retry gave up
> java.net.UnknownHostException: scnrmq003.myworkday.com
> at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
> at java.net.InetAddress$1.lookupAllHostAddr(Unknown Source)
> at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
> at java.net.InetAddress.getAllByName0(Unknown Source)
> at java.net.InetAddress.getAllByName(Unknown Source)
> at java.net.InetAddress.getAllByName(Unknown Source)
> at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
> at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:440)
> at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:375)
> The code at 
> org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60) 
> is:
> public StaticHostProvider(Collection<InetSocketAddress> serverAddresses)
>         throws UnknownHostException {
>     for (InetSocketAddress address : serverAddresses) {
>         InetAddress resolvedAddresses[] =
>                 InetAddress.getAllByName(address.getHostName());
>         for (InetAddress resolvedAddress : resolvedAddresses) {
>             this.serverAddresses.add(new InetSocketAddress(
>                     resolvedAddress.getHostAddress(), address.getPort()));
>         }
>     }
>     ......
> The for-loop does not try to resolve the rest of the servers on the list if 
> InetAddress.getAllByName(address.getHostName()) throws an 
> UnknownHostException, and the client connection creation fails.
> I was expecting the connection to be created for the other members of the 
> cluster. 
> Also, InetAddress.getAllByName is a blocking call, and if it takes a very 
> long time (longer than the defined timeout), that should also allow us to 
> continue to try to connect to the other servers on the list.
> Assuming this is fixed and we get a connection to the currently 
> available servers, I think ZooKeeper should keep retrying the connection 
> to the unreachable server of the cluster, so that it can use it 
> later when it is back.
> If one of the servers on the list is not available during the connection 
> creation, then it should be retried every x time despite the fact that we 
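
To make the behaviour requested in the issue description concrete, here is a
minimal sketch of a resolution loop that skips hosts that fail to resolve
instead of aborting on the first UnknownHostException. It is not the attached
patch and not ZooKeeper code: the names TolerantHostResolver, Resolver and
resolveAll are hypothetical and exist only for illustration. It also does not
address the blocking-lookup/timeout concern, only the skip-on-failure part.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, for illustration only (not part of ZooKeeper).
public class TolerantHostResolver {

    // Hypothetical seam so the DNS lookup can be swapped out, e.g. in tests.
    public interface Resolver {
        InetAddress[] getAllByName(String host) throws UnknownHostException;
    }

    // Resolves every server it can; fails only if no address resolved at all.
    public static List<InetSocketAddress> resolveAll(
            Collection<InetSocketAddress> serverAddresses, Resolver resolver)
            throws UnknownHostException {
        List<InetSocketAddress> resolved = new ArrayList<>();
        UnknownHostException lastFailure = null;
        for (InetSocketAddress address : serverAddresses) {
            try {
                // getHostString() avoids the reverse lookup getHostName() may do.
                for (InetAddress ia : resolver.getAllByName(address.getHostString())) {
                    resolved.add(new InetSocketAddress(ia, address.getPort()));
                }
            } catch (UnknownHostException e) {
                // Remember the failure, but keep going with the remaining hosts.
                lastFailure = e;
            }
        }
        if (resolved.isEmpty()) {
            throw lastFailure != null
                    ? lastFailure
                    : new UnknownHostException("no server addresses given");
        }
        return resolved;
    }

    // Convenience overload that uses the JDK resolver.
    public static List<InetSocketAddress> resolveAll(
            Collection<InetSocketAddress> serverAddresses)
            throws UnknownHostException {
        return resolveAll(serverAddresses, InetAddress::getAllByName);
    }
}
```

With a list of three servers where one name cannot be resolved, resolveAll
returns the addresses of the other two; it only throws when none of the hosts
resolve. Whether skipped hosts should be retried later is a separate design
question that this sketch leaves open.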



--
This message was sent by Atlassian JIRA
(v6.1#6144)
