[ https://issues.apache.org/jira/browse/HDFS-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251420#comment-14251420 ]
Frantisek Vacek commented on HDFS-7392:
---------------------------------------

The problem can be solved by a different implementation of SecurityUtils.StandardHostResolver.getByName(String host).

Current implementation:
{code}
interface HostResolver {
  InetAddress getByName(String host) throws UnknownHostException;
}

/**
 * Uses standard java host resolution
 */
static class StandardHostResolver implements HostResolver {
  @Override
  public InetAddress getByName(String host) throws UnknownHostException {
    return InetAddress.getByName(host);
  }
}
{code}

A proper implementation should look like this:
{code}
interface HostResolver {
  InetAddress[] getByName(String host) throws UnknownHostException;
}

/**
 * Uses standard java host resolution
 */
static class StandardHostResolver implements HostResolver {
  @Override
  public InetAddress[] getByName(String host) throws UnknownHostException {
    return InetAddress.getAllByName(host);
  }
}
{code}

> org.apache.hadoop.hdfs.DistributedFileSystem open invalid URI forever
> ---------------------------------------------------------------------
>
>                 Key: HDFS-7392
>                 URL: https://issues.apache.org/jira/browse/HDFS-7392
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>            Reporter: Frantisek Vacek
>            Assignee: Yi Liu
>         Attachments: 1.png, 2.png
>
>
> In some specific circumstances, org.apache.hadoop.hdfs.DistributedFileSystem.open(invalid URI) never times out and runs forever.
> The specific circumstances are:
> 1) The HDFS URI (hdfs://share.example.com:8020/someDir/someFile.txt) should point to a valid IP address, but with no name node service running on it.
> 2) There should be at least 2 IP addresses for such a URI. See the output below:
> {quote}
> [~/proj/quickbox]$ nslookup share.example.com
> Server: 127.0.1.1
> Address: 127.0.1.1#53
> share.example.com canonical name = internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com.
> Name: internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.223
> Name: internal-realm-share-example-com-1234.us-east-1.elb.amazonaws.com
> Address: 192.168.1.65
> {quote}
> In such a case, org.apache.hadoop.ipc.Client.Connection.updateAddress() sometimes returns true (even if the address didn't actually change, see img. 1) and the timeoutFailures counter is reset to 0 (see img. 2). The maxRetriesOnSocketTimeouts limit (45) is never reached and the connection attempt is repeated forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
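
To illustrate the failure mode described in the issue, here is a minimal sketch of why a host with two A records can look like a "changed" address on every retry, and how checking against the full resolved set avoids that. This is a simplified model, not Hadoop's actual code: the class AddressChangeSketch and the helpers addr, changedSingle and changedAll are hypothetical names introduced only for this example, and the addresses are the two from the nslookup output above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

public class AddressChangeSketch {

    /** Builds an IPv4 InetAddress from literal octets; no DNS lookup is performed. */
    public static InetAddress addr(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            // Only thrown for an illegal address length, which cannot happen here.
            throw new IllegalArgumentException(e);
        }
    }

    /** Single-address check: reports a change whenever the first resolved address differs. */
    public static boolean changedSingle(InetAddress current, InetAddress[] resolved) {
        return !resolved[0].equals(current);
    }

    /** Multi-address check: reports a change only if the current address is absent from the answer. */
    public static boolean changedAll(InetAddress current, InetAddress[] resolved) {
        return !Arrays.asList(resolved).contains(current);
    }

    public static void main(String[] args) {
        InetAddress a = addr(192, 168, 1, 223);
        InetAddress b = addr(192, 168, 1, 65);

        // Assumed DNS behaviour: the same two records come back in rotated order.
        InetAddress[] rotated = {b, a};

        // The client is currently connected to 'a'.
        System.out.println(changedSingle(a, rotated)); // true: would reset the timeout counter
        System.out.println(changedAll(a, rotated));    // false: the retry limit can be reached
    }
}
```

With the single-address check, each rotation of the DNS answer is treated as a new server, which matches the reported symptom of timeoutFailures going back to 0. Whether Hadoop should adopt the set-based comparison, and where, is left to the actual patch.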