[ 
https://issues.apache.org/jira/browse/HDFS-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481989#comment-14481989
 ] 

Kihwal Lee commented on HDFS-8068:
----------------------------------

{{FailoverOnNetworkExceptionRetry#shouldRetry()}} treats 
{{UnknownHostException}} as retriable, but in its current form it is not, 
because each retry goes through the same broken proxy. If we are to support 
transparent retry and recovery, there has to be a way to tell the failover 
proxy provider to abandon the broken underlying proxy and recreate it when 
{{UnknownHostException}} is received. This can be a bit ugly. An alternative 
is to have the failover proxy provider check the address of the existing 
proxy object in {{getProxy()}} and recreate the proxy if the address is bad. 
The newly created one may still be broken, causing calls to throw 
{{UnknownHostException}}, but if DNS recovers, a call will eventually succeed 
after some number of retries (recreations). The drawback is that this has to 
be fixed at the level of each pluggable failover proxy provider. Is namenode 
HA also supposed to cover infrastructure outages? If not, the v1 patch should 
be sufficient.
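
A minimal sketch of the {{getProxy()}} alternative, to make the retry behavior 
concrete. It is not Hadoop code: {{RecreatingProxyProvider}}, its nested 
{{Proxy}} class and the {{Supplier}} standing in for DNS are all hypothetical, 
and the only real API relied on is {{InetSocketAddress#isUnresolved()}}.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.function.Supplier;

// Hypothetical stand-in for a failover proxy provider; not Hadoop's API.
final class RecreatingProxyProvider {
    // Hypothetical proxy: it only remembers the address it was created with.
    static final class Proxy {
        final InetSocketAddress address;
        Proxy(InetSocketAddress address) { this.address = address; }
    }

    private final Supplier<InetSocketAddress> resolver; // re-runs the lookup
    private Proxy proxy;
    private int recreations;

    RecreatingProxyProvider(Supplier<InetSocketAddress> resolver) {
        this.resolver = resolver;
        this.proxy = new Proxy(resolver.get());
    }

    // Returns the current proxy, rebuilding it first if its address is unresolved.
    // The rebuilt proxy may still be unresolved; the next call tries again.
    synchronized Proxy getProxy() {
        if (proxy.address.isUnresolved()) {
            proxy = new Proxy(resolver.get());
            recreations++;
        }
        return proxy;
    }

    synchronized int recreations() { return recreations; }

    public static void main(String[] args) {
        // Simulated DNS (assumption): the first two lookups fail, then it recovers.
        int[] lookups = {0};
        Supplier<InetSocketAddress> flakyDns = () -> lookups[0]++ < 2
            ? InetSocketAddress.createUnresolved("nn1.example.invalid", 8020)
            : new InetSocketAddress(InetAddress.getLoopbackAddress(), 8020);

        RecreatingProxyProvider provider = new RecreatingProxyProvider(flakyDns);
        for (int attempt = 1; attempt <= 3; attempt++) {
            Proxy p = provider.getProxy();
            System.out.println("attempt " + attempt
                + ": unresolved=" + p.address.isUnresolved());
        }
    }
}
```

In this sketch the retry policy is left unchanged; recovery comes only from 
each retry calling {{getProxy()}} again, which is why every pluggable provider 
would need the same check.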

> Do not retry rpc calls If the proxy contains unresolved address
> ---------------------------------------------------------------
>
>                 Key: HDFS-8068
>                 URL: https://issues.apache.org/jira/browse/HDFS-8068
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-8068.v1.patch
>
>
> When the InetSocketAddress object happens to be unresolvable (e.g. due to a 
> transient DNS issue), the rpc proxy object will not be usable, since the 
> client will throw UnknownHostException when a Connection object is created. 
> If FailoverOnNetworkExceptionRetry is used, as in the standard HA failover 
> proxy, the call will be retried, but it will never recover. Instead, the 
> validity of the address must be checked on proxy creation, and an exception 
> thrown if it is invalid.
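
The check described in the issue text above can be sketched as follows. This 
is an illustration under stated assumptions, not the contents of 
HDFS-8068.v1.patch: {{ProxyAddressCheck}} and {{requireResolved}} are 
hypothetical names, and the only real APIs used are 
{{InetSocketAddress#isUnresolved()}} and {{UnknownHostException}}.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper; not part of Hadoop.
final class ProxyAddressCheck {
    // Rejects an unresolved address before any proxy is built around it,
    // so the failure surfaces at creation time instead of on every retry.
    static InetSocketAddress requireResolved(InetSocketAddress addr)
            throws UnknownHostException {
        if (addr.isUnresolved()) {
            throw new UnknownHostException(
                "Unresolved address: " + addr.getHostString());
        }
        return addr;
    }

    public static void main(String[] args) {
        InetSocketAddress good =
            new InetSocketAddress(InetAddress.getLoopbackAddress(), 8020);
        InetSocketAddress bad =
            InetSocketAddress.createUnresolved("nn1.example.invalid", 8020);
        for (InetSocketAddress addr : new InetSocketAddress[] {good, bad}) {
            try {
                requireResolved(addr);
                System.out.println("usable: " + addr);
            } catch (UnknownHostException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```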



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
