[
https://issues.apache.org/jira/browse/HDFS-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481989#comment-14481989
]
Kihwal Lee commented on HDFS-8068:
----------------------------------
{{FailoverOnNetworkExceptionRetry#shouldRetry()}} treats
{{UnknownHostException}} as retriable, but in its current form the retry can
never succeed. If we are to support transparent retry and recovery, there has
to be a way to tell the failover proxy provider to abandon the underlying
broken proxy and recreate it on reception of {{UnknownHostException}}. This can
be a bit ugly. An alternative is to have the failover proxy provider check the
address of the existing proxy object in {{getProxy()}} and recreate it if the
address is bad. The newly created proxy may still be broken, causing calls to
throw {{UnknownHostException}}, but if DNS recovers, it will eventually succeed
after some number of retries (recreations). The drawback is that this has to be
fixed at the pluggable failover proxy provider level. Is namenode HA also
supposed to cover infrastructure outages? If not, the v1 patch should be
sufficient.
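
For illustration only, a minimal sketch of the alternative described above. The
class name and the {{createProxy()}} factory are hypothetical placeholders, not
part of the attached patch or of the existing failover proxy providers; the
only real API used is {{java.net.InetSocketAddress}}.

{code:java}
import java.net.InetSocketAddress;

class AddressCheckingProxyProvider<T> {
    private final String host;
    private final int port;
    private InetSocketAddress currentAddr;  // address backing the cached proxy
    private T currentProxy;                 // cached RPC proxy, may be broken

    AddressCheckingProxyProvider(String host, int port) {
        this.host = host;
        this.port = port;
    }

    synchronized T getProxy() {
        // Recreate the proxy when the cached address never resolved, so a
        // later call can succeed once DNS recovers.
        if (currentProxy == null || currentAddr == null || currentAddr.isUnresolved()) {
            currentAddr = new InetSocketAddress(host, port); // re-resolves via DNS
            if (!currentAddr.isUnresolved()) {
                currentProxy = createProxy(currentAddr);
            }
        }
        return currentProxy;
    }

    // Placeholder for whatever RPC factory the real provider uses.
    private T createProxy(InetSocketAddress addr) {
        throw new UnsupportedOperationException("RPC proxy creation elided");
    }
}
{code}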
> Do not retry rpc calls If the proxy contains unresolved address
> ---------------------------------------------------------------
>
> Key: HDFS-8068
> URL: https://issues.apache.org/jira/browse/HDFS-8068
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-8068.v1.patch
>
>
> When the InetSocketAddress object happens to be unresolvable (e.g. due to
> transient DNS issue), the rpc proxy object will not be usable since the
> client will throw UnknownHostException when a Connection object is created.
> If FailoverOnNetworkExceptionRetry is used as in the standard HA failover
> proxy, the call will be retried, but this will never recover. Instead, the
> validity of the address must be checked on proxy creation and an exception
> thrown if it is invalid.
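
A minimal sketch of the fail-fast check the description calls for. The helper
class and method are hypothetical and are not the code in HDFS-8068.v1.patch;
the sketch only shows the idea of rejecting an unresolved address at
proxy-creation time instead of letting the retry policy spin.

{code:java}
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class ResolvedAddressCheck {
    // Throw immediately if the address never resolved, so proxy creation
    // fails fast instead of producing a proxy whose calls can never succeed.
    static InetSocketAddress requireResolved(InetSocketAddress addr)
            throws UnknownHostException {
        if (addr.isUnresolved()) {
            throw new UnknownHostException("Cannot resolve " + addr.getHostString());
        }
        return addr;
    }
}
{code}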
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)