[ 
https://issues.apache.org/jira/browse/HDFS-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551602#comment-13551602
 ] 

Todd Lipcon commented on HDFS-4389:
-----------------------------------

Hey Daryn. I vaguely remember this being a conscious decision at some point, 
but maybe I made that up. Two thoughts that might be relevant:

1) TestPersistBlocks seems to have started failing much more often recently, 
but I don't have evidence for this. Any chance something else might have caused 
a regression here?

2) In the old code, which retried over the restart, wouldn't it end up just 
hitting a SafeModeException and then failing at that point, when the NN was 
restarted? Given that the NN usually takes 30+ seconds to leave safemode after 
starting, any retrying clients would probably hit that and fail anyway, no?
                
> Non-HA DFSClients do not attempt reconnects
> -------------------------------------------
>
>                 Key: HDFS-4389
>                 URL: https://issues.apache.org/jira/browse/HDFS-4389
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, hdfs-client
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Priority: Critical
>
> The HA retry policy implementation appears to have broken non-HA 
> {{DFSClient}} connect retries.  The ipc 
> {{Client.Connection#handleConnectionFailure}} used to perform 45 connection 
> attempts, but now it consults a retry policy.  For non-HA proxies, the policy 
> does not handle {{ConnectException}}.
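
To make the reported behavior concrete, here is a minimal, self-contained sketch of a connect loop that consults a retry policy. It does not use Hadoop's actual classes: the `Policy` interface, `fixedAttempts`, `noConnectRetry`, `call`, and `flakyServer` are all hypothetical names invented for illustration. The only detail taken from the description is the figure of 45 connection attempts. One policy retries {{ConnectException}} up to a fixed number of attempts, while the other never retries it, which is roughly the situation the description attributes to non-HA proxies.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

public class ConnectRetrySketch {
    /** Hypothetical stand-in for a retry policy; NOT Hadoop's real interface. */
    interface Policy {
        boolean shouldRetry(Exception e, int failedAttempts);
    }

    /** Retries ConnectException until maxAttempts total attempts have failed. */
    static Policy fixedAttempts(final int maxAttempts) {
        return (e, failed) -> e instanceof ConnectException && failed < maxAttempts;
    }

    /** Never retries, mimicking a policy that does not handle ConnectException. */
    static Policy noConnectRetry() {
        return (e, failed) -> false;
    }

    /** Runs op, asking the policy after each failure whether to try again. */
    static <T> T call(Callable<T> op, Policy policy) throws Exception {
        int failed = 0;
        while (true) {
            try {
                return op.call();
            } catch (Exception e) {
                failed++;
                if (!policy.shouldRetry(e, failed)) {
                    throw e;
                }
                // A real client would sleep between attempts; omitted here.
            }
        }
    }

    /** Simulated server that refuses the first `refusals` connections. */
    static Callable<String> flakyServer(final int refusals) {
        final int[] calls = {0};
        return () -> {
            if (calls[0]++ < refusals) {
                throw new ConnectException("Connection refused");
            }
            return "connected";
        };
    }

    public static void main(String[] args) throws Exception {
        // With a 45-attempt policy, ten refused connections are ridden out.
        System.out.println(call(flakyServer(10), fixedAttempts(45)));

        // With a policy that ignores ConnectException, the first refusal is fatal.
        try {
            call(flakyServer(10), noConnectRetry());
        } catch (ConnectException e) {
            System.out.println("failed without retry: " + e.getMessage());
        }
    }
}
```

The sketch only covers connection refusal. It says nothing about what happens once the connection succeeds, so a client that reconnects to a freshly restarted NN could still fail for other reasons, such as the safemode case raised in the comment above.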

