[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379522#comment-15379522
 ] 

Bob Hansen commented on HDFS-10441:
-----------------------------------

Just a _few_ more questions:
* In RpcConnectionImpl<NextLayer>::OnRecvCompleted, if we detect that we've 
connected to the standby, control falls through to StartReading().  Should it 
bail out at that point instead?
* In RpcEngine::RpcCommsError, we call 
pendingRequests[i]->IncrementFailoverCount().  Should that implicitly reset the 
retry count to 0?  Otherwise, could a request retry until it fails, fail over, 
and then find its retry count already == max_retry?
* If a namenode is down when we try to resolve it, we don't try to resolve it 
again when it's time to fail over, do we?  We should capture that in another 
bug.
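
To make the retry-count question concrete, here is a minimal sketch of what 
"reset on failover" could look like.  RequestCounters and everything in it 
except the name IncrementFailoverCount() are hypothetical stand-ins, not the 
actual libhdfs++ request class:

```cpp
#include <cassert>

// Hypothetical per-request bookkeeping; NOT the real libhdfs++ class.
class RequestCounters {
 public:
  explicit RequestCounters(int max_retries) : max_retries_(max_retries) {
    assert(max_retries >= 0);
  }

  void IncrementRetryCount() { ++retry_count_; }

  // Failing over starts a fresh retry budget against the new namenode,
  // so the request does not arrive there with retries already used up.
  void IncrementFailoverCount() {
    ++failover_count_;
    retry_count_ = 0;
  }

  bool RetriesExhausted() const { return retry_count_ >= max_retries_; }
  int retry_count() const { return retry_count_; }
  int failover_count() const { return failover_count_; }

 private:
  int max_retries_;
  int retry_count_ = 0;
  int failover_count_ = 0;
};
```

With this shape, a request that has burned all of its retries against one 
namenode gets a full set of retries again after the failover, at the cost of 
more total attempts per request.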

For discussion, not necessarily to fix in this patch:
* In FixedDelayWithFailover::ShouldRetry(), should we fail over on any errors 
other than timeout?  Bad route to host?  DNS failure?
* In FixedDelayWithFailover::ShouldRetry(), we always use a delay if 
retries < 3.  This should be configurable.  We can cover that in another bug.
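
As a starting point for that discussion, a rough sketch of a ShouldRetry()-style 
decision with the retry limit and delay pulled into configuration, and with a 
wider set of errors treated as grounds for failover.  All of the types, names, 
and default values below are assumptions made up for illustration; none of them 
come from the patch:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical error and action categories, for illustration only.
enum class CommsError { kTimeout, kNoRouteToHost, kDnsFailure, kOther };
enum class Action { kFail, kRetry, kFailover };

struct RetryDecision {
  Action action;
  std::chrono::milliseconds delay;
};

// Values that would come from configuration instead of being hard-coded.
struct FailoverPolicyConfig {
  int max_retries = 3;
  int max_failovers = 2;
  std::chrono::milliseconds delay{500};
};

// Which errors justify trying the other namenode (an open question above).
inline bool IsFailoverWorthy(CommsError e) {
  return e == CommsError::kTimeout || e == CommsError::kNoRouteToHost ||
         e == CommsError::kDnsFailure;
}

inline RetryDecision ShouldRetry(const FailoverPolicyConfig& cfg,
                                 CommsError error, int retries,
                                 int failovers) {
  assert(cfg.max_retries >= 0 && cfg.max_failovers >= 0);
  // Retry against the same namenode, after the configured delay.
  if (retries < cfg.max_retries) {
    return {Action::kRetry, cfg.delay};
  }
  // Retries are used up: fail over immediately if the error warrants it.
  if (IsFailoverWorthy(error) && failovers < cfg.max_failovers) {
    return {Action::kFailover, std::chrono::milliseconds(0)};
  }
  return {Action::kFail, std::chrono::milliseconds(0)};
}
```

Keeping the list of failover-worthy errors in one predicate means the "which 
errors?" question can be settled later without touching the retry logic.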

> libhdfs++: HA namenode support
> ------------------------------
>
>                 Key: HDFS-10441
>                 URL: https://issues.apache.org/jira/browse/HDFS-10441
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-10441.HDFS-8707.013.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.



