[ 
https://issues.apache.org/jira/browse/HDFS-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978253#comment-16978253
 ] 

Lisheng Sun commented on HDFS-14651:
------------------------------------

[~linyiqun]

When a client read from a datanode fails, there are generally two possible
causes:

1. There is a problem with the datanode itself.

2. There is a problem with the replica on the datanode, but the datanode itself
is healthy.

The client can't distinguish between the two cases.
In the second case, we should not add the datanode to the dead list. Its state
needs to be confirmed by re-probing, and that re-probing requires
higher-priority processing, so we add the node to be re-probed to the
suspicious list. Meanwhile, a datanode in the suspicious list can still be
accessed by other DFSInputStream instances.
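
As a rough illustration of the flow described above, here is a minimal sketch.
It is not code from the attached patches: the class SuspectNodeTracker, its
methods, and the probe predicate are all hypothetical names, and it assumes
that every read failure first lands in the suspicious list and that only a
failed re-probe moves the node to the dead list.

```java
import java.util.ArrayList;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical sketch of a suspicious/dead list; not the patch's DeadNodeDetector. */
class SuspectNodeTracker {
    enum State { HEALTHY, SUSPECT, DEAD }

    private final Set<String> suspectNodes = ConcurrentHashMap.newKeySet();
    private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();
    // Assumed probe: returns true if the datanode answers a liveness check.
    private final Predicate<String> probe;

    SuspectNodeTracker(Predicate<String> probe) {
        this.probe = probe;
    }

    /** A failed read only marks the node as suspicious, never directly as dead. */
    void reportReadFailure(String datanode) {
        if (!deadNodes.contains(datanode)) {
            suspectNodes.add(datanode);
        }
    }

    /** Re-probe every suspicious node; only a failed probe moves it to the dead list. */
    void probeSuspects() {
        for (String datanode : new ArrayList<>(suspectNodes)) {
            if (!probe.test(datanode)) {
                deadNodes.add(datanode);
            }
            suspectNodes.remove(datanode);
        }
    }

    State stateOf(String datanode) {
        if (deadNodes.contains(datanode)) {
            return State.DEAD;
        }
        return suspectNodes.contains(datanode) ? State.SUSPECT : State.HEALTHY;
    }

    /** Other input streams may still read from suspicious nodes, but not from dead ones. */
    boolean isReadable(String datanode) {
        return !deadNodes.contains(datanode);
    }
}
```

In this sketch a node whose replica was bad but which answers the probe goes
back to healthy, while a node that fails the probe ends up in the dead list.
A real implementation would also run the probe asynchronously and schedule
suspicious nodes ahead of the periodic dead-node checks.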

> DeadNodeDetector checks dead node periodically
> ----------------------------------------------
>
>                 Key: HDFS-14651
>                 URL: https://issues.apache.org/jira/browse/HDFS-14651
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14651.001.patch, HDFS-14651.002.patch, 
> HDFS-14651.003.patch, HDFS-14651.004.patch, HDFS-14651.005.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
