[ https://issues.apache.org/jira/browse/HDFS-15605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203775#comment-17203775 ]
Jinglun commented on HDFS-15605:
--------------------------------

Hi [~leosun08], would you like to help review this? Thanks!

> DeadNodeDetector supports getting deadnode from NameNode.
> ---------------------------------------------------------
>
>                 Key: HDFS-15605
>                 URL: https://issues.apache.org/jira/browse/HDFS-15605
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-15605.001.patch
>
>
> When DeadNodeDetector is in use, it sometimes marks too many nodes as dead,
> which causes read failures. The DeadNodeDetector treats every datanode whose
> getDatanodeInfo RPC fails to return in time as dead, but that is not always
> the case: a client-side error or a slow RPC in the DataNode can also get a
> node marked as dead. For example, a client-side delay in the rpcThreadPool
> might cause the getDatanodeInfo RPCs to time out, adding many datanodes to
> the dead list.
> We have a simple improvement for this: the NameNode already knows which
> datanodes are dead, so just update the dead list from the NameNode using
> DFSClient.datanodeReport().
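
The idea in the description can be sketched roughly as follows. This is only an illustration of one possible reading, not the code in HDFS-15605.001.patch: the names DeadNodeSource, DeadNodeListSketch, onProbeTimeout and refreshFrom are hypothetical, and the DeadNodeSource interface merely stands in for whatever wraps DFSClient.datanodeReport(). The sketch assumes that a probe timeout only makes a node a suspect, and that the dead list itself is replaced by the NameNode's view.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical stand-in for the NameNode query. In HDFS this role would be
 * played by a wrapper around DFSClient.datanodeReport(); here it just
 * returns datanode identifiers as strings.
 */
interface DeadNodeSource {
  List<String> fetchDeadNodes() throws Exception;
}

/** Illustrative dead-node list that trusts the NameNode over local timeouts. */
class DeadNodeListSketch {
  // Nodes whose probe timed out on the client side; not yet treated as dead.
  private final Set<String> suspects = ConcurrentHashMap.newKeySet();
  // Nodes the NameNode reported as dead in the most recent successful refresh.
  private final Set<String> dead = ConcurrentHashMap.newKeySet();

  /** A timed-out probe only makes the node a suspect (assumption of this sketch). */
  void onProbeTimeout(String node) {
    suspects.add(node);
  }

  /** Replace the dead list with the NameNode's view; keep the old list on failure. */
  void refreshFrom(DeadNodeSource source) {
    List<String> reported;
    try {
      reported = source.fetchDeadNodes();
    } catch (Exception e) {
      return; // the NameNode query failed, so the previous dead list stays in place
    }
    Set<String> confirmed = new HashSet<>(reported);
    dead.retainAll(confirmed);      // drop nodes the NameNode no longer reports as dead
    dead.addAll(confirmed);         // add nodes that are newly reported as dead
    suspects.removeAll(confirmed);  // confirmed nodes no longer need to be suspects
  }

  boolean isDead(String node) {
    return dead.contains(node);
  }

  boolean isSuspect(String node) {
    return suspects.contains(node);
  }
}
```

With this shape, a burst of client-side timeouts (for example from a delayed rpcThreadPool) only grows the suspect set, and reads are not steered away from those datanodes unless the NameNode also reports them as dead.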