[ 
https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188501#comment-15188501
 ] 

Inigo Goiri commented on HDFS-9901:
-----------------------------------

The first version of the patch:
# Makes {{DF}} asynchronous when monitoring the disk: a dedicated thread checks 
the disk and updates the disk status periodically, and {{FsVolumeImpl}} reads 
the values collected asynchronously.
# Moves the checks in {{transferBlock()}} in {{DataNode}} (which require disk 
accesses) into a separate thread, so the heartbeat thread does not have to 
wait for them.
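
As a rough illustration of the two changes above (this is not the code in the 
patch), a minimal Java sketch might look like the following. The class names 
{{CachedDiskUsage}} and {{BlockTransferSketch}} are hypothetical, the 
{{java.io.File}} space queries only stand in for whatever {{DF}} actually 
runs, and the real {{transferBlock()}} signature is not reproduced here.

```java
import java.io.File;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for an asynchronous DF: a background thread refreshes cached values. */
class CachedDiskUsage implements AutoCloseable {
  private final File dir;
  private final ScheduledExecutorService refresher;
  private volatile long capacity;   // last sampled total bytes
  private volatile long available;  // last sampled usable bytes

  CachedDiskUsage(File dir, long intervalMs) {
    this.dir = dir;
    refresh(); // sample once up front so readers never see uninitialized values
    this.refresher = Executors.newSingleThreadScheduledExecutor(daemon("disk-usage-refresher"));
    this.refresher.scheduleWithFixedDelay(this::refresh, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
  }

  /** Builds daemon threads so the sketch never keeps the JVM alive. */
  static ThreadFactory daemon(String name) {
    return r -> {
      Thread t = new Thread(r, name);
      t.setDaemon(true);
      return t;
    };
  }

  /** The only place that touches the disk; runs on the refresher thread after construction. */
  private void refresh() {
    capacity = dir.getTotalSpace();
    available = dir.getUsableSpace();
  }

  // Readers (FsVolumeImpl in the comment above) only read cached fields and never block on IO.
  long getCapacity() { return capacity; }
  long getAvailable() { return available; }

  @Override public void close() { refresher.shutdownNow(); }
}

/** Hypothetical stand-in for the transfer path: validation runs off the caller's thread. */
class BlockTransferSketch implements AutoCloseable {
  private final ExecutorService worker =
      Executors.newSingleThreadExecutor(CachedDiskUsage.daemon("block-transfer-checker"));

  /** Returns immediately; the existence and length checks happen on the worker thread. */
  CompletableFuture<Boolean> transferBlock(File blockFile, long expectedLength) {
    return CompletableFuture.supplyAsync(() -> {
      boolean valid = blockFile.isFile() && blockFile.length() == expectedLength;
      // A real implementation would start the actual transfer here when valid.
      return valid;
    }, worker);
  }

  @Override public void close() { worker.shutdown(); }
}
```

In this sketch the caller (the heartbeat thread in the scenario described) 
only reads volatile fields or enqueues a task, so slow disk IO delays the 
background threads rather than the heartbeat itself.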

> Move block validation out of the heartbeat thread
> -------------------------------------------------
>
>                 Key: HDFS-9901
>                 URL: https://issues.apache.org/jira/browse/HDFS-9901
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Hua Liu
>            Assignee: Hua Liu
>         Attachments: 
> 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch
>
>
> During heavy disk IO, we noticed that the heartbeat thread hangs in the 
> checkBlock method, which checks the existence and length of a block before 
> spinning off a thread to do the actual transfer. In extreme cases, the 
> heartbeat thread hung for more than 10 minutes, so the namenode marked the 
> datanode as dead and started replicating its blocks, which caused more disk 
> IO on other nodes and could potentially bring them down.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
