Inigo Goiri created HDFS-9910:
---------------------------------
Summary: Datanode heartbeats can get blocked by disk in {{FsDatasetImpl#checkBlock()}}
Key: HDFS-9910
URL: https://issues.apache.org/jira/browse/HDFS-9910
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Affects Versions: 2.7.2
Reporter: Inigo Goiri
Assignee: Hua Liu
When a DataNode needs to transfer a block, it first validates the block on the
heartbeat thread by calling the {{checkBlock()}} method of {{FsDatasetImpl}},
which checks whether the block exists and gets the block length. If the block
is valid, the DataNode then spawns a separate thread to do the actual block
transfer. We found that under heavy disk I/O the heartbeat thread can hang in
{{replicaInfo.getBlockFile().exists()}} for more than 10 minutes, during which
no heartbeats are sent.
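
To illustrate one possible way of keeping a slow disk from stalling the calling thread, here is a minimal, self-contained Java sketch that runs the {{File#exists()}} probe on a helper thread and waits for it with a timeout. This is not the HDFS code and not necessarily how this issue will be fixed; {{BoundedBlockCheck}}, {{DISK_CHECK_POOL}}, {{Result}} and {{checkExists()}} are hypothetical names made up for the example, and only standard {{java.util.concurrent}} and {{java.io.File}} calls are used.

```java
import java.io.File;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical example class; nothing here is part of the HDFS code base.
public class BoundedBlockCheck {

    // Helper pool for disk probes (assumption for this sketch). Daemon
    // threads so that a probe stuck on disk cannot keep the JVM alive.
    private static final ExecutorService DISK_CHECK_POOL =
        Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "disk-check");
            t.setDaemon(true);
            return t;
        });

    public enum Result { EXISTS, MISSING, TIMED_OUT }

    // Runs blockFile.exists() on a helper thread and waits at most
    // timeoutMs for the answer, so the caller never blocks longer than that.
    public static Result checkExists(File blockFile, long timeoutMs)
            throws InterruptedException {
        Callable<Boolean> probe = blockFile::exists;
        Future<Boolean> pending = DISK_CHECK_POOL.submit(probe);
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS)
                ? Result.EXISTS : Result.MISSING;
        } catch (TimeoutException e) {
            // Best effort only: a thread blocked in a file system call may
            // not react to interruption and can stay stuck until the disk
            // responds.
            pending.cancel(true);
            return Result.TIMED_OUT;
        } catch (ExecutionException e) {
            throw new IllegalStateException("disk probe failed", e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        File present = File.createTempFile("blk_", ".data");
        present.deleteOnExit();
        File absent = new File(present.getPath() + ".missing");

        System.out.println(present.getName() + ": " + checkExists(present, 5000));
        System.out.println(absent.getName() + ": " + checkExists(absent, 5000));
    }
}
```

In this sketch the caller gets {{TIMED_OUT}} back instead of hanging, and could then skip or defer the transfer. The trade-off is that the helper thread itself may remain blocked on the disk, so a real change would also need to bound how many such probes can pile up.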