[ https://issues.apache.org/jira/browse/HDFS-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427063#comment-13427063 ]
ji...@taobao.com commented on HDFS-3729:
----------------------------------------

cdh3u3 and below

> when datanode is blocked in BlockReceiver.receiveBlock(...) by disk error/pressure, DFSClient is blocked and no timeout mechanism
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3729
>                 URL: https://issues.apache.org/jira/browse/HDFS-3729
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>            Reporter: ji...@taobao.com
>
> The Hadoop/HBase cluster at taobao.com is sometimes blocked by DFSClient.
> The cause is a disk error or excessive load on the datanode. Because
> HEART_BEAT in PacketResponder remains normal, DFSClient waits indefinitely
> until the disk read/write method returns. I searched JIRA and found no
> existing issue for this, so we plan to work on a fix. The initial idea is
> to add a timeout mechanism to the DFSClient write function. Does anyone
> have comments on this?
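
To make the proposed timeout idea concrete, here is a minimal sketch of bounding a blocking call with a deadline. It is not taken from DFSClient or from any patch on this issue: the class name WriteTimeoutSketch and the method runWithTimeout are hypothetical, and the stalled disk write is simulated with Thread.sleep. It uses only java.util.concurrent.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not part of Hadoop or DFSClient.
public class WriteTimeoutSketch {

    /**
     * Runs a blocking operation on a worker thread and waits at most
     * timeoutMs for it. Returns true if the operation finished in time,
     * false if the deadline passed first.
     */
    public static boolean runWithTimeout(Runnable blockingOp, long timeoutMs)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(blockingOp);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Ask the worker to stop. This only interrupts the thread;
            // an operation stuck in uninterruptible I/O may not react.
            future.cancel(true);
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a write that stalls on a slow or failing disk.
        Runnable stalledWrite = () -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        boolean finished = runWithTimeout(stalledWrite, 200);
        System.out.println(finished ? "write completed" : "write timed out");
    }
}
```

The caller gets control back after the deadline even though the simulated write has not returned, which is the behaviour the report asks for. A real fix would also have to decide what the client does next (for example, treat the datanode as failed and rebuild the pipeline); that part is outside this sketch.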