[ https://issues.apache.org/jira/browse/HDFS-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716489#comment-17716489 ]
ASF GitHub Bot commented on HDFS-16985:
---------------------------------------

smarthanwang commented on PR #5564:
URL: https://github.com/apache/hadoop/pull/5564#issuecomment-1522670726

   @Hexiaoqiao the failed UT does not seem to be related to this PR; please help check.


> delete local block file when FileNotFoundException occurred may lead to missing block.
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-16985
>                 URL: https://issues.apache.org/jira/browse/HDFS-16985
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Chengwei Wang
>            Assignee: Chengwei Wang
>            Priority: Major
>              Labels: pull-request-available
>
> We encountered several missing-block problems in our production cluster, where
> HDFS runs on AWS EC2 + EBS.
> The root cause:
> # the block has only 1 replica left and has not been reconstructed yet
> # DN checks that the block file exists when constructing BlockSender
> # the check against EBS fails and throws FileNotFoundException (EBS may be in a
> fault condition)
> # DN calls invalidateBlock and schedules asynchronous deletion of the block
> # EBS is already back to normal when DN performs the deletion
> # the block file is deleted permanently and can't be recovered
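
The root-cause sequence above can be replayed as a small, self-contained Java sketch. It is not the DataNode code and not the change proposed in the PR: the class `MissingBlockSketch`, the method `simulate`, the `storageFaulty` flag and the `replicaMap` set are all hypothetical names used only for illustration. The sketch assumes that the transient storage fault makes an intact file look missing, and it contrasts scheduling a file deletion on invalidation with dropping only the in-memory record.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

public class MissingBlockSketch {

    /**
     * Replays the reported sequence on a temporary file (hypothetical model).
     * Returns true if the block file still exists after the deferred work ran.
     */
    static boolean simulate(boolean deleteFileOnInvalidate) throws IOException {
        // Step 1: a single remaining replica, tracked in an in-memory map.
        Path blockFile = Files.createTempFile("blk_", ".data");
        Set<Path> replicaMap = new HashSet<>();
        replicaMap.add(blockFile);

        // Steps 2-3: the storage is in a fault condition, so the existence
        // check fails even though the file is intact on disk.
        boolean storageFaulty = true;
        boolean visible = !storageFaulty && Files.exists(blockFile);

        Runnable deferredDeletion = null;
        if (!visible) {
            // Step 4: invalidate the replica in memory ...
            replicaMap.remove(blockFile);
            if (deleteFileOnInvalidate) {
                // ... and schedule the on-disk deletion for later.
                deferredDeletion = () -> {
                    try {
                        Files.deleteIfExists(blockFile);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                };
            }
        }

        // Step 5: the storage has recovered by the time the deferred task runs,
        // so the deletion now succeeds against the real file.
        if (deferredDeletion != null) {
            deferredDeletion.run();
        }

        // Step 6: check whether the last replica survived, then clean up.
        boolean survived = Files.exists(blockFile);
        Files.deleteIfExists(blockFile);
        return survived;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("delete on invalidate, file survives: " + simulate(true));
        System.out.println("in-memory invalidate only, file survives: " + simulate(false));
    }
}
```

In this model, `simulate(true)` returns `false` (the last replica is gone), while `simulate(false)` returns `true` because the file is left on disk and could be picked up again once the storage is healthy. Whether an in-memory-only invalidation is the right behaviour for a real DataNode is a design question for the PR discussion, not something this sketch settles.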