[ https://issues.apache.org/jira/browse/HDFS-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gordon Wang updated HDFS-6636:
------------------------------
    Description:
In our test environment, we found that the NameNode cannot handle an incremental block report correctly when the block replica is under construction and the replica is marked as corrupt.

Here is our scenario.
* The block had 3 replicas by default. Because one datanode was down, only 2 replicas were available. Call the live datanodes DN1 and DN2.
* The client tried to append data to the block. During the append, something went wrong with the pipeline. The client then performed pipeline recovery, after which only one datanode, DN1, was in the pipeline.
* For some unknown reason (possibly an IO error), DN2 got a checksum error when receiving block data from DN1, and DN2 reported the replica on DN1 to the NameNode as a bad block. In fact, the client was appending data to the replica on DN1, and that replica was good.
* The NameNode marked the replica on DN1 as corrupt.
* When the client finished appending, DN1 checked the data in the replica and found it OK. DN1 then finalized the replica and reported the block to the NameNode as a received block.
* The NameNode handled the incremental block report from DN1. Because the block was under construction, the NameNode called addStoredBlockUnderConstruction in the block manager. But the replica on DN1 was never removed from the corrupted replica map, so the number of live replicas for the block was 0 and the number of corrupt replicas was 1.
* The client could not complete the file because the number of live replicas for the last block was smaller than the minimal replica number.

> NameNode should remove block replica out from corrupted replica map when adding block under construction
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6636
>                 URL: https://issues.apache.org/jira/browse/HDFS-6636
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.2.0
>            Reporter: Gordon Wang
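
To make the scenario in the description easier to follow, here is a minimal sketch of the bookkeeping involved. It is a toy model, not the real BlockManager code: apart from the method name addStoredBlockUnderConstruction, every class, field, and helper below is hypothetical, and the minimal replica number of 1 is an assumption. The removeFromCorruptMap flag switches between the behavior reported in this issue and the behavior the issue title asks for.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only, not HDFS code. All names except
// addStoredBlockUnderConstruction are hypothetical.
class CorruptReplicaSketch {
    static final int MIN_REPLICATION = 1; // assumed minimal replica number

    // block id -> datanodes that reported a stored replica
    private final Map<String, Set<String>> storedReplicas = new HashMap<>();
    // block id -> datanodes whose replica is marked corrupt
    private final Map<String, Set<String>> corruptReplicas = new HashMap<>();

    // Another datanode reported the replica on `datanode` as a bad block.
    void markCorrupt(String block, String datanode) {
        corruptReplicas.computeIfAbsent(block, k -> new HashSet<>()).add(datanode);
    }

    // Handles a "received block" report for a block under construction.
    // removeFromCorruptMap == false mimics the behavior described in the issue.
    void addStoredBlockUnderConstruction(String block, String datanode,
                                         boolean removeFromCorruptMap) {
        storedReplicas.computeIfAbsent(block, k -> new HashSet<>()).add(datanode);
        if (removeFromCorruptMap) {
            Set<String> corrupt = corruptReplicas.get(block);
            if (corrupt != null) {
                corrupt.remove(datanode);
                if (corrupt.isEmpty()) {
                    corruptReplicas.remove(block);
                }
            }
        }
    }

    int corruptCount(String block) {
        return corruptReplicas.getOrDefault(block, Collections.emptySet()).size();
    }

    // Live replicas are stored replicas that are not marked corrupt.
    int liveCount(String block) {
        Set<String> live =
            new HashSet<>(storedReplicas.getOrDefault(block, Collections.emptySet()));
        live.removeAll(corruptReplicas.getOrDefault(block, Collections.emptySet()));
        return live.size();
    }

    boolean canComplete(String block) {
        return liveCount(block) >= MIN_REPLICATION;
    }

    // Replays the last steps of the scenario: DN2 reports DN1's replica as
    // bad, then DN1 finalizes the replica and reports the block as received.
    static CorruptReplicaSketch runScenario(boolean removeFromCorruptMap) {
        CorruptReplicaSketch nn = new CorruptReplicaSketch();
        nn.markCorrupt("blk_1", "DN1");
        nn.addStoredBlockUnderConstruction("blk_1", "DN1", removeFromCorruptMap);
        return nn;
    }

    public static void main(String[] args) {
        for (boolean fix : new boolean[] {false, true}) {
            CorruptReplicaSketch nn = runScenario(fix);
            System.out.println("removeFromCorruptMap=" + fix
                + " live=" + nn.liveCount("blk_1")
                + " corrupt=" + nn.corruptCount("blk_1")
                + " canComplete=" + nn.canComplete("blk_1"));
        }
    }
}
```

With the flag off, the model ends up with 0 live replicas and 1 corrupt replica, so the file cannot be completed, which matches the counts in the description. With the flag on, the replica on DN1 counts as live again. Whether the real fix should clear the corrupt entry unconditionally or only after some extra validation is not stated in this issue.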