liuguanghua created HDFS-17048:
----------------------------------

             Summary: FSNamesystem.delete() maybe cause data residue when active namenode crash or shutdown
                 Key: HDFS-17048
                 URL: https://issues.apache.org/jira/browse/HDFS-17048
             Project: Hadoop HDFS
          Issue Type: Bug
         Environment:
            Reporter: liuguanghua


Consider the following scenario:
(1) A user deletes an HDFS directory that contains many blocks.
(2) The active NameNode then crashes, is shut down, or is failed over to the standby NameNode by an administrator.
(3) This may leave residual data on the DataNodes.

FSNamesystem.delete() works as follows:
(1) It deletes the directory first.
(2) It adds toRemovedBlocks to markedDeleteQueue.
(3) The MarkedDeleteBlockScrubber thread consumes markedDeleteQueue and deletes the blocks.

If the active NameNode crashes, the blocks still waiting in markedDeleteQueue are lost and will never be deleted. These blocks cannot be found with the hdfs fsck command, but they still exist on the DataNode disks.

Thus, with
SummaryA = hdfs dfs -du -s /
SummaryB = sum(DFS Used reported by each DataNode)
we get SummaryA < SummaryB.

This may be unavoidable, but is there any tool to find the blocks that should have been deleted and clean them up?
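
To make the question concrete, here is a minimal sketch of the kind of check such a tool could perform. It is not an existing Hadoop utility: the class name ResidualBlockFinder and the method findResidual are hypothetical. It assumes the block IDs known to the NameNode have already been collected (for example by parsing the output of hdfs fsck / -files -blocks) and that the file names under a DataNode data directory have already been listed. It relies only on the on-disk naming convention blk_<blockId> for block files and blk_<blockId>_<genStamp>.meta for their meta files.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: reports block IDs that exist on a
// DataNode disk but are unknown to the NameNode.
final class ResidualBlockFinder {

    // Matches block data files ("blk_<id>") and meta files ("blk_<id>_<genStamp>.meta").
    private static final Pattern BLOCK_FILE =
            Pattern.compile("blk_(-?\\d+)(?:_\\d+\\.meta)?");

    // knownToNameNode: block IDs collected from the NameNode side (assumed input).
    // dataNodeFileNames: plain file names listed from a DataNode data directory.
    static TreeSet<Long> findResidual(Set<Long> knownToNameNode,
                                      List<String> dataNodeFileNames) {
        TreeSet<Long> residual = new TreeSet<>();
        for (String name : dataNodeFileNames) {
            Matcher m = BLOCK_FILE.matcher(name);
            if (!m.matches()) {
                continue; // not a block or meta file, e.g. VERSION
            }
            long blockId = Long.parseLong(m.group(1));
            if (!knownToNameNode.contains(blockId)) {
                residual.add(blockId);
            }
        }
        return residual;
    }

    public static void main(String[] args) {
        Set<Long> known = Set.of(1073741825L, 1073741826L);
        List<String> files = List.of(
                "blk_1073741825", "blk_1073741825_1001.meta",
                "blk_1073741830", "blk_1073741830_1006.meta",
                "VERSION");
        // Prints [1073741830]
        System.out.println(findResidual(known, files));
    }
}
```

A result from such a comparison would only be a list of candidates. On a live cluster, blocks that are still being written or were created after the NameNode snapshot was taken would also show up, so the candidates should be re-verified before anything is removed.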