[ https://issues.apache.org/jira/browse/HDFS-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liuguanghua updated HDFS-17048:
-------------------------------
    Component/s: hdfs

> FSNamesystem.delete() may cause data residue when the active NameNode
> crashes or shuts down
> ---------------------------------------------------------------------
>
>                 Key: HDFS-17048
>                 URL: https://issues.apache.org/jira/browse/HDFS-17048
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>         Environment:
>            Reporter: liuguanghua
>            Priority: Major
>
> Consider the following scenario:
> (1) A user deletes an HDFS directory containing many blocks.
> (2) The active NameNode then crashes, shuts down, or is failed over to
> the standby NameNode by an administrator.
> (3) This may leave residual data behind.
>
> FSNamesystem.delete() will:
> (1) delete the directory first,
> (2) add toRemovedBlocks to markedDeleteQueue,
> (3) let the MarkedDeleteBlockScrubber thread consume markedDeleteQueue
> and delete the blocks.
> If the active NameNode crashes, the blocks still in markedDeleteQueue
> are lost and will never be deleted. These blocks can no longer be found
> via the hdfs fsck command, but they remain on the DataNode disks.
>
> Thus:
> SummaryA = hdfs dfs -du -s /
> SummaryB = sum(dfsUsed reported by each DataNode)
> SummaryA < SummaryB
>
> This may be unavoidable. But is there any way to find the blocks that
> should have been deleted and clean them up?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
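The three-step deletion flow the report describes can be sketched as follows. This is a simplified, hypothetical illustration, not the actual FSNamesystem code: the names markedDeleteQueue and MarkedDeleteBlockScrubber follow the report, while the plain in-memory queue and string "blocks" are assumptions made to show why a crash between step (2) and step (3) orphans replicas on DataNode disks.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class MarkedDeleteSketch {
    // In-memory only: if the NameNode process dies, queued entries vanish.
    static final Queue<String> markedDeleteQueue = new ArrayDeque<>();

    // Step (1) + (2): the namespace entry is removed immediately (and
    // persisted via the edit log), but block invalidation is deferred by
    // enqueueing the blocks, which is NOT persisted.
    static void delete(String dir, String[] toRemovedBlocks) {
        System.out.println("removed from namespace: " + dir);
        for (String b : toRemovedBlocks) {
            markedDeleteQueue.add(b);
        }
    }

    // Step (3): the scrubber thread drains the queue and invalidates
    // the blocks; returns how many it processed.
    static int scrub() {
        int invalidated = 0;
        String b;
        while ((b = markedDeleteQueue.poll()) != null) {
            System.out.println("invalidated block: " + b);
            invalidated++;
        }
        return invalidated;
    }

    public static void main(String[] args) {
        delete("/user/data", new String[] {"blk_1", "blk_2"});
        // If the process crashes HERE, the namespace no longer references
        // blk_1/blk_2 (so fsck cannot see them), yet the scrubber never
        // runs and the DataNodes keep the replicas on disk.
        int n = scrub();
        System.out.println("invalidated=" + n);
    }
}
```

Under this sketch, the SummaryA < SummaryB discrepancy is exactly the set of blocks that were enqueued but never scrubbed before the crash.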