[ https://issues.apache.org/jira/browse/HDFS-17048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liuguanghua updated HDFS-17048:
-------------------------------
    Component/s: hdfs

> FSNamesystem.delete() may cause data residue when the active NameNode 
> crashes or shuts down
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-17048
>                 URL: https://issues.apache.org/jira/browse/HDFS-17048
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>         Environment:  
>  
>            Reporter: liuguanghua
>            Priority: Major
>
> Consider the following scenario:
> (1) A user deletes an HDFS directory containing many blocks.
> (2) The active NameNode then crashes, is shut down, or is failed over to 
> the standby NameNode by an administrator.
> (3) This can leave residual block data on the DataNodes.
>  
> FSNamesystem.delete() will:
> (1) delete the directory from the namespace first,
> (2) add toRemovedBlocks to markedDeleteQueue, and
> (3) the MarkedDeleteBlockScrubber thread will then consume 
> markedDeleteQueue and delete the blocks.
> If the active NameNode crashes before step (3) completes, the blocks 
> still in markedDeleteQueue are lost and will never be deleted. Such a 
> block cannot be found via the hdfs fsck command, but it still lives on 
> the DataNode disks.
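>  
> A minimal sketch of this pattern (illustrative only; apart from 
> markedDeleteQueue and MarkedDeleteBlockScrubber the names below are made 
> up, this is not the actual FSNamesystem code) shows why the window 
> exists: the namespace removal is persisted via the edit log, while the 
> pending block list lives only on the NameNode heap.
> {code:java}
> // Illustrative sketch only, not the real FSNamesystem. It mimics the
> // three steps above: durable namespace edit, volatile in-memory queue,
> // background scrubber thread.
> import java.util.List;
> import java.util.concurrent.ConcurrentLinkedQueue;
> 
> class AsyncDeleteSketch {
>   // In-memory only: the contents vanish if the NameNode process dies.
>   private final ConcurrentLinkedQueue<List<Long>> markedDeleteQueue =
>       new ConcurrentLinkedQueue<>();
> 
>   // Steps (1) and (2): the namespace change is logged durably, but the
>   // block list is merely queued on the heap.
>   void delete(String path, List<Long> toRemovedBlocks) {
>     removeFromNamespaceAndLogEdit(path);    // durable (edit log)
>     markedDeleteQueue.add(toRemovedBlocks); // volatile (heap only)
>   }
> 
>   // Step (3): the scrubber thread drains the queue and schedules block
>   // invalidations. A crash between steps (2) and (3) orphans the blocks:
>   // gone from the namespace, still present on DataNode disks.
>   void scrubberLoop() {
>     List<Long> batch;
>     while ((batch = markedDeleteQueue.poll()) != null) {
>       for (long blockId : batch) {
>         scheduleBlockInvalidation(blockId); // hypothetical helper
>       }
>     }
>   }
> 
>   private void removeFromNamespaceAndLogEdit(String path) { /* stub */ }
>   private void scheduleBlockInvalidation(long blockId) { /* stub */ }
> }
> {code}
> Any blocks queued but not yet scrubbed at crash time are referenced by 
> neither the namespace nor the edit log, so no component will ever 
> invalidate them.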
>  
> Thus:
> SummaryA = hdfs dfs -du -s /
> SummaryB = sum of dfsUsed across the DataNode reports
> SummaryA < SummaryB
>  
> This may be unavoidable, but is there any way to find the blocks that 
> should have been deleted and clean them up?
>  


