[ https://issues.apache.org/jira/browse/HDFS-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996688#comment-13996688 ]

Suresh Srinivas commented on HDFS-6186:
---------------------------------------

+1 for the patch.

We should create another jira to print the time remaining until deletion resumes along with the current pending deletion count displayed on the namenode webUI, something like:
{{Pending Deletion Blocks: xxxx (Block deletion will resume after xxx seconds)}}

> Pause deletion of blocks when the namenode starts up
> ----------------------------------------------------
>
>                 Key: HDFS-6186
>                 URL: https://issues.apache.org/jira/browse/HDFS-6186
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Suresh Srinivas
>            Assignee: Jing Zhao
>         Attachments: HDFS-6186.000.patch, HDFS-6186.002.patch,
>                      HDFS-6186.003.patch
>
>
> The HDFS namenode can delete blocks very quickly, given that deletion happens as a
> parallel operation spread across many datanodes. One of the frequent
> anxieties I see is that a lot of data can be deleted very quickly when a
> cluster is brought up, especially when one of the storage directories has
> failed and namenode metadata was copied from another storage. Copying the wrong
> metadata would result in some of the newer files (if old metadata was
> copied) being deleted along with their blocks.
> HDFS-5986 now captures the number of pending deletion blocks on the namenode webUI
> and JMX. I propose pausing deletion of blocks for a configured period of time
> (default 1 hour?) after the namenode comes out of safemode. This will give enough
> time for the administrator to notice a large number of pending deletion blocks
> and take corrective action.
> Thoughts?
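
To make the proposal concrete, here is a minimal sketch of the pause window and of the webUI line suggested in the comment. It is an illustration only and not the code in the attached patches: the class DeletionPauseSketch, its methods, and the explicit clock arguments are all hypothetical, and the one hour value is just the default floated in the description.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a startup deletion pause; not the HDFS-6186 patch. */
final class DeletionPauseSketch {
    // Wall-clock time (ms) at which block deletion may resume.
    private final long resumeAtMs;

    /**
     * @param safeModeExitMs time (ms) at which the namenode left safemode
     * @param delay          configured pause length (assumed default: 1 hour)
     * @param unit           unit of the pause length
     */
    DeletionPauseSketch(long safeModeExitMs, long delay, TimeUnit unit) {
        if (delay < 0) {
            throw new IllegalArgumentException("delay must be >= 0");
        }
        this.resumeAtMs = safeModeExitMs + unit.toMillis(delay);
    }

    /** True while the pause window is still open. */
    boolean isDeletionPaused(long nowMs) {
        return nowMs < resumeAtMs;
    }

    /** Seconds left in the pause window, rounded up; 0 once it has elapsed. */
    long secondsUntilResume(long nowMs) {
        long remainingMs = Math.max(0L, resumeAtMs - nowMs);
        return (remainingMs + 999L) / 1000L;
    }

    /** Status text in the style suggested for the namenode webUI. */
    String statusLine(long pendingBlocks, long nowMs) {
        String base = "Pending Deletion Blocks: " + pendingBlocks;
        if (!isDeletionPaused(nowMs)) {
            return base;
        }
        return base + " (Block deletion will resume after "
            + secondsUntilResume(nowMs) + " seconds)";
    }

    public static void main(String[] args) {
        DeletionPauseSketch pause = new DeletionPauseSketch(0L, 1, TimeUnit.HOURS);
        // Right after leaving safemode: deletions are held back.
        System.out.println(pause.statusLine(1200, 0L));
        // One hour later: the pause has elapsed.
        System.out.println(pause.statusLine(1200, TimeUnit.HOURS.toMillis(1)));
    }
}
```

Passing the current time in as an argument, rather than reading the system clock inside the class, is only a convenience for testing the sketch. The point is that an administrator watching the pending count has the whole window to stop the namenode before any block is actually removed.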