[ https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822410#comment-13822410 ]
Hudson commented on HDFS-5504:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1608 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1608/])
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode. Contributed by Vinay. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1541773)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


> In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold,
> leads to NN safemode.
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5504
>                 URL: https://issues.apache.org/jira/browse/HDFS-5504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>             Fix For: 2.3.0
>
>         Attachments: HDFS-5504.patch, HDFS-5504.patch
>
>
> 1. HA installation, standby NN is down.
> 2. Delete snapshot is called, and it deletes the blocks from the blocksmap and
> from all datanodes. Log sync also happens.
> 3. Before the next log roll, the NN crashes.
> 4. When the namenode restarts, it reads the fsimage and the finalized edits from
> shared storage and sets the safemode threshold, which still includes the blocks
> of the deleted snapshot (the delete-snapshot edits have not been read yet,
> because the namenode restarted before the last edits segment was finalized).
> 5. When it becomes active, it finalizes the edits and reads the delete
> snapshot edit op, but at this point it does not reduce the safemode count,
> so it stays in safemode.
> 6. On the next restart, as the edits are already finalized, it reads them at
> startup and sets the safemode threshold correctly.
> So one more restart will bring the NN out of safemode.
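
The accounting problem in steps 4 to 6 can be illustrated with a small standalone model. This is only a sketch under stated assumptions: SafeModeSketch, its fields, and its methods are hypothetical names invented for illustration, not the actual FSNamesystem or FSEditLogLoader code, and the 0.999 threshold and block counts are example values. It shows that if replaying a snapshot deletion does not lower the expected block total, the reported blocks can never reach the threshold.

```java
/** Hypothetical model of safemode block accounting; not Hadoop's real classes. */
final class SafeModeSketch {
    private long blockTotal;            // blocks the NN expects to be reported
    private long blockSafe;             // blocks reported by datanodes so far
    private final double thresholdPct;  // fraction of blockTotal required (example value)

    SafeModeSketch(long blockTotal, double thresholdPct) {
        this.blockTotal = blockTotal;
        this.thresholdPct = thresholdPct;
    }

    /** Datanodes report n blocks that still exist on disk. */
    void reportBlocks(long n) {
        blockSafe += n;
    }

    /**
     * Replays a delete-snapshot edit that removed removedBlocks blocks.
     * adjustTotal == false models the reported bug (total left unchanged);
     * adjustTotal == true models decrementing the expected total.
     */
    void replayDeleteSnapshot(long removedBlocks, boolean adjustTotal) {
        if (adjustTotal) {
            blockTotal -= removedBlocks;
        }
    }

    /** True once enough of the expected blocks have been reported. */
    boolean canLeaveSafeMode() {
        return blockSafe >= Math.ceil(blockTotal * thresholdPct);
    }

    public static void main(String[] args) {
        // Threshold was computed from the fsimage: 1000 blocks, of which
        // 200 belonged to the deleted snapshot and are gone from the datanodes.
        SafeModeSketch buggy = new SafeModeSketch(1000, 0.999);
        buggy.reportBlocks(800);
        buggy.replayDeleteSnapshot(200, false);
        System.out.println("without adjustment: " + buggy.canLeaveSafeMode());

        SafeModeSketch fixed = new SafeModeSketch(1000, 0.999);
        fixed.reportBlocks(800);
        fixed.replayDeleteSnapshot(200, true);
        System.out.println("with adjustment: " + fixed.canLeaveSafeMode());
    }
}
```

In this model the first instance stays in safemode indefinitely, which corresponds to step 5, while the second leaves it as soon as the edit is replayed.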