[ https://issues.apache.org/jira/browse/HDFS-6908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310628#comment-14310628 ]
Abhishek Rai commented on HDFS-6908:
------------------------------------

Thanks Harsh, that sounds reasonable. It gives us a way to avoid living with the FSImageFormatPBSnapshot hack longer term, once the proper fix for this bug is applied.

Thanks

> incorrect snapshot directory diff generated by snapshot deletion
> ----------------------------------------------------------------
>
>                 Key: HDFS-6908
>                 URL: https://issues.apache.org/jira/browse/HDFS-6908
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Juan Yu
>            Assignee: Juan Yu
>            Priority: Critical
>             Fix For: 2.6.0
>
>         Attachments: HDFS-6908.001.patch, HDFS-6908.002.patch, HDFS-6908.003.patch
>
>
> In the following scenario, deleting a snapshot can generate an incorrect
> snapshot directory diff and a corrupted fsimage. If you restart the NN after
> that, you will get a NullPointerException.
> 1. create a directory and create a file under it
> 2. take a snapshot
> 3. create another file under that directory
> 4. take a second snapshot
> 5. delete both files and the directory
> 6. delete the second snapshot
> An incorrect directory diff will be generated.
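
For reference, the six steps quoted above could be written out as HDFS shell commands roughly as follows. This is only a sketch under stated assumptions: the report does not say which directory is snapshottable, so the snippet assumes snapshots are taken on a parent directory (`/repro` here), and the paths, file names, snapshot names, and the helper `reproCommands` are all hypothetical. The program only builds and prints the command lines; it does not talk to a cluster.

```java
import java.util.ArrayList;
import java.util.List;

class Hdfs6908Repro {
    // Builds the HDFS shell commands for the six steps in the report.
    // Assumption: 'root' is the snapshottable parent of the directory
    // that gets created and deleted; the report does not state this.
    static List<String> reproCommands(String root, String snap1, String snap2) {
        String dir = root + "/dir";      // hypothetical directory name
        String file1 = dir + "/file1";   // hypothetical file names
        String file2 = dir + "/file2";
        List<String> cmds = new ArrayList<>();

        // Setup (not in the report): make the parent snapshottable.
        cmds.add("hdfs dfs -mkdir -p " + root);
        cmds.add("hdfs dfsadmin -allowSnapshot " + root);

        // 1. create a directory and create a file under it
        cmds.add("hdfs dfs -mkdir " + dir);
        cmds.add("hdfs dfs -touchz " + file1);
        // 2. take a snapshot
        cmds.add("hdfs dfs -createSnapshot " + root + " " + snap1);
        // 3. create another file under that directory
        cmds.add("hdfs dfs -touchz " + file2);
        // 4. take a second snapshot
        cmds.add("hdfs dfs -createSnapshot " + root + " " + snap2);
        // 5. delete both files and the directory
        cmds.add("hdfs dfs -rm -skipTrash " + file1 + " " + file2);
        cmds.add("hdfs dfs -rmdir " + dir);
        // 6. delete the second snapshot
        cmds.add("hdfs dfs -deleteSnapshot " + root + " " + snap2);
        return cmds;
    }

    public static void main(String[] args) {
        for (String cmd : reproCommands("/repro", "s1", "s2")) {
            System.out.println(cmd);
        }
    }
}
```

Since the trace in the report fails inside fsimage loading, the bad diff presumably has to be written into an fsimage (by a checkpoint, for example) before the restart hits the NPE; that step is not part of the sketch.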
> Restart NN will throw NPE
> {code}
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.addToDeletedList(FSImageFormatPBSnapshot.java:246)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDeletedList(FSImageFormatPBSnapshot.java:265)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadDirectoryDiffList(FSImageFormatPBSnapshot.java:328)
>     at org.apache.hadoop.hdfs.server.namenode.snapshot.FSImageFormatPBSnapshot$Loader.loadSnapshotDiffSection(FSImageFormatPBSnapshot.java:192)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:254)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:168)
>     at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:208)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:906)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:892)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:715)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:653)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:276)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:882)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:629)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:498)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:554)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)