[ https://issues.apache.org/jira/browse/HDFS-15446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17149645#comment-17149645 ]
Stephen O'Donnell commented on HDFS-15446:
------------------------------------------

I am not sure whether we need to call checkTraverse. The tests all seem to pass without it, so it seems we don't need it in this use case, as permissions etc. should all have been checked before the edit makes it into the edit stream.

However, when the namenode receives a "createSnapshot" call, it eventually calls this code in FSDirSnapshotOp.java:

{code}
static String createSnapshot(
    FSDirectory fsd, FSPermissionChecker pc, SnapshotManager snapshotManager,
    String snapshotRoot, String snapshotName, boolean logRetryCache)
    throws IOException {
  final INodesInPath iip = fsd.resolvePath(pc, snapshotRoot, DirOp.WRITE);
{code}

As you can see, it calls fsd.resolvePath(...), and that is why my original patch changed the edit loading code to call this same method. I think it would be safer to use the same code in this change too, i.e. what was done in the 002 patch.

[~ayushtkn] was concerned about the performance overhead, hence we created the cut-down method in the 003 patch. However, snapshot operations are relatively rare, maybe a few thousand per day in an extreme case, and dropping a large snapshot often costs several seconds of runtime. The cost of getting the IIP should be small compared to that. It also seems the original processing code was calling checkTraverse anyway, so we already had that overhead.

[~ayushtkn] What do you think? Is there any risk from not calling checkTraverse(...)?
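To illustrate why the path has to be resolved before the lookup when the edit is replayed, here is a minimal, self-contained sketch. It is not the HDFS implementation: the class ReservedRawPathSketch, the resolve(...) and directoryExists(...) helpers, and the Set-based toy namespace are all hypothetical, and the only thing taken from this issue is the /.reserved/raw prefix and the /app-logs example path.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HDFS.
public class ReservedRawPathSketch {
    // Reserved prefix as it appears in the paths reported in this issue.
    static final String RAW_PREFIX = "/.reserved/raw";

    // Maps /.reserved/raw/foo to /foo and leaves all other paths unchanged.
    static String resolve(String path) {
        if (path.equals(RAW_PREFIX)) {
            return "/";
        }
        if (path.startsWith(RAW_PREFIX + "/")) {
            return path.substring(RAW_PREFIX.length());
        }
        return path;
    }

    // Toy namespace: a directory exists only if its real path is in the set.
    static boolean directoryExists(Set<String> namespace, String path) {
        return namespace.contains(path);
    }

    public static void main(String[] args) {
        Set<String> namespace = new HashSet<>();
        namespace.add("/app-logs");

        // Path as it was written to the edit log in the reported scenario.
        String logged = "/.reserved/raw/app-logs";

        // Looking up the logged path directly finds nothing.
        System.out.println(directoryExists(namespace, logged));          // prints false
        // Resolving the reserved prefix first finds the real directory.
        System.out.println(directoryExists(namespace, resolve(logged))); // prints true
    }
}
```

In this toy model the direct lookup corresponds to the "Directory does not exist" failure, and resolving first corresponds to going through a resolvePath-style step before the snapshottable root is looked up.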

> CreateSnapshotOp fails during edit log loading for /.reserved/raw/path with
> error java.io.FileNotFoundException: Directory does not exist:
> /.reserved/raw/path
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-15446
>                 URL: https://issues.apache.org/jira/browse/HDFS-15446
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.2.0, 3.3.0
>            Reporter: Srinivasu Majeti
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: reserved-word, snapshot
>         Attachments: HDFS-15446.001.patch, HDFS-15446.002.patch,
> HDFS-15446.003.patch
>
>
> After allowing snapshot creation for a path, say /app-logs, creating a
> snapshot on /.reserved/raw/app-logs succeeds. Later, when the Standby
> Namenode is restarted and tries to load the OP_CREATE_SNAPSHOT edit record,
> loading fails and the Standby Namenode shuts down with the exception
> "java.io.FileNotFoundException: Directory does not exist:
> /.reserved/raw/app-logs".
> Here are the steps to reproduce:
> {code:java}
> # hdfs dfs -ls /.reserved/raw/
> Found 15 items
> drwxrwxrwt   - yarn   hadoop          0 2020-06-29 10:27 /.reserved/raw/app-logs
> drwxr-xr-x   - hive   hadoop          0 2020-06-29 10:29 /.reserved/raw/prod
> ++++++++++++++
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /app-logs
> Allowing snapshot on /app-logs succeeded
> [root@c3230-node2 ~]# hdfs dfsadmin -allowSnapshot /prod
> Allowing snapshot on /prod succeeded
> ++++++++++++++
> # hdfs lsSnapshottableDir
> drwxrwxrwt 0 yarn hadoop 0 2020-06-29 10:27 1 65536 /app-logs
> drwxr-xr-x 0 hive hadoop 0 2020-06-29 10:29 1 65536 /prod
> ++++++++++++++
> [root@c3230-node2 ~]# hdfs dfs -createSnapshot /.reserved/raw/app-logs testSS
> Created snapshot /.reserved/raw/app-logs/.snapshot/testSS
> {code}
> This is the exception we see in the Standby Namenode while loading the
> snapshot creation edit record:
> {code:java}
> 2020-06-29 10:33:25,488 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
> java.io.FileNotFoundException: Directory does not exist: /.reserved/raw/app-logs
>         at org.apache.hadoop.hdfs.server.namenode.INodeDirectory.valueOf(INodeDirectory.java:60)
>         at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.getSnapshottableRoot(SnapshotManager.java:259)
>         at org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.createSnapshot(SnapshotManager.java:307)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:772)
>         at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:257)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org