[ https://issues.apache.org/jira/browse/HDFS-15614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319168#comment-17319168 ]
Shashikant Banerjee edited comment on HDFS-15614 at 4/12/21, 8:46 AM:
----------------------------------------------------------------------

Thanks [~ayushtkn]. The "getAllSnapshottableDirs()" call is not a heavy one in itself, IMO. It does not depend on the number of snapshots present in the system.

{code:java}
1. What if the mkdirs fail? the namenode will crash, ultimately all Namenodes will try this stuff in an attempt to become active and come out of safemode. Hence all the namenodes will crash. Why mkdirs can fail, could be many reasons, I can tell you one which I tried: Namespace Quotas, and yep the namenode crashed. can be bunch of such cases {code}

If mkdir fails to create the Trash directory inside the snapshot root, then strict ordering/processing of all entries during snapshot deletion cannot be guaranteed. If this feature is to be used, .Trash needs to be within the snapshottable directory, which is similar to the case with encryption zones.

{code:java}
2. Secondly, An ambiguity, A client did an allowSnapshot say not from HdfsAdmin he didn't had any Trash directory in the snapshot dir, Suddenly a failover happened, he would get a trash directory in its snapshot directory, Which he never created.{code}

If a new directory is made snapshottable with the feature flag turned on, the .Trash directory gets created implicitly as part of the allowSnapshot call. I don't think there is an ambiguity here.

{code:java}
Third, The time cost, The namenode startup or the namenode failover or let it be coming out of safemode should be fast, They are actually contributing to cluster down time, and here we are doing like first getSnapshottableDirs which itself would be a heavy call if you have a lot of snapshots, then for each directory, one by one we are doing a getFileInfo and then a mkdir, seems like time-consuming. Not sure about the memory consumption at that point due to this though... {code}

I don't think getSnapshottableDirs() is a very heavy call in typical setups. It has nothing to do with the number of snapshots that exist in the system.

{code:java}
Fourth, Why the namenode needs to do a client operation? It is the server. And that too while starting up, This mkdirs from namenode to self is itself suspicious, Bunch of namenode crashing coming up trying to become active, trying to push same edits, Hopefully you would have taken that into account and pretty sure such things won't occur, Namenodes won't collide even in the rarest cases. yep and all safe with the permissions.. {code}

This is important for provisioning snapshot trash so that the ordered snapshot deletion feature can be used when the system already has pre-existing snapshottable directories.
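For illustration, provisioning trash roots for all existing snapshottable directories amounts to a loop like the sketch below. This is a rough sketch using the public {{DistributedFileSystem}} API, not the actual patch (the real change is expected to live server-side in {{FSNamesystem}}); the {{.Trash}} name and the permission used here are assumptions.

{code:java}
// Rough sketch only, not the actual HDFS-15614 change: the real logic is expected to
// live inside FSNamesystem. The public client API is used here just to illustrate the
// idea; the ".Trash" name and the 0700 permission are illustrative assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.SnapshottableDirectoryStatus;

public class ProvisionSnapshotTrashSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // One listing call; its cost depends on the number of snapshottable directories,
    // not on the number of snapshots they contain.
    SnapshottableDirectoryStatus[] dirs = dfs.getSnapshottableDirListing();
    if (dirs == null) {
      return; // no snapshottable directories, nothing to provision
    }
    for (SnapshottableDirectoryStatus dir : dirs) {
      Path trashRoot = new Path(dir.getFullPath(), ".Trash");
      // Skip directories that already have a trash root, so re-running is idempotent.
      if (!dfs.exists(trashRoot)) {
        // If this mkdirs fails (e.g. because of a namespace quota), strict ordering of
        // snapshot deletion cannot be guaranteed for that snapshot root.
        dfs.mkdirs(trashRoot, new FsPermission((short) 0700));
      }
    }
  }
}
{code}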
> Initialize snapshot trash root during NameNode startup if enabled
> -----------------------------------------------------------------
>
>                 Key: HDFS-15614
>                 URL: https://issues.apache.org/jira/browse/HDFS-15614
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Siyao Meng
>            Assignee: Siyao Meng
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> This is a follow-up to HDFS-15607.
> Goal:
> Initialize (create) a snapshot trash root for all existing snapshottable directories if {{dfs.namenode.snapshot.trashroot.enabled}} is set to {{true}}, so admins won't have to run {{dfsadmin -provisionTrash}} manually on all those existing snapshottable directories.
> The change is expected to land in {{FSNamesystem}}.
> Discussion:
> 1. Currently in HDFS-15607, the snapshot trash root creation logic is on the client side. But in order for the NN to create it at startup, the logic must (also) be implemented on the server side -- which is also a requirement of WebHDFS (HDFS-15612).
> 2. Alternatively, we can provide an extra parameter to the {{-provisionTrash}} command, like {{dfsadmin -provisionTrash -all}}, to initialize/provision the trash root on all existing snapshottable dirs.
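For reference, the startup provisioning above would be gated on the configuration key named in this issue. A minimal sketch of that gate follows; only the key {{dfs.namenode.snapshot.trashroot.enabled}} comes from the issue, while the class and method names are illustrative assumptions.

{code:java}
// Minimal sketch of the startup gate; only the configuration key name comes from this
// issue. The class and method names are illustrative assumptions, not the real patch.
import org.apache.hadoop.conf.Configuration;

public class SnapshotTrashStartupGate {
  static boolean shouldProvisionSnapshotTrash(Configuration conf) {
    // Provision .Trash under existing snapshottable directories only when the
    // feature flag is enabled; default to false (feature off).
    return conf.getBoolean("dfs.namenode.snapshot.trashroot.enabled", false);
  }
}
{code}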