[ 
https://issues.apache.org/jira/browse/HDFS-15820?focusedWorklogId=547784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547784
 ]

ASF GitHub Bot logged work on HDFS-15820:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Feb/21 19:02
            Start Date: 04/Feb/21 19:02
    Worklog Time Spent: 10m 
      Work Description: smengcl commented on a change in pull request #2682:
URL: https://github.com/apache/hadoop/pull/2682#discussion_r570461526



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##########
@@ -8531,25 +8527,37 @@ void checkAccess(String src, FsAction mode) throws 
IOException {
    * Check if snapshot roots are created for all existing snapshottable
    * directories. Create them if not.
    */
-  void checkAndProvisionSnapshotTrashRoots() throws IOException {
-    SnapshottableDirectoryStatus[] dirStatusList = 
getSnapshottableDirListing();
-    if (dirStatusList == null) {
-      return;
-    }
-    for (SnapshottableDirectoryStatus dirStatus : dirStatusList) {
-      String currDir = dirStatus.getFullPath().toString();
-      if (!currDir.endsWith(Path.SEPARATOR)) {
-        currDir += Path.SEPARATOR;
-      }
-      String trashPath = currDir + FileSystem.TRASH_PREFIX;
-      HdfsFileStatus fileStatus = getFileInfo(trashPath, false, false, false);
-      if (fileStatus == null) {
-        LOG.info("Trash doesn't exist for snapshottable directory {}. "
-            + "Creating trash at {}", currDir, trashPath);
-        PermissionStatus permissionStatus = new 
PermissionStatus(getRemoteUser()
-            .getShortUserName(), null, SHARED_TRASH_PERMISSION);
-        mkdirs(trashPath, permissionStatus, false);
+  @Override
+  public void checkAndProvisionSnapshotTrashRoots() {
+    if (isSnapshotTrashRootEnabled) {
+      try {
+        SnapshottableDirectoryStatus[] dirStatusList =
+            getSnapshottableDirListing();
+        if (dirStatusList == null) {
+          return;
+        }
+        for (SnapshottableDirectoryStatus dirStatus : dirStatusList) {
+          String currDir = dirStatus.getFullPath().toString();
+          if (!currDir.endsWith(Path.SEPARATOR)) {
+            currDir += Path.SEPARATOR;
+          }
+          String trashPath = currDir + FileSystem.TRASH_PREFIX;
+          HdfsFileStatus fileStatus = getFileInfo(trashPath, false, false, 
false);
+          if (fileStatus == null) {
+            LOG.info("Trash doesn't exist for snapshottable directory {}. " + 
"Creating trash at {}", currDir, trashPath);
+            PermissionStatus permissionStatus =
+                new PermissionStatus(getRemoteUser().getShortUserName(), null,
+                    SHARED_TRASH_PERMISSION);
+            mkdirs(trashPath, permissionStatus, false);
+          }
+        }
+      } catch (IOException e) {
+        final String msg =
+            "Could not provision Trash directory for existing "
+                + "snapshottable directories. Exiting Namenode.";
+        ExitUtil.terminate(1, msg);

Review comment:
       Pro: Terminating NN in this case is a sure good way of uncovering an 
unexpected problems instead of hiding it in the logs.
   
   Con: I wonder if we really should terminate NN when Trash directory fails to 
be deployed. We could just throw a warning message?
   
   Either way, I'm fine with both. Just a thought.

##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
##########
@@ -2524,7 +2524,7 @@ public void testNameNodeCreateSnapshotTrashRootOnStartup()
     MiniDFSCluster cluster =
         new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
     try {
-      final DistributedFileSystem dfs = cluster.getFileSystem();
+     DistributedFileSystem dfs = cluster.getFileSystem();

Review comment:
       nit: add one more space before this line for alignment.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 547784)
    Time Spent: 20m  (was: 10m)

> Ensure snapshot root trash provisioning happens only post safe mode exit
> ------------------------------------------------------------------------
>
>                 Key: HDFS-15820
>                 URL: https://issues.apache.org/jira/browse/HDFS-15820
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, on namenode startup, snapshot trash root provisioning starts as 
> along with trash emptier service but namenode might not be out of safe mode 
> by then. This can fail the snapshot trash dir creation thereby crashing the 
> namenode. The idea here is to trigger snapshot trash provisioning only post 
> safe mode exit.
> {code:java}
> 2021-02-04 11:23:47,323 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring 
> NN shutdown. Shutting down immediately.
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create 
> directory /upgrade/.Trash. Name node is in safe mode.
> The reported blocks 0 needs additional 1383 blocks to reach the threshold 
> 0.9990 of total blocks 1385.
> The number of live datanodes 0 needs an additional 1 live datanodes to reach 
> the minimum number 1.
> Safe mode will be turned off automatically once the thresholds have been 
> reached. NamenodeHostName:quasar-brabeg-5.quasar-brabeg.root.hwx.site
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1542)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1529)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3288)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAndProvisionSnapshotTrashRoots(FSNamesystem.java:8269)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1939)
>         at 
> org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:967)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:936)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1673)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1740)
> 2021-02-04 11:23:47,334 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot 
> create directory /upgrade/.Trash. Name node is in safe mode.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to