[ https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127489#comment-17127489 ]

Yiqun Lin edited comment on HDFS-15346 at 6/7/20, 4:35 AM:
-----------------------------------------------------------

Some more detailed review comments:

*HdfsConstants.java*
 Can we rename DOT_SNAPSHOT_SEPARATOR_DIR to the more readable name 
DOT_SNAPSHOT_DIR_SEPARATOR?

*DistCpFedBalance.java*
 # It would be good to print the fed balance context created from the input 
options, so that we know the final options that were passed in.
{noformat}
+.     // -->  print fed balancer context
+      // Construct the balance job.
+      BalanceJob.Builder<BalanceProcedure> builder = new 
BalanceJob.Builder<>();
+      DistCpProcedure dcp =
+          new DistCpProcedure(DISTCP_PROCEDURE, null, delayDuration, context);
+      builder.nextProcedure(dcp);
{noformat}

 # We can replace this System.out with the LOG instance,
{noformat}
+        for (BalanceJob job : jobs) {
+          if (!job.isJobDone()) {
+            unfinished++;
+          }
+          System.out.println(job);
+        }
{noformat}

*DistCpProcedure.java*
 # The message in IOException(src + " doesn't exist.") is not described 
correctly; it should be 'src + " should be the directory."'
 # For each stage change, can we add an additional output log, like this:
{noformat}
+    if (srcFs.exists(new Path(src, HdfsConstants.DOT_SNAPSHOT_DIR))) {
+      throw new IOException(src + " shouldn't enable snapshot.");
+    }
+      LOG.info("Stage updated from {} to {}.", stage.name(),
+          Stage.INIT_DISTCP.name());
+    stage = Stage.INIT_DISTCP;
+  }
{noformat}

 # Here we reset the permission to 0, which means no operation at all is 
allowed? Is this expected? Why not 400 (only allow read)? The comment saying 
'cancelling the x permission of the source path.' confuses me.
{noformat}
srcFs.setPermission(src, FsPermission.createImmutable((short) 0));
{noformat}
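To make the 0-vs-400 question above concrete, here is a hypothetical standalone sketch (not Hadoop's actual FsPermission class) that decodes the two short mode values being discussed into rwx strings:

```java
// Hypothetical sketch: render a 9-bit POSIX mode as an rwx string, to show
// what (short) 0 versus (short) 0400 would actually allow on the source path.
public class ModeBits {
    static final char[] SYMBOLS = {'r', 'w', 'x'};

    // Render a 9-bit mode (e.g. 0400) as a string like "r--------".
    static String render(short mode) {
        StringBuilder sb = new StringBuilder();
        for (int i = 8; i >= 0; i--) {
            boolean set = ((mode >> i) & 1) == 1;
            sb.append(set ? SYMBOLS[(8 - i) % 3] : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render((short) 0));     // ---------  nothing allowed
        System.out.println(render((short) 0400));  // r--------  owner read only
    }
}
```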

 # I prefer throwing an IOException rather than doing the delete operation in 
cleanUpBeforeInitDistcp. cleanUpBeforeInitDistcp is expected to be the final 
pre-check function before submitting the distcp job. Let admin users check and 
do the delete operation manually themselves.
{noformat}
+  private void initialCheckBeforeInitDistcp() throws IOException {
+    if (dstFs.exists(dst)) {
+      throw new IOException(....);
+    }
+    srcFs.allowSnapshot(src);
+    if (srcFs.exists(new Path(src,
+        HdfsConstants.DOT_SNAPSHOT_SEPARATOR_DIR + CURRENT_SNAPSHOT_NAME))) {
+      throw new IOException(....);
+    }
{noformat}

*FedBalanceConfigs.java*
 Can we move all the keys from BalanceProcedureConfigKeys into this class? We 
don't need two duplicated config classes. As a follow-up task, I am thinking we 
can have a separate config file named something like fedbalance-default.xml for 
the fedbalance tool, like distcp-default.xml for the distcp tool now. I'd 
prefer not to add all the tool config settings into hdfs-default.xml.

*FedBalanceContext.java*
 Override the toString method in FedBalanceContext to help us know the input 
options that are actually used.
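A minimal sketch of the suggested override; the class shape and field names below are assumptions for illustration, not the actual FedBalanceContext fields:

```java
// Hypothetical simplified stand-in for FedBalanceContext, showing the kind
// of toString override suggested above so the effective options get logged.
public class FedBalanceContextSketch {
    private final String src;                 // assumed field: source path
    private final String dst;                 // assumed field: destination path
    private final boolean forceCloseOpenFiles; // assumed field: an input option

    public FedBalanceContextSketch(String src, String dst, boolean force) {
        this.src = src;
        this.dst = dst;
        this.forceCloseOpenFiles = force;
    }

    @Override
    public String toString() {
        return "FedBalanceContext{src=" + src + ", dst=" + dst
            + ", forceCloseOpenFiles=" + forceCloseOpenFiles + "}";
    }
}
```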

*MountTableProcedure.java*
 The for loop can just break once we find the first source path that matches.
{noformat}
+    for (MountTable result : results) {
+      if (mount.equals(result.getSourcePath())) {
+          existingEntry = result;
           break;   
+      }
+    }
{noformat}
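As a standalone illustration of the early-break lookup suggested above (Entry here is a hypothetical stand-in for MountTable and its getSourcePath accessor):

```java
import java.util.List;

// Hypothetical sketch of the first-match lookup: stop scanning as soon as
// the mount's source path is found, instead of iterating the whole list.
public class FirstMatch {
    static class Entry {
        final String sourcePath;
        Entry(String sourcePath) { this.sourcePath = sourcePath; }
    }

    // Return the first entry whose source path equals mount, or null.
    static Entry find(List<Entry> results, String mount) {
        for (Entry result : results) {
            if (mount.equals(result.sourcePath)) {
                return result; // break out on the first match
            }
        }
        return null;
    }
}
```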
*TrashProcedure.java*
{noformat}
+  /**
+   * Delete source path to trash.
+   */
+  void moveToTrash() throws IOException {
+    Path src = context.getSrc();
+    if (srcFs.exists(src)) {
+      switch (context.getTrashOpt()) {
+      case TRASH:
+        conf.setFloat(FS_TRASH_INTERVAL_KEY, 1);
+        if (!Trash.moveToAppropriateTrash(srcFs, src, conf)) {
+          throw new IOException("Failed move " + src + " to trash.");
+        }
+        break;
+      case DELETE:
+        if (!srcFs.delete(src, true)) {
+          throw new IOException("Failed delete " + src);
+        }
+        LOG.info("{} is deleted.", src);
+        break;
+      default:
+        break;
+      }
+    }
+  }
{noformat}
For the above lines, two review comments:
 # Can we add a SKIP option check as well and throw an unexpected-option error?
{noformat}
        case SKIP:
        break;
+      default:
+      throw new IOException("Unexpected trash option=" + 
context.getTrashOpt());
+      }
{noformat}
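The exhaustive switch suggested above can be sketched standalone like this (TrashOption mirrors the tool's TRASH/DELETE/SKIP choices; the returned strings are placeholders for the real actions):

```java
// Hypothetical sketch of the trash-option switch: every known option is
// handled explicitly, and the default branch throws instead of silently
// falling through.
public class TrashSwitch {
    enum TrashOption { TRASH, DELETE, SKIP }

    // Returns the action taken, throwing on an unexpected option.
    static String handle(TrashOption opt) throws java.io.IOException {
        switch (opt) {
        case TRASH:
            return "moved to trash";
        case DELETE:
            return "deleted";
        case SKIP:
            return "skipped"; // explicit no-op, not an error
        default:
            throw new java.io.IOException("Unexpected trash option=" + opt);
        }
    }
}
```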

 # FS_TRASH_INTERVAL_KEY set to 1 is too small; it means the trash will be 
deleted after 1 minute. Can you increase this to 60? Also please add a 
necessary comment in the trash option description explaining the default trash 
behavior: when trash is disabled on the server side, the client-side value 
will be used.



> RBF: DistCpFedBalance implementation
> ------------------------------------
>
>                 Key: HDFS-15346
>                 URL: https://issues.apache.org/jira/browse/HDFS-15346
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-15346.001.patch, HDFS-15346.002.patch, 
> HDFS-15346.003.patch, HDFS-15346.004.patch, HDFS-15346.005.patch, 
> HDFS-15346.006.patch, HDFS-15346.007.patch
>
>
> Patch in HDFS-15294 is too big to review so we split it into 2 patches. This 
> is the second one. Detail can be found at HDFS-15294.


