[ https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135024#comment-17135024 ]

Yiqun Lin commented on HDFS-15346:
----------------------------------

[~LiJinglun], thanks for addressing remaining comments.

These last two days I have been trying to improve the efficiency of the unit 
test; the current unit test is too slow.

I found another way so that we don't have to depend on the mini YARN cluster 
when running the test. The job can be submitted and executed via 
LocalJobRunner, but we need to adjust how the job status is obtained from the 
job client.
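
For illustration only, here is a minimal sketch of how a test can point MapReduce at LocalJobRunner by setting the standard framework property explicitly; the surrounding setup is assumed, not taken from the patch.
{code:java}
// Sketch: run the distcp job in-process with LocalJobRunner instead of a
// MiniMRYarnCluster. "mapreduce.framework.name" is the standard MapReduce
// property; "local" makes job submission use LocalJobRunner in the same JVM.
Configuration conf = new Configuration();
conf.set("mapreduce.framework.name", "local");
// Any Job built from this conf is executed locally, so no YARN
// ResourceManager / NodeManager needs to be started for the test.
{code}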

I did some refactoring of the getCurrentJob method and applied it in DistCpProcedure.

Following is part of the necessary change we need to make.
{noformat}
  @VisibleForTesting
  private Job runningJob;
  static boolean ENABLED_FOR_TEST = false;
...
  private String submitDistCpJob(String srcParam, String dstParam,
      boolean useSnapshotDiff) throws IOException {
    ...
    try {
      LOG.info("Submit distcp job={}", job);
      runningJob = job;   // <--- need to set the running job here
      return job.getJobID().toString();
    } catch (Exception e) {
      throw new IOException("Submit job failed.", e);
    }
  }

  private RunningJobStatus getCurrentJob() throws IOException {
    if (jobId != null) {
      if (ENABLED_FOR_TEST) {
        if (this.runningJob != null) {
          Job latestJob = null;
          try {
            latestJob = this.runningJob.getCluster()
                .getJob(JobID.forName(jobId));
          } catch (InterruptedException e) {
            throw new IOException(e);
          }
          return latestJob == null ? null
              : new RunningJobStatus(latestJob, null);
        }
      } else {
        RunningJob latestJob = client.getJob(JobID.forName(jobId));
        return latestJob == null ? null :
          new RunningJobStatus(null, latestJob);
      }
    }
    return null;
  }

  class RunningJobStatus {
    Job testJob;
    RunningJob job;

    public RunningJobStatus(Job testJob, RunningJob job) {
      this.testJob = testJob;
      this.job = job;
    }

    String getJobID() {
      return ENABLED_FOR_TEST ? testJob.getJobID().toString()
          : job.getID().toString();
    }

    boolean isComplete() throws IOException {
      return ENABLED_FOR_TEST ? testJob.isComplete() : job.isComplete();
    }

    boolean isSuccessful() throws IOException {
      return ENABLED_FOR_TEST ? testJob.isSuccessful() : job.isSuccessful();
    }

    String getFailureInfo() throws IOException {
      try {
        return ENABLED_FOR_TEST ? testJob.getStatus().getFailureInfo()
            : job.getFailureInfo();
      } catch (InterruptedException e) {
        throw new IOException(e);
      }
    }
  }
{noformat}
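For context, here is a rough caller-side sketch (not from the patch; everything other than getCurrentJob() and RunningJobStatus is illustrative) of how the procedure can consume the wrapped status while polling the distcp job:
{code:java}
// Hypothetical polling logic built on the refactored getCurrentJob().
RunningJobStatus status = getCurrentJob();
if (status != null && status.isComplete()) {
  if (status.isSuccessful()) {
    LOG.info("Distcp job {} finished successfully.", status.getJobID());
  } else {
    throw new IOException("Distcp job " + status.getJobID() + " failed: "
        + status.getFailureInfo());
  }
}
{code}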
And the mini YARN cluster related code can all be removed (including the two 
pom dependencies mentioned above):
{code:java}
+    mrCluster = new MiniMRYarnCluster(TestDistCpProcedure.class.getName(), 3);
+    conf.set(MRJobConfig.MR_AM_STAGING_DIR, "/apps_staging_dir");
+    mrCluster.init(conf);
+    mrCluster.start();
+    conf = mrCluster.getConfig();
{code}
We additionally need to set the test-enabled flag.
{code:java}
 public static void beforeClass() throws IOException {
    DistCpProcedure.ENABLED_FOR_TEST = true;
...
}
{code}
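Roughly, the whole setup could then look like the sketch below; the MiniDFSCluster part is only my assumption of what the test already does, the point being that a plain Configuration plus the flag is enough:
{code:java}
@BeforeClass
public static void beforeClass() throws IOException {
  // Let DistCpProcedure fetch job status through the Job handle kept for tests.
  DistCpProcedure.ENABLED_FOR_TEST = true;
  conf = new Configuration();
  // Route MapReduce job submission to LocalJobRunner (no YARN cluster needed).
  conf.set("mapreduce.framework.name", "local");
  // Assumed existing HDFS setup of the test, kept here only for context.
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
  cluster.waitActive();
}
{code}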
After this improvement, the whole test runs much faster than before; in total 
it takes less than 1 minute.

Also, I caught some places that still need to be updated.
 # Can you update the following description in the router option? I mentioned 
updating this content before as well, but it seems it was not addressed in the 
latest patch.
{noformat}
It will disable read and write by cancelling all permissions of the source 
path. The default value  is `false`."
{noformat}

 # The method name cleanUpBeforeInitDistcp can be renamed to 
pathCheckBeforeInitDistcp since we don't do any cleanup operation there now.

> RBF: DistCpFedBalance implementation
> ------------------------------------
>
>                 Key: HDFS-15346
>                 URL: https://issues.apache.org/jira/browse/HDFS-15346
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-15346.001.patch, HDFS-15346.002.patch, 
> HDFS-15346.003.patch, HDFS-15346.004.patch, HDFS-15346.005.patch, 
> HDFS-15346.006.patch, HDFS-15346.007.patch, HDFS-15346.008.patch, 
> HDFS-15346.009.patch, HDFS-15346.010.patch
>
>
> Patch in HDFS-15294 is too big to review so we split it into 2 patches. This 
> is the second one. Detail can be found at HDFS-15294.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
