[ https://issues.apache.org/jira/browse/HDFS-15346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135024#comment-17135024 ]
Yiqun Lin commented on HDFS-15346:
----------------------------------

[~LiJinglun], thanks for addressing the remaining comments.

Over the last two days I have been trying to make the unit test more efficient, because the current unit test is too slow. I found a way to run the test without depending on the mini YARN cluster: the job can be submitted and executed through LocalJobRunner. To do that, we need to adjust how the job status is fetched from the job client. I refactored the getCurrent method and applied the change in DistCpProcedure. The following are some of the necessary changes.

{noformat}
  @VisibleForTesting
  private Job runningJob;
  static boolean ENABLED_FOR_TEST = false;
  ...

  private String submitDistCpJob(String srcParam, String dstParam,
      boolean useSnapshotDiff) throws IOException {
    ...
    try {
      LOG.info("Submit distcp job={}", job);
      runningJob = job;  // <--- need to set it here
      return job.getJobID().toString();
    } catch (Exception e) {
      throw new IOException("Submit job failed.", e);
    }
  }

  private RunningJobStatus getCurrentJob() throws IOException {
    if (jobId != null) {
      if (ENABLED_FOR_TEST) {
        // Test mode: look the job up through the submitted Job's cluster.
        if (this.runningJob != null) {
          Job latestJob = null;
          try {
            latestJob = this.runningJob.getCluster()
                .getJob(JobID.forName(jobId));
          } catch (InterruptedException e) {
            throw new IOException(e);
          }
          return latestJob == null ? null
              : new RunningJobStatus(latestJob, null);
        }
      } else {
        // Normal mode: look the job up through the job client.
        RunningJob latestJob = client.getJob(JobID.forName(jobId));
        return latestJob == null ? null
            : new RunningJobStatus(null, latestJob);
      }
    }
    return null;
  }

  class RunningJobStatus {
    Job testJob;
    RunningJob job;

    public RunningJobStatus(Job testJob, RunningJob job) {
      this.testJob = testJob;
      this.job = job;
    }

    String getJobID() {
      return ENABLED_FOR_TEST ? testJob.getJobID().toString()
          : job.getID().toString();
    }

    boolean isComplete() throws IOException {
      return ENABLED_FOR_TEST ? testJob.isComplete() : job.isComplete();
    }

    boolean isSuccessful() throws IOException {
      return ENABLED_FOR_TEST ? testJob.isSuccessful() : job.isSuccessful();
    }

    String getFailureInfo() throws IOException {
      try {
        return ENABLED_FOR_TEST ? testJob.getStatus().getFailureInfo()
            : job.getFailureInfo();
      } catch (InterruptedException e) {
        throw new IOException(e);
      }
    }
  }
{noformat}

All of the code related to the mini YARN cluster can then be removed (including the two pom dependencies mentioned above):

{code:java}
+    mrCluster = new MiniMRYarnCluster(TestDistCpProcedure.class.getName(), 3);
+    conf.set(MRJobConfig.MR_AM_STAGING_DIR, "/apps_staging_dir");
+    mrCluster.init(conf);
+    mrCluster.start();
+    conf = mrCluster.getConfig();
{code}

We additionally need to set the test-enabled flag:

{code:java}
  public static void beforeClass() throws IOException {
    DistCpProcedure.ENABLED_FOR_TEST = true;
    ...
  }
{code}

After this change, the whole test runs much faster than before; in total it takes less than 1 minute.

I also found a few places that still need to be updated.
# Can you update the following description of the router option? I updated this content earlier as well, but it seems it was not addressed in the latest patch.
{noformat}
It will disable read and write by cancelling all permissions of the source path. The default value is `false`."
{noformat}
# The method cleanUpBeforeInitDistcp can be renamed to pathCheckBeforeInitDistcp, since we no longer do any cleanup there.

> RBF: DistCpFedBalance implementation
> ------------------------------------
>
>                 Key: HDFS-15346
>                 URL: https://issues.apache.org/jira/browse/HDFS-15346
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>         Attachments: HDFS-15346.001.patch, HDFS-15346.002.patch,
> HDFS-15346.003.patch, HDFS-15346.004.patch, HDFS-15346.005.patch,
> HDFS-15346.006.patch, HDFS-15346.007.patch, HDFS-15346.008.patch,
> HDFS-15346.009.patch, HDFS-15346.010.patch
>
>
> Patch in HDFS-15294 is too big to review so we split it into 2 patches. This
> is the second one. Detail can be found at HDFS-15294.
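
For readers who want to try the RunningJobStatus idea from the comment above without a Hadoop classpath, here is a minimal standalone sketch. It is not the patch itself: FakeJob and FakeRunningJob are hypothetical stand-ins for Hadoop's Job and RunningJob, enabledForTest mirrors the ENABLED_FOR_TEST flag, and only two of the four status methods are modelled.

```java
/**
 * Standalone sketch of the flag-based status wrapper; not the actual patch.
 * FakeJob and FakeRunningJob are hypothetical stand-ins for Hadoop's
 * Job (LocalJobRunner path) and RunningJob (job client path).
 */
public class RunningJobStatusSketch {

  /** Mirrors DistCpProcedure.ENABLED_FOR_TEST from the comment. */
  static boolean enabledForTest = false;

  /** Stand-in for the Job handle kept when the job runs locally. */
  static final class FakeJob {
    final String id;
    final boolean complete;

    FakeJob(String id, boolean complete) {
      this.id = id;
      this.complete = complete;
    }
  }

  /** Stand-in for the RunningJob handle returned by the job client. */
  static final class FakeRunningJob {
    final String id;
    final boolean complete;

    FakeRunningJob(String id, boolean complete) {
      this.id = id;
      this.complete = complete;
    }
  }

  /** Exposes one status API over whichever handle the flag selects. */
  static final class RunningJobStatus {
    private final FakeJob testJob;
    private final FakeRunningJob job;

    RunningJobStatus(FakeJob testJob, FakeRunningJob job) {
      this.testJob = testJob;
      this.job = job;
    }

    String getJobID() {
      // Only the selected branch is evaluated, so the unused handle may be null.
      return enabledForTest ? testJob.id : job.id;
    }

    boolean isComplete() {
      return enabledForTest ? testJob.complete : job.complete;
    }
  }

  public static void main(String[] args) {
    // Test mode: the wrapper reads from the locally run job.
    enabledForTest = true;
    RunningJobStatus local =
        new RunningJobStatus(new FakeJob("job_local_0001", true), null);
    System.out.println(local.getJobID() + " complete=" + local.isComplete());

    // Normal mode: the wrapper reads from the job client's handle.
    enabledForTest = false;
    RunningJobStatus remote =
        new RunningJobStatus(null, new FakeRunningJob("job_0002", false));
    System.out.println(remote.getJobID() + " complete=" + remote.isComplete());
  }
}
```

Note that in this shape the flag and the non-null handle must agree: constructing the wrapper with only a test job while the flag is false would throw a NullPointerException on the first status call.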