[ https://issues.apache.org/jira/browse/MAPREDUCE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146537#comment-15146537 ]
Naganarasimha G R commented on MAPREDUCE-6579:
----------------------------------------------

Thanks [~jlowe], [~sunilg] & [~ajisakaa] for sharing your views, but we have already taken care of what you have suggested. YARN-3946 already resets the diagnostic message once the AM is registered, in {{RMAppAttemptImpl.AMRegisteredTransition}}:
{code}
@@ -1507,6 +1515,9 @@ public void transition(RMAppAttemptImpl appAttempt,
       appAttempt.originalTrackingUrl = sanitizeTrackingUrl(registrationEvent.getTrackingurl());
 
+      // reset AMLaunchDiagnostics once AM Registers with RM
+      appAttempt.updateAMLaunchDiagnostics(null);
+
{code}

[~jlowe], since we have ensured that the AM launch diagnostic info is reset once the AM is registered, do you see any further problems with the approach taken in YARN-3946? IMHO these statistics are very useful information, and it was apt to put them into the app's diagnostic information instead of creating a new *AM Launch diagnostic*. I have also made sure that if a diagnostic is set anywhere, *RMAppImpl or RMAppAttemptImpl* returns the actual message rather than the *AM Launch diagnostic*.

Further, MapReduce currently assumes that the diagnostics are always *failure diagnostics*, which IMO is wrong. As per *MAPREDUCE-6579.05.patch*, {{JobStatus}} is as follows:
{code}
   public synchronized String getFailureInfo() {
-    return this.failureInfo;
+    if (runState == State.FAILED || runState == State.KILLED) {
+      return this.failureInfo;
+    }
+    return "";
   }
{code}
As you pointed out, this might not be the correct place for this check; would it be better to handle it in {{NotRunningJob.getJobReport}}?

[~ajisakaa], as per the modifications in *MAPREDUCE-6579.06*, you have reset the diagnostics in *RMAppAttemptImpl.AMRegisteredTransition* just after the location where I already reset them. I commented out that code and ran the test case; it still passed.
So IMO the modification in the test case is fine, but the RMAppAttemptImpl modifications are not required. At the same time, we need to discuss whether to add an additional check in {{NotRunningJob.getJobReport}}.

> JobStatus#getFailureInfo should not output diagnostic information when the job is running
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6579
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6579
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: test
>            Reporter: Rohith Sharma K S
>            Assignee: Akira AJISAKA
>            Priority: Blocker
>         Attachments: MAPREDUCE-6579.01.patch, MAPREDUCE-6579.02.patch, MAPREDUCE-6579.03.patch, MAPREDUCE-6579.04.patch, MAPREDUCE-6579.05.patch, MAPREDUCE-6579.06.patch
>
>
> From [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt]
> TestNetworkedJob fails intermittently.
> {code}
> Running org.apache.hadoop.mapred.TestNetworkedJob
> Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 81.131 sec <<< FAILURE! - in org.apache.hadoop.mapred.TestNetworkedJob
> testNetworkedJob(org.apache.hadoop.mapred.TestNetworkedJob)  Time elapsed: 30.55 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<[[Tue Dec 15 14:02:45 +0000 2015] Application is Activated, waiting for resources to be assigned for AM.  Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:8192, vCores:16> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> but was:<[]>
> 	at org.junit.Assert.assertEquals(Assert.java:115)
> 	at org.junit.Assert.assertEquals(Assert.java:144)
> 	at org.apache.hadoop.mapred.TestNetworkedJob.testNetworkedJob(TestNetworkedJob.java:174)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
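
For reference, the state check proposed in MAPREDUCE-6579.05.patch can be sketched in isolation. This is only a minimal illustration, not the actual Hadoop class: {{JobStatusSketch}}, its constructor and {{setState}} are hypothetical names, and the enum is a simplified stand-in for {{JobStatus.State}}. It shows a running job hiding the AM launch diagnostics while a failed or killed job still reports them.

```java
// Hypothetical, simplified stand-in for the MapReduce JobStatus class.
// Only the state check from MAPREDUCE-6579.05.patch is modelled here.
final class JobStatusSketch {

    // Simplified stand-in for JobStatus.State (assumption, not the real enum).
    enum State { RUNNING, SUCCEEDED, FAILED, PREP, KILLED }

    private State runState;
    private final String failureInfo;

    JobStatusSketch(State runState, String failureInfo) {
        this.runState = runState;
        this.failureInfo = failureInfo;
    }

    // Hypothetical setter, used only to move the sketch between states.
    synchronized void setState(State state) {
        this.runState = state;
    }

    // Expose the stored diagnostics only when the job has failed or been
    // killed; in every other state return an empty string.
    synchronized String getFailureInfo() {
        if (runState == State.FAILED || runState == State.KILLED) {
            return failureInfo;
        }
        return "";
    }

    public static void main(String[] args) {
        JobStatusSketch status = new JobStatusSketch(
            State.RUNNING,
            "Application is Activated, waiting for resources to be assigned for AM.");
        System.out.println("RUNNING -> [" + status.getFailureInfo() + "]");
        status.setState(State.FAILED);
        System.out.println("FAILED  -> [" + status.getFailureInfo() + "]");
    }
}
```

With this gate in place, a caller that reads the failure info of a running job gets an empty string regardless of what the RM has put into the application diagnostics. Whether the check belongs here or in {{NotRunningJob.getJobReport}} is the open question in the comment above.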