[ https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782340#comment-16782340 ]
Prabhu Joseph commented on YARN-8132:
-------------------------------------

[~Rakesh_Shah] It gets triggered at {{RMAppAttemptEventType.FAIL}} when the yarn client calls failApplicationAttempt (yarn application -fail). When a job fails because its tasks fail, the AM UnregisterEvent sets finalApplicationStatus to FAILED. But I have missed failure cases like AM Crash and AM Expire, where the AM UnregisterEvent won't be present.

[~bibinchundatt] The given fix works for Killed cases (including job timeout) and for failure cases where tasks fail or the client initiates failApplicationAttempt, but it does not cover AM Crash and AM Expire. Can we handle those in a separate Jira, or should we continue with this one?

> Final Status of applications shown as UNDEFINED in ATS app queries
> ------------------------------------------------------------------
>
>                 Key: YARN-8132
>                 URL: https://issues.apache.org/jira/browse/YARN-8132
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2, timelineservice
>            Reporter: Charan Hebri
>            Assignee: Prabhu Joseph
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: YARN-8132-001.patch, YARN-8132-002.patch,
> YARN-8132-003.patch, YARN-8132-004.patch, YARN-8132-branch-3.1.001.patch,
> YARN-8132-branch-3.2.001.patch, YARN-8132-branch-3.2.002.patch
>
>
> Final Status is shown as UNDEFINED for applications that are KILLED/FAILED.
> A sample request/response with INFO field for an application,
> {noformat}
> 2018-04-09 13:10:02,126 INFO reader.TimelineReaderWebServices (TimelineReaderWebServices.java:getApp(1693)) - Received URL /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user hrt_qa
> 2018-04-09 13:10:02,156 INFO reader.TimelineReaderWebServices (TimelineReaderWebServices.java:getApp(1716)) - Processed URL /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 ms.){noformat}
> {noformat}
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1523263360719,
>   "idprefix": 0,
>   "id": "application_1523259757659_0003",
>   "type": "YARN_APPLICATION",
>   "info": {
>     "YARN_APPLICATION_CALLER_CONTEXT": "CLI",
>     "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX",
>     "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
>     "YARN_APPLICATION_NAME": "Sleep job",
>     "YARN_APPLICATION_USER": "hrt_qa",
>     "YARN_APPLICATION_UNMANAGED_APPLICATION": false,
>     "FROM_ID": "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003",
>     "UID": "yarn-cluster!application_1523259757659_0003",
>     "YARN_APPLICATION_VIEW_ACLS": " ",
>     "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718,
>     "YARN_AM_CONTAINER_LAUNCH_COMMAND": [
>       "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr "
>     ],
>     "YARN_APPLICATION_QUEUE": "default",
>     "YARN_APPLICATION_TYPE": "MAPREDUCE",
>     "YARN_APPLICATION_PRIORITY": 0,
>     "YARN_APPLICATION_LATEST_APP_ATTEMPT": "appattempt_1523259757659_0003_000001",
>     "YARN_APPLICATION_TAGS": [
>       "timeline_flow_name_tag:test_flow"
>     ],
>     "YARN_APPLICATION_STATE": "KILLED"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{noformat}
> This is different to what the Resource Manager reports. For KILLED
> applications the final status is KILLED and for FAILED applications it is
> FAILED. This behavior is seen in ATSv2 as well as older versions of ATS.