[ 
https://issues.apache.org/jira/browse/YARN-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782340#comment-16782340
 ] 

Prabhu Joseph commented on YARN-8132:
-------------------------------------

[~Rakesh_Shah] It is triggered at {{RMAppAttemptEventType.FAIL}}, when the yarn 
client calls failApplicationAttempt (yarn applicationattempt -fail). When a job 
fails because its tasks fail, the AM UnregisterEvent sets 
finalApplicationStatus to FAILED. However, the fix misses failure cases such as 
AM Crash and AM Expire, where no AM UnregisterEvent is present.

[~bibinchundatt] The given fix works for Killed cases (including job timeout) 
and for failure cases such as tasks failing and a client-initiated 
failApplicationAttempt, but not for AM Crash and AM Expire. Can we handle those 
in a separate Jira, or should we continue with this one?
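
To make the gap concrete, here is a rough sketch of the fallback being discussed: when the AM never unregisters (AM Crash, AM Expire), the final status could be derived from the terminal application state instead of staying UNDEFINED. This is only an illustration, not the RM code or any of the attached patches; the class {{FinalStatusFallback}}, its two enums and the {{resolve}} method are hypothetical stand-ins for the real YARN types.

```java
// Hypothetical sketch: AppState and FinalStatus are local stand-ins,
// not the real YARN enums, and resolve() is not part of any YARN API.
class FinalStatusFallback {

    enum AppState { RUNNING, FINISHED, FAILED, KILLED }

    enum FinalStatus { UNDEFINED, SUCCEEDED, FAILED, KILLED }

    // Returns the status the AM reported at unregister time when there is one;
    // otherwise falls back to the terminal application state.
    static FinalStatus resolve(AppState state, FinalStatus reportedByAm) {
        if (reportedByAm != FinalStatus.UNDEFINED) {
            return reportedByAm; // AM unregistered and set a status itself
        }
        switch (state) {
            case FAILED:
                return FinalStatus.FAILED; // e.g. AM Crash or AM Expire
            case KILLED:
                return FinalStatus.KILLED; // e.g. killed by user or job timeout
            default:
                return FinalStatus.UNDEFINED; // nothing to infer yet
        }
    }

    public static void main(String[] args) {
        // AM crashed without unregistering: status is inferred from the state.
        System.out.println(resolve(AppState.FAILED, FinalStatus.UNDEFINED));
        // Tasks failed and the AM unregistered with FAILED: reported value wins.
        System.out.println(resolve(AppState.FINISHED, FinalStatus.FAILED));
    }
}
```

With a fallback like this, the AM Crash and AM Expire cases would publish FAILED rather than UNDEFINED, while a status the AM set at unregister time is left untouched.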



> Final Status of applications shown as UNDEFINED in ATS app queries
> ------------------------------------------------------------------
>
>                 Key: YARN-8132
>                 URL: https://issues.apache.org/jira/browse/YARN-8132
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: ATSv2, timelineservice
>            Reporter: Charan Hebri
>            Assignee: Prabhu Joseph
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: YARN-8132-001.patch, YARN-8132-002.patch, 
> YARN-8132-003.patch, YARN-8132-004.patch, YARN-8132-branch-3.1.001.patch, 
> YARN-8132-branch-3.2.001.patch, YARN-8132-branch-3.2.002.patch
>
>
> Final Status is shown as UNDEFINED for applications that are KILLED/FAILED. A 
> sample request/response with the INFO field for an application:
> {noformat}
> 2018-04-09 13:10:02,126 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1693)) - Received URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO from user 
> hrt_qa
> 2018-04-09 13:10:02,156 INFO  reader.TimelineReaderWebServices 
> (TimelineReaderWebServices.java:getApp(1716)) - Processed URL 
> /ws/v2/timeline/apps/application_1523259757659_0003?fields=INFO (Took 30 
> ms.){noformat}
> {noformat}
> {
>   "metrics": [],
>   "events": [],
>   "createdtime": 1523263360719,
>   "idprefix": 0,
>   "id": "application_1523259757659_0003",
>   "type": "YARN_APPLICATION",
>   "info": {
>     "YARN_APPLICATION_CALLER_CONTEXT": "CLI",
>     "YARN_APPLICATION_DIAGNOSTICS_INFO": "Application 
> application_1523259757659_0003 was killed by user xxx_xx at XXX.XXX.XXX.XXX",
>     "YARN_APPLICATION_FINAL_STATUS": "UNDEFINED",
>     "YARN_APPLICATION_NAME": "Sleep job",
>     "YARN_APPLICATION_USER": "hrt_qa",
>     "YARN_APPLICATION_UNMANAGED_APPLICATION": false,
>     "FROM_ID": 
> "yarn-cluster!hrt_qa!test_flow!1523263360719!application_1523259757659_0003",
>     "UID": "yarn-cluster!application_1523259757659_0003",
>     "YARN_APPLICATION_VIEW_ACLS": " ",
>     "YARN_APPLICATION_SUBMITTED_TIME": 1523263360718,
>     "YARN_AM_CONTAINER_LAUNCH_COMMAND": [
>       "$JAVA_HOME/bin/java -Djava.io.tmpdir=$PWD/tmp 
> -Dlog4j.configuration=container-log4j.properties 
> -Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
> -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog 
> -Dhdp.version=3.0.0.0-1163 -Xmx819m -Dhdp.version=3.0.0.0-1163 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
> 2><LOG_DIR>/stderr "
>     ],
>     "YARN_APPLICATION_QUEUE": "default",
>     "YARN_APPLICATION_TYPE": "MAPREDUCE",
>     "YARN_APPLICATION_PRIORITY": 0,
>     "YARN_APPLICATION_LATEST_APP_ATTEMPT": 
> "appattempt_1523259757659_0003_000001",
>     "YARN_APPLICATION_TAGS": [
>       "timeline_flow_name_tag:test_flow"
>     ],
>     "YARN_APPLICATION_STATE": "KILLED"
>   },
>   "configs": {},
>   "isrelatedto": {},
>   "relatesto": {}
> }{noformat}
> This differs from what the Resource Manager reports: for KILLED 
> applications the final status is KILLED, and for FAILED applications it is 
> FAILED. This behavior is seen in ATSv2 as well as older versions of ATS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

