[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907656#comment-13907656
 ] 

Zhijie Shen commented on MAPREDUCE-5641:
----------------------------------------

bq. So it sounds like instead of doing YARN-1731 to make the RM write a little 
flag file that the JHS can check for, we can have the JHS check this store just 
like the AHS is doing. That should be cleaner.

It could be an option, but it depends on what information you want. As I 
understand it, you plan to inspect the jhist file, probably for MR-specific 
information such as map, reduce, shuffle, and merge details. That information 
cannot be obtained from the AHS. In contrast, generic information such as 
start time, finish time, and host can be obtained from the AHS. Perhaps you 
can recover part of the information for a failed MR AM now, and make the 
recovery complete once MR reports its specific information to the timeline 
service.

bq. What is the store that it's using? And where can I find out more about it or 
its API so I can update this patch to use it.

The suggested way to access the information is not to read from the store 
directly, but to use AHSClient or the web services, assuming you are going to 
do this programmatically.
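
For the web services route, a minimal sketch in Java might look like the 
following. It is only an illustration: the class and method names are 
hypothetical, the host and port are placeholders, and the REST path 
/ws/v1/applicationhistory/apps/{appid} is assumed from the history server's 
web services, so please check it against the version you are running.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of Hadoop: queries the AHS web services
// for the generic report of one application.
class AhsRestSketch {
    // Assumed REST path of the AHS application resource; verify for your version.
    static final String APPS_PATH = "/ws/v1/applicationhistory/apps/";

    // Builds the URI of the application report, tolerating a trailing slash
    // on the base URL (e.g. "http://ahs-host:8188/").
    static URI appReportUri(String baseUrl, String appId) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + APPS_PATH + appId);
    }

    // Fetches the report as a JSON string; fails on any non-200 response.
    static String fetchAppReport(String baseUrl, String appId)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(appReportUri(baseUrl, appId))
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("AHS returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

The report obtained this way carries only the generic fields (start time, 
finish time, host, and so on); the MR-specific details would still have to 
come from the jhist file. If the JHS already links against the YARN client 
library, AHSClient is the other option and avoids parsing JSON by hand.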


> History for failed Application Masters should be made available to the Job 
> History Server
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5641
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5641
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, jobhistoryserver
>    Affects Versions: 2.2.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-5641.patch, MAPREDUCE-5641.patch
>
>
> Currently, the JHS has no information about jobs whose AMs have failed.  This 
> is because the History is written by the AM to the intermediate folder just 
> before finishing, so when it fails for any reason, this information isn't 
> copied there.  However, it is not lost as it's in the AM's staging directory.  
> To make the History available in the JHS, all we need to do is have another 
> mechanism to move the History from the staging directory to the intermediate 
> directory.  The AM also writes a "Summary" file before exiting normally, 
> which is also unavailable when the AM fails.  



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
