[ https://issues.apache.org/jira/browse/MAPREDUCE-5641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907656#comment-13907656 ]

Zhijie Shen commented on MAPREDUCE-5641:
----------------------------------------

bq. So it sounds like instead of doing YARN-1731 to make the RM write a little flag file that the JHS can check for, we can have the JHS check this store just like the AHS is doing. That should be cleaner.

It could be an option, but it depends on what information you want. As I understood it, you plan to inspect the jhist file and probably look for MR-specific information, such as map, reduce, shuffle, and merge details. That cannot be obtained from the AHS. In contrast, generic information, such as start time, finish time, and host, can be obtained from the AHS. Perhaps you can recover part of the information for a failed MR AM now, and make a complete recovery once MR reports its specific information to the timeline service.

bq. What is the store that it's using? And where can I find out more about it or its API so I can update this patch to use it.

The suggested way to access the information is not to read from the store directly, but to use AHSClient or the web services, assuming you are going to do this programmatically.

> History for failed Application Masters should be made available to the Job
> History Server
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5641
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5641
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, jobhistoryserver
>    Affects Versions: 2.2.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: MAPREDUCE-5641.patch, MAPREDUCE-5641.patch
>
>
> Currently, the JHS has no information about jobs whose AMs have failed. This
> is because the History is written by the AM to the intermediate folder just
> before finishing, so when the AM fails for any reason, this information isn't
> copied there. However, it is not lost, as it's in the AM's staging directory.
> To make the History available in the JHS, all we need to do is have another
> mechanism to move the History from the staging directory to the intermediate
> directory. The AM also writes a "Summary" file before exiting normally,
> which is also unavailable when the AM fails.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
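
For illustration only, the "move the History from the staging directory to the intermediate directory" idea from the issue description could look roughly like the sketch below. It is not the attached patch. It assumes a local file system via java.nio rather than HDFS and the Hadoop FileSystem API, and the class name FailedAmHistoryMover, the method moveHistory, and the "*.jhist" file pattern are all hypothetical choices made for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Hadoop and not the MAPREDUCE-5641 patch.
public class FailedAmHistoryMover {

    /**
     * Moves every *.jhist file found directly under stagingDir into
     * intermediateDir and returns the new locations.
     * Assumption: both directories are on the local file system.
     */
    public static List<Path> moveHistory(Path stagingDir, Path intermediateDir)
            throws IOException {
        Files.createDirectories(intermediateDir);

        // Collect the candidates first so nothing is moved while the
        // directory is still being iterated.
        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(stagingDir, "*.jhist")) {
            for (Path src : stream) {
                sources.add(src);
            }
        }

        List<Path> moved = new ArrayList<>();
        for (Path src : sources) {
            Path dst = intermediateDir.resolve(src.getFileName());
            Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
            moved.add(dst);
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        // Temporary directories stand in for the staging and intermediate dirs.
        Path staging = Files.createTempDirectory("staging");
        Path intermediate = Files.createTempDirectory("intermediate");
        Files.write(staging.resolve("job_0001.jhist"),
                    "dummy history".getBytes(StandardCharsets.UTF_8));
        System.out.println("Moved: " + moveHistory(staging, intermediate));
    }
}
```

A real implementation would also need to decide who triggers the move once an AM has failed, which is the part of the discussion that involves YARN-1731 and the AHS.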