[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479306#comment-13479306
 ] 

Thomas Graves commented on MAPREDUCE-4729:
------------------------------------------

OK, I figured this out.  The job uses the output format 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat, 
whose OutputCommitter is set to null.  Because of that, the MRAppMaster 
RecoveryService did not start:

 "org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Not starting RecoveryService: 
recoveryEnabled: true recoverySupportedByCommitter: false ApplicationAttemptID: 
4"

Since the recovery service didn't start, it never parsed the old job history 
files, and so it never had the list of previous AM attempts. 

I think we should fix this so that, even when recovery isn't supported, we at 
least parse the old history files and get the previous AM attempt info.
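
To make the proposal concrete, here is a rough sketch of the difference 
between the behavior described above and the suggested fix.  It is not the 
actual MRAppMaster code: the class AttemptHistorySketch and all of its methods 
are hypothetical names, and reading the old job history files is replaced by a 
stand-in that just lists the earlier attempt ids.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these names come from Hadoop. */
class AttemptHistorySketch {

    /** Stand-in for parsing old job history files: returns ids 1..current-1. */
    static List<Integer> parsePreviousAttempts(int currentAttemptId) {
        List<Integer> previous = new ArrayList<>();
        for (int id = 1; id < currentAttemptId; id++) {
            previous.add(id);
        }
        return previous;
    }

    /**
     * Behavior described in the comment: previous attempts are only known
     * when the recovery service starts, which needs both flags to be true.
     */
    static List<Integer> attemptsKnownToday(boolean recoveryEnabled,
                                            boolean recoverySupportedByCommitter,
                                            int attemptId) {
        if (recoveryEnabled && recoverySupportedByCommitter) {
            return parsePreviousAttempts(attemptId);
        }
        return new ArrayList<>();
    }

    /** Suggested fix: always collect previous attempt info, recovery or not. */
    static List<Integer> attemptsKnownProposed(int attemptId) {
        return parsePreviousAttempts(attemptId);
    }

    public static void main(String[] args) {
        // Values from the log line: recoveryEnabled=true,
        // recoverySupportedByCommitter=false, ApplicationAttemptID=4.
        System.out.println(attemptsKnownToday(true, false, 4));   // prints []
        System.out.println(attemptsKnownProposed(4));             // prints [1, 2, 3]
    }
}
```

With the values from the log line, the first call knows about no earlier 
attempts, which matches the history UI showing only the last one, while the 
second call still lists attempts 1 through 3.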
                
> job history UI not showing all job attempts
> -------------------------------------------
>
>                 Key: MAPREDUCE-4729
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4729
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>    Affects Versions: 0.23.3
>            Reporter: Thomas Graves
>
> We are seeing a case where a job runs but the AM is running out of memory in 
> the first 3 attempts. The job eventually finishes on the 4th attempt.  When 
> you go to the job history UI for that job, it only shows the last attempt.  
> This is bad since we want to see why the first 3 attempts failed.
> The RM web ui shows all 4 attempts. 
> Also I tested this locally by running "kill" on the app master and in that 
> case the history server UI does show all attempts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
