[ https://issues.apache.org/jira/browse/MAPREDUCE-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13549100#comment-13549100 ]
Jason Lowe commented on MAPREDUCE-4848:
---------------------------------------

+1 lgtm.

> TaskAttemptContext cast error during AM recovery
> ------------------------------------------------
>
>                 Key: MAPREDUCE-4848
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4848
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 0.23.4
>            Reporter: Jason Lowe
>            Assignee: Jerry Chen
>             Fix For: trunk
>
>         Attachments: MAPREDUCE-4848.patch
>
>
> Recently saw an AM that failed and tried to recover, but the subsequent attempt quickly exited with its own failure during recovery:
> {noformat}
> 2012-12-05 02:33:36,752 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.ClassCastException: org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl cannot be cast to org.apache.hadoop.mapred.TaskAttemptContext
>       at org.apache.hadoop.mapred.OutputCommitter.recoverTask(OutputCommitter.java:284)
>       at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$InterceptingEventHandler.handle(RecoveryService.java:361)
>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1211)
>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1177)
>       at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
>       at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
>       at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>       at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:958)
>       at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:135)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:926)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:918)
>       at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
>       at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:285)
>       at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:281)
>       at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
>       at java.lang.Thread.run(Thread.java:619)
> 2012-12-05 02:33:36,752 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
> {noformat}
> The RM then launched a third AM attempt which succeeded. The third attempt saw basically no progress after parsing the history file from the second attempt and ran the job again from scratch.
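
For readers unfamiliar with this failure mode: the stack trace shows the cast failing inside org.apache.hadoop.mapred.OutputCommitter.recoverTask, which was handed a new-API (mapreduce) context object. The sketch below reproduces that shape with small stand-in types. Every name in it (NewApiContext, OldApiContext, NewApiCommitter, OldApiCommitter, NoOpCommitter, CastDemo) is hypothetical and chosen for illustration; it is not Hadoop code and not the contents of MAPREDUCE-4848.patch.

```java
// Hypothetical stand-ins only. These are NOT the real Hadoop classes.

// Plays the role of the new-API context type (mapreduce package).
interface NewApiContext {
    String attemptId();
}

// Plays the role of the old-API context type (mapred package), a subtype of the new one.
interface OldApiContext extends NewApiContext { }

// A context that implements only the new-API type.
class NewApiContextImpl implements NewApiContext {
    private final String id;
    NewApiContextImpl(String id) { this.id = id; }
    public String attemptId() { return id; }
}

// A context that also implements the old-API type.
class OldApiContextImpl extends NewApiContextImpl implements OldApiContext {
    OldApiContextImpl(String id) { super(id); }
}

// Plays the role of the new-API committer base class.
abstract class NewApiCommitter {
    abstract void recoverTask(NewApiContext ctx);
}

// Plays the role of the old-API committer: it bridges the new-API entry
// point to the old-API method by downcasting the context.
abstract class OldApiCommitter extends NewApiCommitter {
    abstract void recoverTask(OldApiContext ctx);

    @Override
    final void recoverTask(NewApiContext ctx) {
        // Throws ClassCastException if ctx is not an OldApiContext.
        recoverTask((OldApiContext) ctx);
    }
}

// A trivial old-API committer that does nothing on recovery.
class NoOpCommitter extends OldApiCommitter {
    @Override
    void recoverTask(OldApiContext ctx) { /* nothing to recover in this sketch */ }
}

class CastDemo {
    // Calls the committer through the new-API entry point, as a framework would.
    static String recover(NewApiCommitter committer, NewApiContext ctx) {
        try {
            committer.recoverTask(ctx);
            return "recovered " + ctx.attemptId();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // New-API context passed to an old-API committer: the downcast fails.
        System.out.println(recover(new NoOpCommitter(), new NewApiContextImpl("attempt_1")));
        // Old-API context passed to the same committer: the downcast succeeds.
        System.out.println(recover(new NoOpCommitter(), new OldApiContextImpl("attempt_1")));
    }
}
```

In this sketch the failure goes away once the caller builds the context flavor that matches the committer's API generation. Whether the attached patch takes that approach is not stated in this message.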