[jira] [Resolved] (MAPREDUCE-5444) MRAppMaster throws InvalidStateTransitonException: Invalid event: JOB_AM_REBOOT at SUCCEEDED

Jason Lowe (JIRA) Fri, 02 Aug 2013 07:54:34 -0700

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jason Lowe resolved MAPREDUCE-5444.
-----------------------------------

    Resolution: Invalid

bq. I have one point to add here: immediately after the job succeeded, the app 
master got a reboot command from the RM. The JobClient had exited (see 
MAPREDUCE-5441). By that time, the RM had launched a 2nd attempt of the app 
master. The 2nd attempt also competes for resources, but there is no client 
waiting to get the job report. I feel this is a problem.

There will always be a race where the job has just succeeded but the RM gets 
out of sync with the AM before the AM can unregister.  Normally the AM will 
exit, another AM attempt will be launched by the RM, and the new attempt will 
recover the previous SUCCEEDED state and exit shortly afterwards without 
launching any subsequent tasks.
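
To make the expected behavior of the second attempt concrete, here is a 
minimal sketch. It is not the actual MRAppMaster recovery code: the class 
AmRecoverySketch, the JobState enum, and the action strings are all 
hypothetical names chosen for illustration. It only models the decision 
described above, namely that an attempt which recovers a SUCCEEDED state skips 
task launching and goes straight to unregistering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an AM attempt start-up; not Hadoop code.
class AmRecoverySketch {
    // States a previous attempt may have left behind (illustrative subset).
    enum JobState { RUNNING, SUCCEEDED }

    /**
     * Returns the ordered actions a new attempt would take.
     * recoveredState is null when there is no previous attempt to recover from.
     */
    static List<String> startAttempt(int attemptId, JobState recoveredState) {
        List<String> actions = new ArrayList<>();
        actions.add("attempt " + attemptId + ": register with RM");
        if (recoveredState == JobState.SUCCEEDED) {
            // The job already finished; nothing is left to schedule.
            actions.add("recovered SUCCEEDED, skip tasks");
        } else {
            actions.add("launch tasks");
        }
        actions.add("unregister and exit");
        return actions;
    }

    public static void main(String[] args) {
        // Second attempt after the first one succeeded but failed to unregister.
        for (String action : startAttempt(2, JobState.SUCCEEDED)) {
            System.out.println(action);
        }
    }
}
```

In this model the cost of the race is one short-lived AM container, not a 
rerun of the job.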

As for the client, that's an orthogonal problem.  It's not required that a 
client be listening to an application as it executes, and if the client is 
unnecessarily exiting across an AM restart then we can tackle that issue in 
MAPREDUCE-5441.
                
> MRAppMaster throws InvalidStateTransitonException: Invalid event: 
> JOB_AM_REBOOT at SUCCEEDED
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5444
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5444
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster
>            Reporter: Rohith Sharma K S
>            Priority: Minor
>
> {noformat}
> 2013-08-02 14:55:11,537 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for JobFinishedEvent
> 2013-08-02 14:55:11,538 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1375199817609_0049Job Transitioned from COMMITTING to SUCCEEDED
> 2013-08-02 14:55:11,663 INFO [Thread-52] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://0.0.0.0:45000/home/restest/staging-dir/restest/.staging/job_1375199817609_0049/job_1375199817609_0049_2.jhist to hdfs://0.0.0.0:45000/home/restest/staging-dir/history/done_intermediate/restest/job_1375199817609_0049-1375435337429-restest-word+count-1375435511533-10-1-SUCCEEDED-a.jhist_tmp
> 2013-08-02 14:55:11,750 INFO [Thread-52] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://0.0.0.0:45000/home/restest/staging-dir/history/done_intermediate/restest/job_1375199817609_0049-1375435337429-restest-word+count-1375435511533-10-1-SUCCEEDED-a.jhist_tmp
> 2013-08-02 14:55:11,769 INFO [Thread-52] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copying hdfs://0.0.0.0:45000/home/restest/staging-dir/restest/.staging/job_1375199817609_0049/job_1375199817609_0049_2_conf.xml to hdfs://0.0.0.0:45000/home/restest/staging-dir/history/done_intermediate/restest/job_1375199817609_0049_conf.xml_tmp
> 2013-08-02 14:55:11,880 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:0 AssignedReds:1 CompletedMaps:10 CompletedReds:1 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
> 2013-08-02 14:55:13,649 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Resource Manager doesn't recognize AttemptId: application_1375199817609_0049
> org.apache.hadoop.yarn.YarnException: Resource Manager doesn't recognize AttemptId: application_1375199817609_0049
>       at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:626)
>       at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:238)
>       at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:250)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-08-02 14:55:13,649 ERROR [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_AM_REBOOT at SUCCEEDED
>       at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>       at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
>       at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
>       at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:914)
>       at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:129)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1114)
>       at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1110)
>       at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:130)
>       at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:309)
>       at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:305)
>       at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-08-02 14:55:13,652 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: JobHistoryEvent is triggered from JobImpl
> 2013-08-02 14:55:13,652 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1375199817609_0049Job Transitioned from SUCCEEDED to ERROR
> {noformat}
> {noformat}
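
For readers unfamiliar with table-driven state machines, the sketch below 
shows one way the logged exception can arise. It is a simplified stand-in, not 
the real StateMachineFactory or JobImpl: the class JobStateMachineSketch, the 
COMMIT_DONE event, the REBOOT state, and the registered transitions are 
assumptions made for illustration. Only the state names COMMITTING, SUCCEEDED 
and ERROR and the event name JOB_AM_REBOOT come from the log. The point is that 
a (state, event) pair with no registered transition is rejected, and the 
handler then moves the job to ERROR, which matches the last log line.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified table-driven state machine; not Hadoop code.
class JobStateMachineSketch {
    enum State { COMMITTING, SUCCEEDED, REBOOT, ERROR }
    enum Event { COMMIT_DONE, JOB_AM_REBOOT }

    static class InvalidTransitionException extends RuntimeException {
        InvalidTransitionException(State state, Event event) {
            super("Invalid event: " + event + " at " + state);
        }
    }

    // Transition table: current state -> (event -> next state).
    private static final Map<State, Map<Event, State>> TABLE =
            new EnumMap<>(State.class);

    private static void register(State from, Event on, State to) {
        TABLE.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class))
             .put(on, to);
    }

    static {
        // Assumed transitions, for illustration only.
        register(State.COMMITTING, Event.COMMIT_DONE, State.SUCCEEDED);
        register(State.COMMITTING, Event.JOB_AM_REBOOT, State.REBOOT);
        // Deliberately no entry for (SUCCEEDED, JOB_AM_REBOOT).
    }

    // Looks up the next state and throws if the pair is not registered.
    static State doTransition(State current, Event event) {
        Map<Event, State> row = TABLE.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new InvalidTransitionException(current, event);
        }
        return next;
    }

    // Mirrors what the log suggests: an invalid event sends the job to ERROR.
    static State handle(State current, Event event) {
        try {
            return doTransition(current, event);
        } catch (InvalidTransitionException e) {
            System.err.println("Can't handle this event at current state: "
                    + e.getMessage());
            return State.ERROR;
        }
    }

    public static void main(String[] args) {
        State s = handle(State.COMMITTING, Event.COMMIT_DONE);
        System.out.println("after commit: " + s);
        s = handle(s, Event.JOB_AM_REBOOT);
        System.out.println("after reboot command: " + s);
    }
}
```

In this sketch, a reboot command that arrives during COMMITTING is handled, 
while the same command arriving just after the move to SUCCEEDED is rejected.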

