[ https://issues.apache.org/jira/browse/YARN-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007024#comment-14007024 ]
Rohith commented on YARN-1365:
------------------------------

Hi Anubhav,

One comment on the patch.

* Notifying the scheduler of APP_ATTEMPT_ADDED from RMApp leads to an invalid state transition exception in RMAppAttempt. Can this be handled in RMAppAttemptImpl#AttemptRecoveredTransition instead? During recovery of an RMApp, all of its attempts are recovered synchronously, so the RMAppAttempt state has already moved to LAUNCHED before the scheduler is notified.

{noformat}
// Let scheduler know about this attempt so it can allow AM to register
boolean disableTransferState = false;
app.handler.handle(new AppAttemptAddedSchedulerEvent(app.currentAttempt
    .getAppAttemptId(), disableTransferState));
{noformat}

> ApplicationMasterService to allow Register and Unregister of an app that was
> running before restart
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-1365
>                 URL: https://issues.apache.org/jira/browse/YARN-1365
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1365.001.patch, YARN-1365.002.patch,
> YARN-1365.003.patch, YARN-1365.initial.patch
>
>
> For an application that was running before restart, the
> ApplicationMasterService currently throws an exception when the app tries to
> make the initial register or final unregister call. These should succeed and
> the RMApp state machine should transition to completed like normal.
> Unregistration should succeed for an app that the RM considers complete since
> the RM may have died after saving completion in the store but before
> notifying the AM that the AM is free to exit.

--
This message was sent by Atlassian JIRA
(v6.2#6252)