[ 
https://issues.apache.org/jira/browse/YARN-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621049#comment-16621049
 ] 

Eric Yang commented on YARN-8665:
---------------------------------

[~csingh] Thank you for the patch.  When cancel upgrade is triggered, the app 
master seems to reset all instances to the NEEDS_UPGRADE state and set the 
service state to CANCEL_UPGRADING.  This makes it hard to identify whether an 
instance should be restarted with the original configuration.  I think we can 
avoid resetting all instances to NEEDS_UPGRADE.  Instead, instances in the 
READY/FAILED_UPGRADE state should be marked as NEEDS_UPGRADE, and instances in 
the NEEDS_UPGRADE state should be reset to RUNNING_BUT_NOT_READY to revert the 
process.  This inversion approach might work better to restore the service to 
its original form without version control for the reinit process.
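
To make the inversion concrete, here is a minimal sketch of the per-instance 
transition applied on cancel.  The class, enum, and method names are 
hypothetical and are not taken from the patch or from the YARN service code; 
only the state names come from the description above.  States not mentioned 
above are assumed to stay unchanged.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch only; not the actual YARN service classes.
final class CancelUpgradeSketch {

  // State names follow the comment above; the enum itself is illustrative.
  enum InstanceState { READY, FAILED_UPGRADE, NEEDS_UPGRADE, RUNNING_BUT_NOT_READY }

  // Inverted transitions applied to each instance when cancel is triggered.
  private static final Map<InstanceState, InstanceState> ON_CANCEL =
      new EnumMap<>(InstanceState.class);

  static {
    // Instances already touched by the upgrade must be brought back.
    ON_CANCEL.put(InstanceState.READY, InstanceState.NEEDS_UPGRADE);
    ON_CANCEL.put(InstanceState.FAILED_UPGRADE, InstanceState.NEEDS_UPGRADE);
    // Instances never upgraded just drop their pending-upgrade marker.
    ON_CANCEL.put(InstanceState.NEEDS_UPGRADE, InstanceState.RUNNING_BUT_NOT_READY);
  }

  // Returns the state an instance should move to on cancel.
  // Assumption: any state without an explicit rule is left as-is.
  static InstanceState onCancel(InstanceState current) {
    return ON_CANCEL.getOrDefault(current, current);
  }
}
```

With a mapping like this, the app master would not need to reset every 
instance to NEEDS_UPGRADE first; each instance's prior state decides whether 
it has to be reverted.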

If you choose to stay the course with the current implementation, the node 
manager's report back to the app master might need a new versioning mechanism 
for the operation performed.  This would help track whether the reinit 
operation was performed for an upgrade or for an upgrade cancel, which you 
described as a separate JIRA.  However, I would feel more comfortable solving 
the problem in this JIRA to make sure we don't destabilize the code base.

I also tried launching the app, triggering an upgrade with the -initiate flag, 
then cancelling with the -cancel flag without actually upgrading any instance.  
When this is done, the service gets stuck in the CANCEL_UPGRADING state and 
does not revert to the STABLE state.

> Yarn Service Upgrade:  Support cancelling upgrade
> -------------------------------------------------
>
>                 Key: YARN-8665
>                 URL: https://issues.apache.org/jira/browse/YARN-8665
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Chandni Singh
>            Assignee: Chandni Singh
>            Priority: Major
>         Attachments: YARN-8665.001.patch
>
>
> When a service is upgraded without auto-finalization or express upgrade, the 
> upgrade can be cancelled. This gives the user the ability to test the upgrade 
> of a single instance and, if that doesn't go well, a chance to cancel it.


