[
https://issues.apache.org/jira/browse/YARN-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051964#comment-18051964
]
yanbin.zhang commented on YARN-11466:
-------------------------------------
[~prabhujoseph] Hello, have you ever implemented this idea?
> Graceful Decommission for Shuffle Services
> ------------------------------------------
>
> Key: YARN-11466
> URL: https://issues.apache.org/jira/browse/YARN-11466
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
>
> Currently, YARN Graceful Decommission waits for the completion of both
> running containers and the running applications
> (https://issues.apache.org/jira/browse/YARN-9608) of those containers
> launched on the node under decommission. This adds a large, unnecessary cost
> for users on cloud deployments, since most nodes under decommission sit idle
> waiting for the running applications to complete.
> This feature aims to improve the Graceful Decommission logic by waiting for
> the actual shuffle data to be consumed by dependent tasks rather than the
> entire application. Below is the high-level design I have in mind.
> Add a new interface (say AuxiliaryShuffleService extends AuxiliaryService)
> through which the workloads (Spark, Tez, MapReduce) ShuffleHandler exposes
> shuffle data metrics (like shuffle data being present or not). NodeManager
> periodically collects the shuffle data metrics from the configured
> AuxiliaryShuffleServices and sends them along with the heartbeat to the
> ResourceManager. The graceful decommission logic runs inside ResourceManager
> waits until the shuffle data is consumed, with a maximum wait time up to the
> configured graceful decommission timeout.
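A minimal sketch of the proposed idea could look like the following. Note this is an illustration only: the names AuxiliaryShuffleService, getPendingShuffleBytes, and canCompleteDecommission are hypothetical (not part of any released Hadoop API), and the real interface would extend org.apache.hadoop.yarn.server.api.AuxiliaryService rather than stand alone as here.

```java
import java.util.ArrayList;
import java.util.List;

public class ShuffleDecommissionSketch {

    // Hypothetical contract a workload's ShuffleHandler would implement so
    // the NodeManager can report shuffle-data metrics in its heartbeat.
    interface AuxiliaryShuffleService {
        /** Bytes of shuffle output on this node not yet fetched by dependent tasks. */
        long getPendingShuffleBytes();
    }

    /**
     * Simplified stand-in for the ResourceManager-side check: a node may
     * complete graceful decommission once no containers are running AND no
     * configured shuffle service still holds unfetched data, or once the
     * graceful decommission timeout has elapsed.
     */
    static boolean canCompleteDecommission(int runningContainers,
                                           List<AuxiliaryShuffleService> services,
                                           long elapsedMs,
                                           long timeoutMs) {
        if (elapsedMs >= timeoutMs) {
            return true; // hard cap: decommission regardless of shuffle state
        }
        if (runningContainers > 0) {
            return false; // existing behavior: wait for running containers
        }
        for (AuxiliaryShuffleService s : services) {
            if (s.getPendingShuffleBytes() > 0) {
                return false; // dependent tasks still need this node's shuffle data
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<AuxiliaryShuffleService> services = new ArrayList<>();
        services.add(() -> 0L); // shuffle data fully consumed
        System.out.println(canCompleteDecommission(0, services, 1_000, 60_000));

        services.add(() -> 4_096L); // another service still holds data
        System.out.println(canCompleteDecommission(0, services, 1_000, 60_000));
        System.out.println(canCompleteDecommission(0, services, 60_000, 60_000));
    }
}
```

The point of the sketch is that the node no longer waits on application completion (YARN-9608 behavior) but only on the shuffle-data metric reaching zero, bounded by the existing decommission timeout.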
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]