Abhishek Dixit created YARN-11421:
-------------------------------------

             Summary: Graceful Decommission ignores launched containers and 
gets deactivated before timeout
                 Key: YARN-11421
                 URL: https://issues.apache.org/jira/browse/YARN-11421
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 3.3.4
            Reporter: Abhishek Dixit


During Graceful Decommission, a node is deactivated before the timeout 
expires even though containers have been launched on that node.

We have observed cases where the graceful decommission signal is sent to a 
node at the same time as containers are launched on its NodeManager. In such 
cases the ResourceManager moves the node from the Decommissioning to the 
Decommissioned state, because launched containers are not checked in 
DeactivateNodeTransition.

We suggest replacing DeactivateNodeTransition with a MultiArc transition that 
checks the scheduler for AM containers and then decides whether to keep the 
node in the Decommissioning state or move it to the Decommissioned state.
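
The idea of a transition with more than one possible target state can be 
sketched roughly as follows. This is only an illustration and not the actual 
patch: DecommissionSketch, NodeState, nextState, and the containersOnNode and 
timeoutExpired parameters are hypothetical names that do not come from the 
Hadoop code base, and the container list stands in for whatever the scheduler 
would report for the node.

```java
import java.util.List;

// Hypothetical, self-contained sketch; none of these names are Hadoop classes.
class DecommissionSketch {

    enum NodeState { DECOMMISSIONING, DECOMMISSIONED }

    /**
     * Multi-arc style transition: instead of always moving to DECOMMISSIONED,
     * the transition itself picks the target state.
     *
     * @param containersOnNode containers the scheduler still reports for the
     *                         node (assumed input, e.g. AM containers)
     * @param timeoutExpired   whether the graceful decommission timeout passed
     */
    static NodeState nextState(List<String> containersOnNode, boolean timeoutExpired) {
        if (timeoutExpired) {
            // The timeout bounds how long the node may stay in DECOMMISSIONING.
            return NodeState.DECOMMISSIONED;
        }
        if (!containersOnNode.isEmpty()) {
            // Containers were launched on the node: keep it in DECOMMISSIONING.
            return NodeState.DECOMMISSIONING;
        }
        // Nothing is left on the node, so it can be deactivated.
        return NodeState.DECOMMISSIONED;
    }

    public static void main(String[] args) {
        System.out.println(nextState(List.of("container_01"), false)); // DECOMMISSIONING
        System.out.println(nextState(List.of(), false));               // DECOMMISSIONED
        System.out.println(nextState(List.of("container_01"), true));  // DECOMMISSIONED
    }
}
```

The point of the sketch is that the target state is computed inside the 
transition, so a node with freshly launched containers is not deactivated 
until those containers are gone or the timeout expires.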



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
