Deegue created YARN-11095:
-----------------------------

             Summary: [Umbrella] Node load based scheduler
                 Key: YARN-11095
                 URL: https://issues.apache.org/jira/browse/YARN-11095
             Project: Hadoop YARN
          Issue Type: Improvement
            Reporter: Deegue


A node-load-based scheduler is quite effective for cluster stability, especially 
when we deploy NodeManager and DataNode and use auxiliary services such as the 
MapReduce shuffle or Spark shuffle.

We can set thresholds and automatically skip nodes with high load when scheduling.
Node load should mainly cover CPU, memory, and disk I/O.
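
The threshold-and-skip idea can be sketched roughly as follows. This is only an
illustration, not YARN code: NodeLoad, THRESHOLDS, is_overloaded and
schedulable_nodes are hypothetical names, the load values are assumed to be
utilization fractions between 0.0 and 1.0, and the threshold numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    """Hypothetical per-node load snapshot; values are fractions in [0.0, 1.0]."""
    node_id: str
    cpu: float
    memory: float
    disk_io: float


# Illustrative thresholds only (assumption); real values would be tuned per cluster.
THRESHOLDS = {"cpu": 0.80, "memory": 0.85, "disk_io": 0.70}


def is_overloaded(load, thresholds=THRESHOLDS):
    """Return True if any tracked resource is strictly above its threshold."""
    return any(getattr(load, name) > limit for name, limit in thresholds.items())


def schedulable_nodes(loads, thresholds=THRESHOLDS):
    """Return the ids of nodes the scheduler would not skip, in input order."""
    return [load.node_id for load in loads if not is_overloaded(load, thresholds)]


nodes = [
    NodeLoad("node-1", cpu=0.35, memory=0.50, disk_io=0.20),
    NodeLoad("node-2", cpu=0.92, memory=0.40, disk_io=0.10),  # CPU too high
    NodeLoad("node-3", cpu=0.30, memory=0.45, disk_io=0.88),  # disk I/O too high
]
print(schedulable_nodes(nodes))  # ['node-1']
```

A node is skipped as soon as one resource exceeds its threshold, so a node with
idle CPU but saturated disks is still excluded.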

Keeping CPU and memory under a healthy threshold makes container and task run 
times more stable and reduces the chance of OOM kills by the OS. As for disk I/O, 
high disk load is more likely to cause slow DataNodes and fetch failures when 
shuffling data.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
