Deegue created YARN-11095:
-----------------------------

Summary: [Umbrella] Node load based scheduler
Key: YARN-11095
URL: https://issues.apache.org/jira/browse/YARN-11095
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Deegue

A node-load-based scheduler is quite effective for cluster stability, especially when we deploy NodeManager and DataNode and use auxiliary services such as the MapReduce shuffle or Spark shuffle. We can set thresholds and automatically skip nodes with high load when scheduling.

Node load should mainly focus on CPU, memory and disk I/O. Keeping CPU and memory under a healthy threshold makes container and task run times more stable and reduces the chance of OOM kills by the OS. As for disk I/O, high disk load is more likely to cause slow DataNodes and fetch failures when shuffling data.
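
To make the threshold idea concrete, here is a minimal sketch of how skipping high-load nodes could look. It is not taken from any patch on this issue and does not use YARN's scheduler APIs. The class names (NodeLoadFilter, NodeLoad, Thresholds), the host names and the threshold values are all hypothetical, and load is assumed to be reported as a utilization fraction between 0.0 and 1.0.

```java
import java.util.ArrayList;
import java.util.List;

final class NodeLoadFilter {

    /** Hypothetical load snapshot for one node; each value is a utilization in [0.0, 1.0]. */
    static final class NodeLoad {
        final String host;
        final double cpu;
        final double memory;
        final double diskIo;

        NodeLoad(String host, double cpu, double memory, double diskIo) {
            this.host = host;
            this.cpu = cpu;
            this.memory = memory;
            this.diskIo = diskIo;
        }
    }

    /** Hypothetical per-resource limits; a node above any one of them is skipped. */
    static final class Thresholds {
        final double maxCpu;
        final double maxMemory;
        final double maxDiskIo;

        Thresholds(double maxCpu, double maxMemory, double maxDiskIo) {
            this.maxCpu = maxCpu;
            this.maxMemory = maxMemory;
            this.maxDiskIo = maxDiskIo;
        }
    }

    /** A node counts as overloaded if CPU, memory or disk I/O exceeds its threshold. */
    static boolean isOverloaded(NodeLoad node, Thresholds limits) {
        return node.cpu > limits.maxCpu
                || node.memory > limits.maxMemory
                || node.diskIo > limits.maxDiskIo;
    }

    /** Returns the hosts that stay eligible for scheduling, in their original order. */
    static List<String> schedulable(List<NodeLoad> nodes, Thresholds limits) {
        List<String> eligible = new ArrayList<>();
        for (NodeLoad node : nodes) {
            if (!isOverloaded(node, limits)) {
                eligible.add(node.host);
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // Illustrative values only: 80% CPU, 85% memory, 70% disk I/O.
        Thresholds limits = new Thresholds(0.80, 0.85, 0.70);
        List<NodeLoad> nodes = List.of(
                new NodeLoad("nm-1", 0.40, 0.55, 0.20),
                new NodeLoad("nm-2", 0.95, 0.50, 0.10),  // CPU too high
                new NodeLoad("nm-3", 0.60, 0.70, 0.65),
                new NodeLoad("nm-4", 0.30, 0.40, 0.90)); // disk I/O too high
        System.out.println(schedulable(nodes, limits)); // prints [nm-1, nm-3]
    }
}
```

In this sketch a node is skipped as soon as any single resource is over its limit, which matches the goal of keeping each of CPU, memory and disk I/O under a healthy threshold. A real scheduler would also need to decide how fresh the load data must be and what to do when every node is over a threshold. Neither is covered here.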