[ https://issues.apache.org/jira/browse/SPARK-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Saisai Shao closed SPARK-8424.
------------------------------
    Resolution: Duplicate

> Add blacklist mechanism for task scheduler and Yarn container allocation
> ------------------------------------------------------------------------
>
>                 Key: SPARK-8424
>                 URL: https://issues.apache.org/jira/browse/SPARK-8424
>             Project: Spark
>          Issue Type: New Feature
>          Components: Scheduler, YARN
>    Affects Versions: 1.4.0
>            Reporter: Saisai Shao
>
> MapReduce has a blacklist and a graylist to exclude TaskTrackers/nodes that
> fail constantly. This is important on a large cluster, where the chance of
> hardware and software failure increases.
> Unfortunately, the current version of Spark lacks a mechanism to blacklist
> constantly failing executors/nodes. The only blacklist mechanism in Spark
> avoids relaunching a task on an executor where that task has previously
> failed within a specified time. This issue therefore proposes adding a
> blacklist mechanism to Spark, divided into two sub-tasks:
> 1. Add a heuristic blacklist algorithm that tracks the status of executors
> from the status of finished tasks, and enable the blacklist mechanism in
> task scheduling.
> 2. Enable the blacklist mechanism in YARN container allocation (avoid
> allocating containers on blacklisted hosts).
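
For illustration only, here is a minimal sketch of the kind of heuristic that sub-task 1 describes, written in Python rather than Spark's Scala. It is not Spark's implementation. The class name, method names, thresholds, and the sliding-window rule are all assumptions made for this example; the issue itself does not specify a particular heuristic. An executor is blacklisted for a fixed period once it accumulates a given number of task failures within a time window, and a host is reported as blacklisted if any executor on it is currently blacklisted (a simplification that a YARN allocator could use to skip hosts, as in sub-task 2).

```python
from collections import defaultdict, deque


class ExecutorBlacklistTracker:
    """Hypothetical tracker: blacklists executors with too many recent task failures."""

    def __init__(self, max_failures=3, window_secs=60.0, blacklist_secs=300.0):
        # All three thresholds are illustrative defaults, not Spark settings.
        self.max_failures = max_failures
        self.window_secs = window_secs
        self.blacklist_secs = blacklist_secs
        self._failures = defaultdict(deque)  # executor_id -> failure timestamps
        self._blacklisted_until = {}         # executor_id -> expiry timestamp
        self._host_of = {}                   # executor_id -> host name

    def record_task_end(self, executor_id, host, succeeded, now):
        """Update executor status from the outcome of one finished task."""
        self._host_of[executor_id] = host
        if succeeded:
            return
        failures = self._failures[executor_id]
        failures.append(now)
        # Drop failures that have fallen out of the sliding window.
        while failures and now - failures[0] > self.window_secs:
            failures.popleft()
        if len(failures) >= self.max_failures:
            self._blacklisted_until[executor_id] = now + self.blacklist_secs
            failures.clear()

    def is_blacklisted(self, executor_id, now):
        """Return True while the executor's blacklist period has not expired."""
        expiry = self._blacklisted_until.get(executor_id)
        if expiry is None:
            return False
        if now >= expiry:
            del self._blacklisted_until[executor_id]
            return False
        return True

    def blacklisted_hosts(self, now):
        """Hosts with at least one blacklisted executor (assumed host-level rule)."""
        return {
            self._host_of[executor_id]
            for executor_id in list(self._blacklisted_until)
            if self.is_blacklisted(executor_id, now)
        }


if __name__ == "__main__":
    tracker = ExecutorBlacklistTracker(max_failures=2, window_secs=10.0, blacklist_secs=100.0)
    tracker.record_task_end("exec-1", "host-a", succeeded=False, now=0.0)
    tracker.record_task_end("exec-1", "host-a", succeeded=False, now=5.0)
    print(tracker.is_blacklisted("exec-1", now=6.0))
    print(tracker.blacklisted_hosts(now=6.0))
```

A scheduler would consult `is_blacklisted` before offering a task to an executor, and a container allocator would pass `blacklisted_hosts` along when requesting resources. Timestamps are passed in explicitly so the logic stays deterministic and easy to test.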