Saisai Shao created SPARK-8424:
----------------------------------

             Summary: Add blacklist mechanism for task scheduler and Yarn container allocation
                 Key: SPARK-8424
                 URL: https://issues.apache.org/jira/browse/SPARK-8424
             Project: Spark
          Issue Type: New Feature
          Components: Scheduler, YARN
    Affects Versions: 1.4.0
            Reporter: Saisai Shao


MapReduce previously provided a blacklist and a graylist to exclude 
TaskTrackers/nodes that fail constantly. This matters on a large cluster, 
where hardware and software failures become increasingly likely. Unfortunately, 
the current version of Spark lacks a mechanism to blacklist constantly failing 
executors/nodes. The only blacklist mechanism in Spark avoids relaunching a 
task on an executor where that same task previously failed within a specified 
time. I therefore propose adding a blacklist mechanism to Spark. The proposal 
is divided into two sub-tasks:

1. Add a heuristic blacklist algorithm that tracks the status of executors 
based on the status of finished tasks, and enable the blacklist mechanism in 
task scheduling.
2. Enable the blacklist mechanism in YARN container allocation (avoid 
allocating containers on blacklisted hosts).
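
To make the two sub-tasks more concrete, here is a minimal sketch of one 
possible heuristic, written in Python for brevity. It is not Spark code and 
not the design proposed in this issue: the class name ExecutorBlacklist, the 
parameters max_failures and window_s, and the rule "a host is blacklisted when 
any executor on it is blacklisted" are all assumptions made for illustration. 
The tracker counts recent task failures per executor (sub-task 1) and exposes 
a set of hosts that a container allocator could avoid (sub-task 2).

```python
import time
from collections import defaultdict, deque


class ExecutorBlacklist:
    """Hypothetical sliding-window failure tracker (illustration only)."""

    def __init__(self, max_failures=3, window_s=60.0, clock=time.monotonic):
        # Assumed policy: blacklist an executor once it has at least
        # max_failures task failures within the last window_s seconds.
        self.max_failures = max_failures
        self.window_s = window_s
        self.clock = clock
        self._failures = defaultdict(deque)  # executor_id -> failure times
        self._host_of = {}                   # executor_id -> host name

    def record_task_end(self, executor_id, host, succeeded):
        """Update executor status from the status of a finished task."""
        self._host_of[executor_id] = host
        if not succeeded:
            self._failures[executor_id].append(self.clock())

    def _recent_failures(self, executor_id):
        failures = self._failures.get(executor_id)
        if failures is None:
            return 0
        cutoff = self.clock() - self.window_s
        # Drop failures that have aged out of the window.
        while failures and failures[0] < cutoff:
            failures.popleft()
        return len(failures)

    def is_blacklisted(self, executor_id):
        """Sub-task 1: the task scheduler would skip such executors."""
        return self._recent_failures(executor_id) >= self.max_failures

    def blacklisted_hosts(self):
        """Sub-task 2: hosts a container allocator would avoid (assumed rule:
        a host is blacklisted if any executor on it is blacklisted)."""
        return {
            self._host_of[executor_id]
            for executor_id in list(self._failures)
            if self.is_blacklisted(executor_id)
        }


if __name__ == "__main__":
    blacklist = ExecutorBlacklist(max_failures=2, window_s=60.0)
    blacklist.record_task_end("exec-1", "host-a", succeeded=False)
    blacklist.record_task_end("exec-1", "host-a", succeeded=False)
    blacklist.record_task_end("exec-2", "host-b", succeeded=True)
    print(blacklist.is_blacklisted("exec-1"), blacklist.blacklisted_hosts())
```

Because failures expire after the window, an executor leaves the blacklist on 
its own once it stops failing, which loosely resembles the time-limited 
per-task behavior described above. A real implementation would also need to 
decide how to treat failures caused by the task itself rather than the 
executor.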



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

