Karthik Palaniappan created SPARK-19941:
-------------------------------------------

             Summary: Spark should not schedule tasks on executors on 
decommissioning YARN nodes
                 Key: SPARK-19941
                 URL: https://issues.apache.org/jira/browse/SPARK-19941
             Project: Spark
          Issue Type: Bug
          Components: Scheduler, YARN
    Affects Versions: 2.2.0
         Environment: Hadoop 2.8.0-rc1
            Reporter: Karthik Palaniappan


Hadoop 2.8 added a mechanism to gracefully decommission Node Managers in YARN: 
https://issues.apache.org/jira/browse/YARN-914

Essentially, you can mark nodes for decommissioning and let them (a) finish 
work in progress and (b) finish serving shuffle data, while no new work is 
scheduled on them.
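
For reference, a sketch of the operator-side workflow per the Hadoop 2.8 
graceful-decommission docs (the hostname, exclude-file path, and timeout below 
are illustrative, not from this issue):

{code}
# Add the host to the file referenced by yarn.resourcemanager.nodes.exclude-path,
# then ask the RM to gracefully decommission it (here with a 3600s timeout).
echo "nm-host.example.com" >> /etc/hadoop/conf/yarn.exclude
yarn rmadmin -refreshNodes -g 3600 -client
{code}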

Spark should respect when NMs are marked as decommissioning, and similarly 
decommission executors on those nodes by not scheduling any more tasks on them.

It looks like YARN may eventually inform the app master when containers will 
be killed: https://issues.apache.org/jira/browse/YARN-3784. However, I don't 
think Spark should schedule based on a timeout. We should gracefully 
decommission the executor as fast as possible (which is the spirit of 
YARN-914). The app master can query the RM for NM statuses (if it doesn't 
already have them) and stop scheduling on executors on NMs that are 
decommissioning.
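
A minimal sketch of that idea in Scala (the DecommissionTracker object and the 
isSchedulable hook are hypothetical, not existing Spark internals; 
AllocateResponse#getUpdatedNodes and NodeState.DECOMMISSIONING are real Hadoop 
2.8 APIs, and the RM already piggybacks changed node reports on each allocate 
response, so no extra RM query is needed for state changes):

{code:scala}
import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse
import org.apache.hadoop.yarn.api.records.NodeState

// Hypothetical helper, not Spark's actual YarnAllocator: tracks hosts whose
// NMs the RM has reported as DECOMMISSIONING, so the scheduler can skip them.
object DecommissionTracker {
  private var excludedHosts = Set.empty[String]

  // Call on every AM heartbeat with the RM's allocate response.
  def onAllocateResponse(response: AllocateResponse): Unit = {
    val newlyDecommissioning = response.getUpdatedNodes.asScala
      .filter(_.getNodeState == NodeState.DECOMMISSIONING)
      .map(_.getNodeId.getHost)
    excludedHosts ++= newlyDecommissioning
  }

  // The task scheduler would consult this before offering cores on an
  // executor, so no new tasks land on a decommissioning node.
  def isSchedulable(host: String): Boolean = !excludedHosts.contains(host)
}
{code}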

Stretch feature: the timeout may be useful in determining whether running 
further tasks on the executor is even worthwhile. Spark may be able to tell 
that shuffle data on the node will not be consumed before the node is 
decommissioned, so that data is not worth computing and the executor could be 
killed immediately.


