[jira] [Resolved] (SPARK-19941) Spark should not schedule tasks on executors on decommissioning YARN nodes

2019-02-12 Thread Marcelo Vanzin (JIRA)


 [ https://issues.apache.org/jira/browse/SPARK-19941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin resolved SPARK-19941.

Resolution: Duplicate

> Spark should not schedule tasks on executors on decommissioning YARN nodes
> ---------------------------------------------------------------------------
>
> Key: SPARK-19941
> URL: https://issues.apache.org/jira/browse/SPARK-19941
> Project: Spark
>  Issue Type: Improvement
>  Components: Scheduler, YARN
>Affects Versions: 2.1.0
> Environment: Hadoop 2.8.0-rc1
>Reporter: Karthik Palaniappan
>Priority: Major
>
> Hadoop 2.8 added a mechanism to gracefully decommission NodeManagers in 
> YARN: https://issues.apache.org/jira/browse/YARN-914
> Essentially, you can mark nodes for decommissioning and let them (a) finish 
> work in progress and (b) finish serving shuffle data, while no new work is 
> scheduled on the node.
> Spark should respect when NMs enter the decommissioning state and similarly 
> decommission executors on those nodes by not scheduling any more tasks on 
> them (see the sketches after the quoted description).
> It looks like YARN may in the future inform the app master when containers 
> will be killed: https://issues.apache.org/jira/browse/YARN-3784. However, I 
> don't think Spark should schedule based on a timeout; we should gracefully 
> decommission the executor as fast as possible, which is the spirit of 
> YARN-914. The app master can query the RM for NM statuses (if it doesn't 
> already have them) and stop scheduling on executors on NMs that are 
> decommissioning; both ways of learning those statuses are sketched below.
> Stretch feature: the timeout may be useful in determining whether running 
> further tasks on the executor is even helpful. Spark may be able to tell that 
> shuffle data will not be consumed by the time the node is decommissioned, so 
> it is not worth computing, and the executor can be killed immediately (see 
> the last sketch below).
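A minimal sketch of the polling approach, in Scala against the Hadoop 2.8
client APIs (graceful decommission itself is triggered on the RM side, via
the graceful option of yarn rmadmin -refreshNodes). YarnClient.getNodeReports
and NodeState.DECOMMISSIONING are real APIs; the helper itself and the
standalone client are illustrative only, since a real change would reuse the
AM's existing RM connection:

    import scala.collection.JavaConverters._
    import org.apache.hadoop.yarn.api.records.NodeState
    import org.apache.hadoop.yarn.client.api.YarnClient
    import org.apache.hadoop.yarn.conf.YarnConfiguration

    // Hypothetical helper: ask the RM which NodeManagers are currently
    // DECOMMISSIONING (a NodeState added in Hadoop 2.8 by YARN-914) and
    // return their hostnames, i.e. the hosts to stop scheduling on.
    def decommissioningHosts(conf: YarnConfiguration): Set[String] = {
      val client = YarnClient.createYarnClient()
      client.init(conf)
      client.start()
      try {
        client.getNodeReports(NodeState.DECOMMISSIONING).asScala
          .map(_.getNodeId.getHost)
          .toSet
      } finally {
        client.stop()
      }
    }

The scheduler would then decline resource offers on those hosts, much as the
existing blacklist mechanism already avoids scheduling on problematic nodes.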
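Alternatively, the app master may not need to poll at all: each allocate
(heartbeat) response from the RM carries reports for nodes whose state
changed. A sketch assuming those updates include the DECOMMISSIONING
transition; AllocateResponse.getUpdatedNodes is a real API, while the hook
name and the mutable set are illustrative:

    import scala.collection.JavaConverters._
    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse
    import org.apache.hadoop.yarn.api.records.NodeState

    // Hosts the scheduler should stop offering tasks on.
    var excludedHosts = Set.empty[String]

    // Hypothetical hook, invoked with every AM-RM heartbeat response.
    def onHeartbeat(response: AllocateResponse): Unit = {
      val changed = response.getUpdatedNodes.asScala
      excludedHosts ++= changed
        .filter(_.getNodeState == NodeState.DECOMMISSIONING)
        .map(_.getNodeId.getHost)
      // A node reported back in RUNNING state can be offered work again.
      excludedHosts --= changed
        .filter(_.getNodeState == NodeState.RUNNING)
        .map(_.getNodeId.getHost)
    }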
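The stretch feature then reduces to comparing two estimates. A purely
hypothetical sketch; both inputs would have to be estimated elsewhere, and
neither exists in Spark today:

    // If the node's decommission deadline arrives before its shuffle
    // output could plausibly be consumed, further work on the executor
    // is wasted and it can be killed immediately.
    def worthKeeping(decommissionDeadlineMs: Long,
                     shuffleConsumedByMs: Long): Boolean =
      shuffleConsumedByMs <= decommissionDeadlineMs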






[jira] [Resolved] (SPARK-19941) Spark should not schedule tasks on executors on decommissioning YARN nodes

2017-03-26 Thread Sean Owen (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-19941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-19941.
---
Resolution: Won't Fix

> (issue description identical to the quote above)


