[ 
https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14561656#comment-14561656
 ] 

Sandy Ryza commented on SPARK-4352:
-----------------------------------

I have a couple of concerns about that approach.  The first is that we can end 
up requesting many more executors on a node than we need.  E.g. if only a 
single task is interested in running on node1, but many tasks are interested in 
running on node2, nothing stops YARN from scheduling all of the requested 
executors on node1.  Conversely, we can end up requesting far fewer executors 
on a node than we need.  Once a single executor has been placed on a node, we 
will stop trying to place executors on that node, even if that executor has 
only a single core and there are many pending tasks interested in that node. 

I have an idea for a scheme that I think will better tailor requests to 
locality needs.  It's definitely more complex, but I don't think it's more 
complex than what frameworks like MapReduce do, and I think it's an answer to a 
fundamentally complex problem.

The idea is basically to submit executor requests such that, for each node, 
sum(cores from all executor requests with that node on their preferred list) = 
number of tasks that prefer that node.

As an example, let's imagine our executors have 2 cores each, we have 4 nodes 
named "a" through "d", and 30 pending tasks.  The first 20 of these pending 
tasks have locality preferences for nodes a, b, and c, and the other 10 prefer 
nodes a, b, and d.  That means the number of tasks desired on each node is 
a: 30, b: 30, c: 20, and d: 10.

If the number of executors we want to request is 15, then we would submit the 
following requests:
requests for 5 executors with nodes = <a, b, c, d>
requests for 5 executors with nodes = <a, b, c>
requests for 5 executors with nodes = <a, b>

If the number of executors we want to request is 7, then we would submit the 
following requests:
requests for 5 executors with nodes = <a, b, c, d>
requests for 2 executors with nodes = <a, b, c>

If the number of executors we want to request is 18, then we would submit the 
following requests:
requests for 5 executors with nodes = <a, b, c, d>
requests for 5 executors with nodes = <a, b, c>
requests for 5 executors with nodes = <a, b>
requests for 3 executors with no locality preferences
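
To make the grouping concrete, here is a rough sketch of how the request 
groups could be derived from the pending tasks' preference lists.  (Python 
rather than Spark's Scala purely for brevity, and the function name and 
signature are made up for illustration, not from any Spark or YARN API.)  The 
trick is to walk the distinct per-node task counts from smallest to largest; 
each layer adds executors whose node list is the set of nodes whose demand is 
at least that level:

```python
import math
from collections import Counter


def build_executor_requests(task_prefs, cores_per_executor, total_executors):
    """Return a list of (num_executors, node_list) requests such that, for
    each node, the cores across requests preferring that node roughly equal
    the number of pending tasks that prefer it.

    task_prefs: one preferred-node list per pending task.
    """
    # Count how many pending tasks prefer each node.
    desired = Counter()
    for prefs in task_prefs:
        for node in prefs:
            desired[node] += 1

    requests = []
    allocated = 0
    prev_level = 0
    # Walk the distinct demand levels from smallest to largest.  Each layer
    # covers the nodes whose demand is at least that level, and contributes
    # enough executors to supply the cores for the demand added by the layer.
    for level in sorted(set(desired.values())):
        nodes = sorted(n for n, c in desired.items() if c >= level)
        execs = math.ceil((level - prev_level) / cores_per_executor)
        execs = min(execs, total_executors - allocated)
        if execs > 0:
            requests.append((execs, nodes))
            allocated += execs
        prev_level = level

    # Any executors left over get no locality preference.
    if allocated < total_executors:
        requests.append((total_executors - allocated, []))
    return requests
```

Running this on the example above (20 tasks preferring a, b, c and 10 
preferring a, b, d, with 2-core executors and 15 executors to request) yields 
exactly the three groups listed: 5 x <a, b, c, d>, 5 x <a, b, c>, and 
5 x <a, b>; with 7 or 18 executors it reproduces the other two cases as well.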

We might want to augment this to account for the fact that some nodes have 
executors already.

What do you think? 

> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Assignee: Saisai Shao
>            Priority: Critical
>         Attachments: Supportpreferrednodelocationindynamicallocation.pdf
>
>
> Currently, achieving data locality in Spark is difficult unless an 
> application takes resources on every node in the cluster.  
> preferredNodeLocalityData provides a sort of hacky workaround that has been 
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to 
> demand from the application.  When this occurs, it would be useful to look at 
> the pending tasks and communicate their location preferences to the cluster 
> resource manager. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
