[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551836#comment-14551836 ]

Saisai Shao commented on SPARK-4352:
------------------------------------

Hi Sandy, I dug out the old code that supported preferredNodeLocations in
YARN. It takes task distribution into consideration via
{{generateNodeToWeight}}, which addresses some of the questions you mentioned
above, but I think it is hard to apply such a mechanism to dynamic allocation.
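
To make this concrete, here is a rough sketch of the kind of node-to-weight
computation I mean (illustrative only, not the actual old code;
{{PendingTask}} and {{nodeWeights}} are made-up names): weight each node by
the fraction of pending tasks that prefer it.

{code:scala}
// Illustrative sketch only -- not the actual generateNodeToWeight code.
// PendingTask and nodeWeights are hypothetical names.
case class PendingTask(preferredNodes: Seq[String])

object NodeWeightSketch {
  /** Weight each node by the fraction of pending tasks that prefer it. */
  def nodeWeights(tasks: Seq[PendingTask]): Map[String, Double] = {
    val counts: Map[String, Int] = tasks
      .flatMap(_.preferredNodes)
      .groupBy(identity)
      .map { case (node, hits) => node -> hits.size }
    val total = counts.values.sum.toDouble
    if (total == 0) Map.empty
    else counts.map { case (node, c) => node -> c / total }
  }
}

// Example: tasks preferring (node1, node2), (node1), (node3) give
// Map(node1 -> 0.5, node2 -> 0.25, node3 -> 0.25)
{code}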

Suppose we already have 3 containers and we request 1 more with a new list of
preferred localities: do we need to kill all the old containers and re-request
containers based on the new preferred localities? If so, the overhead will be
high; if not, the locality will not be optimal.

So we can only try to compute a partially optimal allocation strategy; it is
hard to maintain a globally optimal one.
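
Concretely, a partial strategy might look like the sketch below (again just an
assumption of mine, not real Spark code or an existing API): keep the existing
containers, and derive locality preferences only from the pending tasks that
no current executor host already covers.

{code:scala}
// Hypothetical sketch of a partial (incremental) placement strategy.
// All names are illustrative assumptions, not existing Spark APIs.
object PartialPlacementSketch {
  def preferredNodesForNewRequest(
      pendingTaskPrefs: Seq[Seq[String]], // preferred hosts per pending task
      existingHosts: Set[String],         // hosts of containers we keep
      numNewContainers: Int): Seq[String] = {
    // Only consider tasks whose preferences no existing container satisfies.
    val uncovered =
      pendingTaskPrefs.filter(_.forall(h => !existingHosts.contains(h)))
    uncovered
      .flatten
      .groupBy(identity)
      .toSeq
      .sortBy { case (_, hits) => -hits.size } // most-demanded nodes first
      .map(_._1)
      .take(numNewContainers)
  }
}
{code}

This keeps the old containers alive but, as said above, only yields a locally
optimal placement for the new request, not a global optimum.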

Sorry for my immature consideration; I will rethink my design and improve it.

> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Priority: Critical
>
> Currently, achieving data locality in Spark is difficult unless an 
> application takes resources on every node in the cluster.  
> preferredNodeLocalityData provides a sort of hacky workaround that has been 
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to 
> demand from the application.  When this occurs, it would be useful to look at 
> the pending tasks and communicate their location preferences to the cluster 
> resource manager. 


