[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558018#comment-14558018 ]

Saisai Shao commented on SPARK-4352:
------------------------------------

Hi [~sandyr], my current solution for launching executors is to cover the 
preferred locations of the pending tasks as broadly as possible; this is a 
little different from the old code.

The locality preferences will be maintained in ApplicationMaster. Each preferred 
node is initialized as *false*, meaning there is currently no executor on that 
node; once an executor is launched there, the host is marked *true*. This acts 
as a hint for the ContainerRequest to avoid reassigning containers to nodes that 
already have executors and instead assign them to uncovered localities where 
possible. The node-locality preferences of newly submitted tasks are also sent 
to ApplicationMaster at runtime, which updates and merges them into the existing 
state (a node that already has executors will not be reset to *false*, so no new 
container is requested for it). When a container is killed or completes, its 
locality entry is removed, meaning that preference is no longer needed.
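To make the bookkeeping concrete, here is a minimal sketch of that host-coverage 
state in Scala. The class and method names (LocalityCoverageTracker, 
addPreferredHosts, markCovered, uncoveredHosts) are hypothetical, not the actual 
ApplicationMaster fields; it only illustrates the merge/update rules described 
above.

{code}
import scala.collection.mutable

// Hypothetical sketch: host -> true if an executor is already running there,
// false if the locality preference is still uncovered.
class LocalityCoverageTracker {
  private val hostCovered = mutable.HashMap.empty[String, Boolean]

  // Merge node localities reported for newly submitted tasks; a host that
  // already has an executor keeps its covered state instead of being reset.
  def addPreferredHosts(hosts: Seq[String]): Unit =
    hosts.foreach(h => hostCovered.getOrElseUpdate(h, false))

  // Called when an executor is launched on a host: the locality is now covered.
  def markCovered(host: String): Unit = hostCovered(host) = true

  // Called when a container is killed or completes: the preference is dropped.
  def remove(host: String): Unit = hostCovered -= host

  // Hosts that still need a container to cover their locality preference.
  def uncoveredHosts: Seq[String] =
    hostCovered.collect { case (host, false) => host }.toSeq
}
{code}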

In short, the algorithm tries to cover the preferred node localities as broadly 
as possible: when requesting new containers dynamically, it avoids nodes that 
already have an executor and prefers localities that are not yet covered (see 
the sketch after the pros and cons below). This approach has some pros and cons:

Pros:
1. In the default YARN mode (dynamic allocation disabled, executor number 
specified, no preferred node locations), this algorithm has no impact on the old 
logic; the behavior is exactly the same.
2. With dynamic allocation enabled and no preferred node locations, this 
algorithm also keeps the same semantics as the previous code.
3. With dynamic allocation enabled and preferred node locations present, this 
algorithm tries to cover as wide a range of localities as possible.

Cons:
1. This algorithm does not consider the task-count distribution: if some nodes 
have more tasks than others, it will not assign more containers to those nodes 
(preferred-locality coverage is the first priority).
2. While dynamic allocation is running, this algorithm will not kill a container 
in order to reassign it for better locality, so in the extreme case no locality 
will be matched at all if the current container count is already satisfied.
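
As a rough illustration of the dynamic-request side, the sketch below builds 
node-local requests for hosts not yet covered by the tracker above and plain 
requests for the remainder, using YARN's AMRMClient.ContainerRequest. The 
buildRequests method and its parameters are illustrative only, not the actual 
patch.

{code}
import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// Hypothetical sketch: prefer uncovered hosts for new container requests.
// With this constructor relaxLocality defaults to true, so YARN may still
// place a container elsewhere if the preferred host is unavailable.
def buildRequests(
    numContainers: Int,
    tracker: LocalityCoverageTracker,
    resource: Resource,
    priority: Priority): Seq[ContainerRequest] = {
  val preferredHosts = tracker.uncoveredHosts.take(numContainers)
  val nodeLocal = preferredHosts.map { host =>
    new ContainerRequest(resource, Array(host), null, priority)
  }
  val anyHost = Seq.fill(numContainers - nodeLocal.size) {
    new ContainerRequest(resource, null, null, priority)
  }
  nodeLocal ++ anyHost
}
{code}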

So what is your suggestion?

> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Priority: Critical
>
> Currently, achieving data locality in Spark is difficult unless an 
> application takes resources on every node in the cluster.  
> preferredNodeLocalityData provides a sort of hacky workaround that has been 
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to 
> demand from the application.  When this occurs, it would be useful to look at 
> the pending tasks and communicate their location preferences to the cluster 
> resource manager. 


