[ 
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050951#comment-14050951
 ] 

Rui Li commented on SPARK-2277:
-------------------------------

Suppose task1 prefers node1, but node1 is not available at the moment. However, 
we know node1 is on rack1, which makes task1 prefer rack1 for RACK_LOCAL 
locality. The problem is that we don't know whether there is any alive host on 
rack1, so we cannot check the availability of this preference.
Please let me know if I have misunderstood anything :)
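
To make the scenario concrete, here is a minimal sketch of the kind of
bookkeeping being discussed. It is only an illustration under assumptions:
RackTracker, addHost, removeHost, hasHostAliveOnRack, and the rackOf lookup
are hypothetical names chosen for this example, not existing Spark APIs.

```scala
import scala.collection.mutable

// Hypothetical helper (not an actual Spark class): remembers which alive
// hosts sit on each rack so a rack preference can be checked.
// rackOf is an assumed lookup from a host name to its rack, if one is known.
class RackTracker(rackOf: String => Option[String]) {
  // rack -> set of hosts on that rack that are currently alive
  private val hostsByRack =
    mutable.HashMap.empty[String, mutable.HashSet[String]]

  // Record a host as alive. Hosts with no known rack are ignored.
  def addHost(host: String): Unit =
    rackOf(host).foreach { rack =>
      hostsByRack.getOrElseUpdate(rack, mutable.HashSet.empty[String]) += host
    }

  // Forget a host, dropping its rack entry once no alive host remains on it.
  def removeHost(host: String): Unit =
    rackOf(host).foreach { rack =>
      hostsByRack.get(rack).foreach { hosts =>
        hosts -= host
        if (hosts.isEmpty) hostsByRack -= rack
      }
    }

  // True only if at least one alive host is registered on the rack.
  def hasHostAliveOnRack(rack: String): Boolean = hostsByRack.contains(rack)
}
```

With such a tracker, a rack would count as available for task1 only while
some host on rack1 is registered as alive, rather than whenever a rack is
merely defined for node1.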

> Make TaskScheduler track whether there's host on a rack
> -------------------------------------------------------
>
>                 Key: SPARK-2277
>                 URL: https://issues.apache.org/jira/browse/SPARK-2277
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Rui Li
>
> When TaskSetManager adds a pending task, it checks whether the task's 
> preferred location is available. For a RACK_LOCAL task, we consider the 
> preferred rack available if such a rack is defined for the preferred host. 
> This is incorrect, as there may be no alive hosts on that rack at all. 
> Therefore, TaskScheduler should track the hosts on each rack and provide an 
> API for TaskSetManager to check whether any host is alive on a specific rack.



