[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052103#comment-14052103 ]
Rui Li commented on SPARK-2277:
-------------------------------

With [PR #892|https://github.com/apache/spark/pull/892], we check whether a task's preferred location is available when adding the task to the pending lists. TaskScheduler tracks information about executors and hosts, so that TaskSetManager can check whether the preferred executor/host is available.

TaskScheduler also provides getRackForHost to get the corresponding rack for a host (it currently only returns None). I think this reflects previously acquired knowledge of the cluster topology, which does not indicate whether any host on that rack has been granted to this Spark app. Therefore we cannot tell whether the preferred rack is available.

> Make TaskScheduler track whether there's host on a rack
> -------------------------------------------------------
>
>                 Key: SPARK-2277
>                 URL: https://issues.apache.org/jira/browse/SPARK-2277
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Rui Li
>
> When TaskSetManager adds a pending task, it checks whether the task's
> preferred location is available. For a RACK_LOCAL task, we consider the
> preferred rack available if such a rack is defined for the preferred host.
> This is incorrect, as there may be no alive hosts on that rack at all.
> Therefore, TaskScheduler should track the hosts on each rack and provide an
> API for TaskSetManager to check whether any host is alive on a specific rack.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
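
The tracking proposed in this issue could look roughly like the following. This is a minimal sketch, not Spark's actual implementation: the class RackTracker and the methods addHost, removeHost and hasHostAliveOnRack are hypothetical names chosen for illustration, and getRackForHost is passed in as a plain function that is assumed to return the rack for a host when the topology is known.

```scala
import scala.collection.mutable

// Hypothetical helper for illustration only; not part of Spark's API.
// getRackForHost is assumed to map a host to its rack, or None if unknown.
class RackTracker(getRackForHost: String => Option[String]) {

  // rack -> hosts on that rack that are currently alive for this application
  private val hostsByRack =
    mutable.HashMap.empty[String, mutable.HashSet[String]]

  /** Record that a host has become available to this application. */
  def addHost(host: String): Unit = {
    for (rack <- getRackForHost(host)) {
      hostsByRack.getOrElseUpdate(rack, mutable.HashSet.empty[String]) += host
    }
  }

  /** Record that a host is no longer available to this application. */
  def removeHost(host: String): Unit = {
    for (rack <- getRackForHost(host); hosts <- hostsByRack.get(rack)) {
      hosts -= host
      // Drop the rack entry once its last alive host is gone.
      if (hosts.isEmpty) hostsByRack -= rack
    }
  }

  /** True only if at least one alive host is known on the given rack. */
  def hasHostAliveOnRack(rack: String): Boolean = hostsByRack.contains(rack)
}
```

With such a structure, a rack that is defined in the topology but has no host granted to the application would be reported as unavailable, which is the distinction the issue description asks for.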