[ https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050951#comment-14050951 ]
Rui Li commented on SPARK-2277:
-------------------------------

Suppose task1 prefers node1, but node1 is not available at the moment. However, we know node1 is on rack1, which makes task1 prefer rack1 for RACK_LOCAL locality. The problem is that we don't know whether there is an alive host on rack1, so we cannot check the availability of this preference.
Please let me know if I have misunderstood anything :)

> Make TaskScheduler track whether there's host on a rack
> -------------------------------------------------------
>
>                 Key: SPARK-2277
>                 URL: https://issues.apache.org/jira/browse/SPARK-2277
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Rui Li
>
> When TaskSetManager adds a pending task, it checks whether the task's
> preferred location is available. For a RACK_LOCAL task, we consider the
> preferred rack available if such a rack is defined for the preferred host.
> This is incorrect, as there may be no alive hosts on that rack at all.
> Therefore, TaskScheduler should track the hosts on each rack and provide an
> API for TaskSetManager to check whether there's a host alive on a specific rack.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
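
To make the proposal concrete, here is a minimal sketch of the bookkeeping it asks for: a map from rack to the set of alive hosts on it, plus a query the task set manager could call. It is written in Python for brevity and is not Spark code. The names RackTracker, host_added, host_lost, has_host_alive_on_rack and the rack_for_host lookup are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional, Set


class RackTracker:
    """Hypothetical helper: tracks which alive hosts sit on each rack."""

    def __init__(self, rack_for_host: Callable[[str], Optional[str]]) -> None:
        # rack_for_host is an assumed topology lookup; it returns None
        # when no rack is known for the host.
        self._rack_for_host = rack_for_host
        self._hosts_by_rack: Dict[str, Set[str]] = defaultdict(set)

    def host_added(self, host: str) -> None:
        """Record that a host has become alive."""
        rack = self._rack_for_host(host)
        if rack is not None:
            self._hosts_by_rack[rack].add(host)

    def host_lost(self, host: str) -> None:
        """Record that a host is gone; drop the rack once it is empty."""
        rack = self._rack_for_host(host)
        if rack is None:
            return
        hosts = self._hosts_by_rack.get(rack)
        if hosts is not None:
            hosts.discard(host)
            if not hosts:
                del self._hosts_by_rack[rack]

    def has_host_alive_on_rack(self, rack: str) -> bool:
        """True only if at least one alive host is registered on the rack."""
        return rack in self._hosts_by_rack


# Usage: node1 is preferred but never registered; node2 keeps rack1 usable.
topology = {"node1": "rack1", "node2": "rack1", "node3": "rack2"}
tracker = RackTracker(topology.get)
tracker.host_added("node2")
print(tracker.has_host_alive_on_rack("rack1"))
tracker.host_lost("node2")
print(tracker.has_host_alive_on_rack("rack1"))
```

With such a query available, a rack preference derived from an unavailable host (task1 and node1 in the comment above) would count as available only while some other host on the same rack is alive, rather than whenever a rack is merely defined for the preferred host.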