[jira] [Commented] (SPARK-2294) TaskSchedulerImpl and TaskSetManager do not properly prioritize which tasks get assigned to an executor

2014-07-06 Thread Nan Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053329#comment-14053329
 ] 

Nan Zhu commented on SPARK-2294:


PR: https://github.com/apache/spark/pull/1313

 TaskSchedulerImpl and TaskSetManager do not properly prioritize which tasks 
 get assigned to an executor
 ---

 Key: SPARK-2294
 URL: https://issues.apache.org/jira/browse/SPARK-2294
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.0.0, 1.0.1
Reporter: Kay Ousterhout
Assignee: Nan Zhu

 If an executor E is free, a task may be speculatively assigned to E when 
 there are other tasks in the job that have not been launched (at all) yet.  
 Similarly, a task without any locality preferences may be assigned to E when 
 there was another NODE_LOCAL task that could have been scheduled. 
 This happens because TaskSchedulerImpl calls TaskSetManager.resourceOffer 
 (which in turn calls TaskSetManager.findTask) with increasing locality 
 levels, beginning with PROCESS_LOCAL, followed by NODE_LOCAL, and so on until 
 the highest currently allowed level.  Now, suppose NODE_LOCAL is the highest 
 currently allowed locality level.  The first time findTask is called, it will 
 be called with max level PROCESS_LOCAL; if it cannot find any PROCESS_LOCAL 
 tasks, it will try to schedule tasks with no locality preferences or 
 speculative tasks.  As a result, speculative tasks or tasks with no 
 preferences may be scheduled instead of NODE_LOCAL tasks.
 cc [~matei]
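
 To make the ordering problem concrete, here is a minimal, self-contained
 Scala sketch of the behavior described above. TaskLocality, Pending and
 findTask here are simplified stand-ins for the TaskSchedulerImpl /
 TaskSetManager internals, not the actual Spark code; the point is only
 that the no-preference / speculative bucket is consulted as a fallback at
 every maxLocality level, including the first PROCESS_LOCAL pass.

    object LocalityLoopSketch {

      object TaskLocality extends Enumeration {
        val PROCESS_LOCAL, NODE_LOCAL, NO_PREF, RACK_LOCAL, ANY = Value
      }
      import TaskLocality._

      // Pending work, bucketed the way the issue describes it.
      case class Pending(
          processLocal: List[Int],
          nodeLocal: List[Int],
          noPrefOrSpeculative: List[Int])

      // findTask with a maximum allowed locality level. The bug: even when
      // maxLocality is PROCESS_LOCAL, the no-pref / speculative bucket is
      // used as a fallback, so it can win over a NODE_LOCAL task that a
      // later pass with a higher maxLocality would have found.
      def findTask(
          pending: Pending,
          maxLocality: TaskLocality.Value): Option[(Int, TaskLocality.Value)] = {
        pending.processLocal.headOption.map(t => (t, PROCESS_LOCAL))
          .orElse(pending.noPrefOrSpeculative.headOption.map(t => (t, NO_PREF)))
          .orElse {
            if (maxLocality >= NODE_LOCAL)
              pending.nodeLocal.headOption.map(t => (t, NODE_LOCAL))
            else None
          }
      }

      def main(args: Array[String]): Unit = {
        val pending = Pending(
          processLocal = Nil,
          nodeLocal = List(7),
          noPrefOrSpeculative = List(42))
        // The scheduler offers the free executor with increasing maxLocality,
        // starting at PROCESS_LOCAL; already the first call returns the
        // no-preference task 42 instead of the NODE_LOCAL task 7.
        for (level <- Seq(PROCESS_LOCAL, NODE_LOCAL)) {
          println(s"maxLocality=$level -> ${findTask(pending, level)}")
        }
      }
    }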



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (SPARK-2294) TaskSchedulerImpl and TaskSetManager do not properly prioritize which tasks get assigned to an executor

2014-06-26 Thread Mridul Muralidharan (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045433#comment-14045433
 ] 

Mridul Muralidharan commented on SPARK-2294:


I agree; we should bump no-locality-preference and speculative tasks to the 
NODE_LOCAL level, consider them only after NODE_LOCAL tasks have been 
scheduled (if any are available), and not check for them at PROCESS_LOCAL max 
locality. That way they get scheduled before RACK_LOCAL but after NODE_LOCAL.
This is an artifact of the original design, from when there was no 
PROCESS_LOCAL and NODE_LOCAL was the best schedule possible (without 
explicitly having these levels: we had node and any).
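
Here is a minimal sketch of the ordering suggested above, reusing the
illustrative Pending and TaskLocality types from the sketch earlier in this
thread; again these are stand-ins, not the actual TaskSetManager code.

    // Illustrative only: same simplified types as the earlier sketch.
    def findTaskFixed(
        pending: Pending,
        maxLocality: TaskLocality.Value): Option[(Int, TaskLocality.Value)] = {
      pending.processLocal.headOption.map(t => (t, PROCESS_LOCAL))
        .orElse {
          if (maxLocality >= NODE_LOCAL)
            pending.nodeLocal.headOption.map(t => (t, NODE_LOCAL))
          else None
        }
        .orElse {
          // Bumped to NODE_LOCAL level: after node-local tasks,
          // before RACK_LOCAL, never at a PROCESS_LOCAL-only offer.
          if (maxLocality >= NODE_LOCAL)
            pending.noPrefOrSpeculative.headOption.map(t => (t, NO_PREF))
          else None
        }
    }

With this ordering, an offer capped at PROCESS_LOCAL returns nothing when only
no-preference or NODE_LOCAL work is pending, and the next pass at NODE_LOCAL
prefers the node-local task over the no-preference / speculative one.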




--
This message was sent by Atlassian JIRA
(v6.2#6252)