pgandhi999 commented on a change in pull request #23677: [SPARK-26755][SCHEDULER] : Optimize Spark Scheduler to dequeue speculative tasks…
URL: https://github.com/apache/spark/pull/23677#discussion_r303476604
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
 ##########
 @@ -143,25 +144,18 @@ private[spark] class TaskSetManager(
   // of failures.
   // Duplicates are handled in dequeueTaskFromList, which ensures that a
   // task hasn't already started running before launching it.
-  private val pendingTasksForExecutor = new HashMap[String, ArrayBuffer[Int]]
 
-  // Set of pending tasks for each host. Similar to pendingTasksForExecutor,
-  // but at host level.
-  private val pendingTasksForHost = new HashMap[String, ArrayBuffer[Int]]
-
-  // Set of pending tasks for each rack -- similar to the above.
-  private val pendingTasksForRack = new HashMap[String, ArrayBuffer[Int]]
-
-  // Set containing pending tasks with no locality preferences.
-  private[scheduler] var pendingTasksWithNoPrefs = new ArrayBuffer[Int]
-
-  // Set containing all pending tasks (also used as a stack, as above).
-  private val allPendingTasks = new ArrayBuffer[Int]
+  private[scheduler] val pendingTasks = new PendingTasksByLocality()
 
   // Tasks that can be speculated. Since these will be a small fraction of total
 -  // tasks, we'll just hold them in a HashSet.
 +  // tasks, we'll just hold them in a HashSet. The HashSet here ensures that we do not add
 +  // duplicate speculative tasks.
   private[scheduler] val speculatableTasks = new HashSet[Int]
 
 Review comment:
   @Ngone51 that is a valid question indeed. This change was motivated by some jobs we saw on the cluster in which speculation was kicking in heavily because the cluster, during those periods, **was healthy but was really busy**. As documented in the findings at https://issues.apache.org/jira/browse/SPARK-26755, the task-result-getter threads were more often than not in the BLOCKED state and job progress was slow.
   The HashSet `speculatableTasks` was still holding only a small fraction of the tasks (at most 10 to 15%), but every call to `dequeueSpeculativeTask` could end up looping through every task five separate times (once per locality level) in the worst case, leading to unnecessary traversal and thread blocking.
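   To make that worst case concrete, here is a minimal, self-contained sketch of the old dequeue pattern. It is not the actual `TaskSetManager` code: the `Locality` enum stands in for Spark's `TaskLocality`, and the `canRunOn*` / `hasNoPrefs` predicates are hypothetical placeholders for the real locality checks. The point is only that a single offer re-scans the entire `speculatableTasks` set once per allowed locality level:

```scala
import scala.collection.mutable.HashSet

// Minimal stand-in for Spark's TaskLocality levels, just for this sketch.
object Locality extends Enumeration {
  val PROCESS_LOCAL, NODE_LOCAL, NO_PREF, RACK_LOCAL, ANY = Value
}

object SpeculativeScanSketch {
  val speculatableTasks = new HashSet[Int]

  // Hypothetical placeholders for the real locality checks.
  def canRunOnExecutor(index: Int, execId: String): Boolean = false
  def canRunOnHost(index: Int, host: String): Boolean = false
  def hasNoPrefs(index: Int): Boolean = false
  def canRunOnRack(index: Int, host: String): Boolean = false

  // A single resource offer can walk the whole HashSet once per locality
  // level -- up to five full passes in the worst case.
  def dequeueSpeculativeTaskSketch(execId: String, host: String): Option[(Int, Locality.Value)] =
    speculatableTasks.find(canRunOnExecutor(_, execId))        // pass 1: PROCESS_LOCAL
      .map(i => (i, Locality.PROCESS_LOCAL))
      .orElse(speculatableTasks.find(canRunOnHost(_, host))    // pass 2: NODE_LOCAL
        .map(i => (i, Locality.NODE_LOCAL)))
      .orElse(speculatableTasks.find(hasNoPrefs)               // pass 3: NO_PREF
        .map(i => (i, Locality.NO_PREF)))
      .orElse(speculatableTasks.find(canRunOnRack(_, host))    // pass 4: RACK_LOCAL
        .map(i => (i, Locality.RACK_LOCAL)))
      .orElse(speculatableTasks.headOption                     // pass 5: ANY
        .map(i => (i, Locality.ANY)))
}
```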
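   For context on the diff above, the refactor folds the five standalone collections into a single locality-keyed container. A rough sketch of the idea is below; the field names are illustrative and may not match the PR exactly (the access modifier is also dropped so the sketch compiles standalone):

```scala
import scala.collection.mutable.{ArrayBuffer, HashMap}

// Groups the former pendingTasksForExecutor/Host/Rack, pendingTasksWithNoPrefs
// and allPendingTasks into one structure, keyed by locality.
class PendingTasksByLocality {
  val forExecutor = new HashMap[String, ArrayBuffer[Int]]
  val forHost = new HashMap[String, ArrayBuffer[Int]]
  val noPrefs = new ArrayBuffer[Int]
  val forRack = new HashMap[String, ArrayBuffer[Int]]
  val all = new ArrayBuffer[Int]
}
```

   The intent is that, with pending tasks bucketed this way, an offer for a given executor/host only inspects the buckets relevant to that offer rather than traversing every task for each locality level.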
