LantaoJin commented on a change in pull request #23951: 
[SPARK-27038][CORE][YARN] Re-implement RackResolver to reduce resolving time
URL: https://github.com/apache/spark/pull/23951#discussion_r263670931
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
 ##########
 @@ -184,11 +184,23 @@ private[spark] class TaskSetManager(
     t.epoch = epoch
   }
 
+  // An array to store preferred location and its task index
+  private val locationWithTaskIndex: ArrayBuffer[(String, Int)] = new ArrayBuffer[(String, Int)]()
+  private val addTaskStartTime = System.nanoTime()
   // Add all our tasks to the pending lists. We do this in reverse order
   // of task index so that tasks with low indices get launched first.
   for (i <- (0 until numTasks).reverse) {
-    addPendingTask(i)
+    addPendingTask(i, true)
   }
+  // Convert preferred location list to rack list in one invocation and zip with the origin index
+  private val rackWithTaskIndex = sched.getRacksForHosts(locationWithTaskIndex.map(_._1).toList)
 
 Review comment:
   > The de-duping thing is minor, but I am concerned that the 
`locationWithTaskIndex` variable is going to be confusing if it's left around as 
a private member variable, even though it's only meaningful in this limited 
context.
   
   Yes, I don't want to do any de-duping here. I will refactor this part to 
make it more readable.
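
   For illustration only, here is a minimal sketch of how the batch lookup could 
keep the buffer local to one method instead of leaving it as a private member. 
It is not the code in this PR: `RackBatchSketch`, `pendingTasksByRack`, and the 
stubbed `getRacksForHosts` (including its `Seq[Option[String]]` return type and 
the made-up rack names) are assumptions made so the example runs on its own.

```scala
import scala.collection.mutable.ArrayBuffer

object RackBatchSketch {
  // Hypothetical stand-in for sched.getRacksForHosts: a single call
  // resolves every host. The rack names here are invented for the example.
  def getRacksForHosts(hosts: Seq[String]): Seq[Option[String]] =
    hosts.map {
      case h if h.startsWith("a") => Some("/rack-a")
      case h if h.startsWith("b") => Some("/rack-b")
      case _ => None // host with no known rack
    }

  // preferredHosts(i) holds the preferred hosts of task i.
  // Returns, for each rack, the indices of the tasks that prefer it.
  def pendingTasksByRack(preferredHosts: IndexedSeq[Seq[String]]): Map[String, Seq[Int]] = {
    // Local buffer: it only has meaning while the pending lists are built.
    val locationWithTaskIndex = new ArrayBuffer[(String, Int)]()
    // Reverse order, mirroring the loop in the diff above.
    for (i <- preferredHosts.indices.reverse; host <- preferredHosts(i)) {
      locationWithTaskIndex += ((host, i))
    }
    // One invocation for all hosts, then zip the racks back with the indices.
    val racks = getRacksForHosts(locationWithTaskIndex.map(_._1).toList)
    racks.zip(locationWithTaskIndex.map(_._2))
      .collect { case (Some(rack), index) => (rack, index) }
      .groupBy(_._1)
      .map { case (rack, pairs) => rack -> pairs.map(_._2) }
  }
}
```

   Because the zip relies on the resolver returning one entry per input host in 
the same order, hosts are deliberately not de-duplicated before the call.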
   
