Ngone51 commented on code in PR #36162:
URL: https://github.com/apache/spark/pull/36162#discussion_r885645567


##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -853,8 +857,11 @@ private[spark] class TaskSchedulerImpl(
     // (taskId, stageId, stageAttemptId, accumUpdates)
     val accumUpdatesWithTaskIds: Array[(Long, Int, Int, Seq[AccumulableInfo])] = {
       accumUpdates.flatMap { case (id, updates) =>
-        val accInfos = updates.map(acc => acc.toInfo(Some(acc.value), None))
         Option(taskIdToTaskSetManager.get(id)).map { taskSetMgr =>
+          val (accInfos, taskProgressRate) = getTaskAccumulableInfosAndProgressRate(updates)

Review Comment:
   I'm a bit worried about the scheduler's throughput if our concerns about the efficiency of traversing the accumulators are valid. I still think we could do the traversal only inside the speculation thread, to decouple it from the scheduling thread. Moving this work to the speculation thread would also avoid unnecessary traversals, since they are only needed when `checkSpeculatableTasks` requires them, whereas the current implementation traverses on every heartbeat update and every successful task completion.
   
   
   If we move it to the speculation thread, the implementation could also be a bit simpler. In `TaskSchedulerImpl.executorHeartbeatReceived()`, we would only need to set `_accumulables`. And in fact we don't need to set `_accumulables` ourselves, since that is already covered by `DAGScheduler.updateAccumulators()`. Then we would only need to focus on the calculation/traversal in `InefficientTaskCalculator`. The first-time traversal might be a bit slow, but we can cache the records/runtime for finished tasks, or the progress rate for running tasks. And even if it is slow, I think that is still better than slowing down the scheduling threads.
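   
   To illustrate the caching idea, here is a minimal sketch (hypothetical names, not the actual PR code): a speculation-thread-side cache keyed by task ID, where the expensive accumulator scan is skipped for tasks whose final rate is already cached, and only recomputed on demand for running tasks:
   
   ```scala
   import scala.collection.mutable
   
   // Hypothetical sketch of caching progress rates in the speculation-thread
   // calculator, so the accumulator traversal only happens when
   // checkSpeculatableTasks() asks for a rate, not on every heartbeat.
   object ProgressRateCacheSketch {
     // taskId -> last computed progress rate (records per millisecond)
     private val cache = mutable.Map.empty[Long, Double]
   
     // `traverse` stands in for the expensive accumulator scan and returns
     // (recordsProcessed, runtimeMs). Finished tasks keep their final rate;
     // running tasks are re-scanned and the cache entry is refreshed.
     def progressRate(taskId: Long, finished: Boolean,
                      traverse: () => (Long, Long)): Double = {
       if (finished && cache.contains(taskId)) {
         cache(taskId)
       } else {
         val (records, runtimeMs) = traverse()
         val rate = if (runtimeMs > 0) records.toDouble / runtimeMs else 0.0
         cache(taskId) = rate
         rate
       }
     }
   }
   ```
   
   The point of the sketch is only that repeat lookups for finished tasks never re-trigger the traversal, which bounds the first-time cost mentioned above.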
   
   @weixiuli @mridulm WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

