mridulm commented on code in PR #44321:
URL: https://github.com/apache/spark/pull/44321#discussion_r1430656022


##########
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala:
##########
@@ -1086,7 +1133,7 @@ private[spark] class TaskSetManager(
           addPendingTask(index)
           // Tell the DAGScheduler that this task was resubmitted so that it 
doesn't think our
           // stage finishes when a total of tasks.size tasks finish.
-          sched.dagScheduler.taskEnded(
+          emptyTaskInfoAccumulablesAndNotifyDagScheduler(tid,
             tasks(index), Resubmitted, null, Seq.empty, Array.empty, info)

Review Comment:
   Yes, what @JoshRosen [detailed for 
executorLost](https://github.com/apache/spark/pull/44321#pullrequestreview-1785137821)
 is what I am referring to - had missed that comment, thanks for the analysis 
Josh !
   
   Fixing memory issue is definitely something we should aspire for, but IMO 
not at the cost of breaking interfaces and existing user code - in this case, 
`Resubmitted` has existed for 8+ years; and has been exposed to developer's.
   How commonly this is triggered depends on stability of the cluster infra - 
which might be different for different users/deployments :-)
   
   In this specific case, unfortunately it impacts not just end user code 
(which we dont know is using it: and I have seen fairly involved usage of these 
api's, including writing some of my own in the past) - but also our spark UI - 
for example `AppStatusListener.onTaskEnd` when `Resubmitted` is fired.
   
   If we want to relook at the behavior of how `Resubmitted` is handled in 
Spark, we should do that decoupled from a performance discussion - it is 
possible that some of this has evolved in "interesting ways" since initially 
designed, and requires to be revisited.
   For this PR, which is to improve memory utilization, it should not have 
unintended side effects IMO.
   
   As I mentioned before, at a minimum, this should not become default.
   Once we relook/redesign `Resubmitted`, we can flip it - if required.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to