Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/16867#discussion_r104508618 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -740,6 +743,7 @@ private[spark] class TaskSetManager( } removeRunningTask(tid) info.markFinished(state) + successfulTaskDurations.remove(taskInfos(tid).duration) --- End diff -- I wouldn't bother with this, because the remove method is complicated (so you can eliminate it without this line of code), and the median is good because it's not very sensitive to outlier values, so I think it's fine to leave failed tasks in. Also, I'm not sure this code is correct, because it's possible that the duration hadn't yet been added to successfulTaskDurations, in which case you could be removing the duration that happens to be the same, but actually corresponds to a successful task.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org