Josh Rosen created SPARK-18553:
----------------------------------

             Summary: Executor loss may cause TaskSetManager to be leaked
                 Key: SPARK-18553
                 URL: https://issues.apache.org/jira/browse/SPARK-18553
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 2.0.0, 1.6.0, 2.1.0
            Reporter: Josh Rosen
            Assignee: Josh Rosen
            Priority: Blocker

Due to a bug in TaskSchedulerImpl, the sudden and complete loss of an executor may cause a TaskSetManager to be leaked. The leaked TaskSetManager keeps ShuffleDependencies and other data structures alive indefinitely, which leads to various types of resource leaks (including shuffle file leaks).

In a nutshell, the problem is that TaskSchedulerImpl did not maintain its own mapping from executorId to running task ids, leaving it unable to clean up the taskId to taskSetManager maps when an executor is totally lost.
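
To illustrate the bookkeeping problem described above, here is a toy model in Python. It is not Spark's code and not the actual fix: the class `ToyScheduler`, its methods, and the `track_tasks_per_executor` flag are all hypothetical names chosen for this sketch. It only shows why a per-executor set of running task ids is needed to clean up a map keyed by task id when a whole executor disappears.

```python
from collections import defaultdict


class ToyScheduler:
    """Toy stand-in for the scheduler bookkeeping; all names are hypothetical."""

    def __init__(self, track_tasks_per_executor):
        self.track_tasks_per_executor = track_tasks_per_executor
        # taskId -> TaskSetManager stand-in (the map that can leak entries)
        self.task_id_to_task_set_manager = {}
        # executorId -> ids of tasks currently running there (the missing mapping)
        self.executor_id_to_running_task_ids = defaultdict(set)

    def launch_task(self, task_id, executor_id, task_set_manager):
        self.task_id_to_task_set_manager[task_id] = task_set_manager
        if self.track_tasks_per_executor:
            self.executor_id_to_running_task_ids[executor_id].add(task_id)

    def executor_lost(self, executor_id):
        # Without the per-executor mapping there is nothing to iterate over,
        # so the lost executor's task entries stay in the map forever.
        lost_task_ids = self.executor_id_to_running_task_ids.pop(executor_id, set())
        for task_id in lost_task_ids:
            self.task_id_to_task_set_manager.pop(task_id, None)


leaky = ToyScheduler(track_tasks_per_executor=False)
fixed = ToyScheduler(track_tasks_per_executor=True)
for scheduler in (leaky, fixed):
    manager = object()  # stand-in for a TaskSetManager
    scheduler.launch_task(1, "exec-1", manager)
    scheduler.launch_task(2, "exec-1", manager)
    scheduler.executor_lost("exec-1")

print(len(leaky.task_id_to_task_set_manager))  # entries left without the mapping
print(len(fixed.task_id_to_task_set_manager))  # entries left with the mapping
```

In the leaky variant the map still references the manager object after the executor is lost, which is the analogue of the TaskSetManager (and everything it references) being kept alive.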