Tao Li created SPARK-6737: ----------------------------- Summary: OutputCommitCoordinator.authorizedCommittersByStage map out of memory Key: SPARK-6737 URL: https://issues.apache.org/jira/browse/SPARK-6737 Project: Spark Issue Type: Bug Components: Spark Core, Streaming Affects Versions: 1.3.0 Environment: spark 1.3.1 Reporter: Tao Li Priority: Critical
I am using spark streaming(1.3.1) as a long time running service and out of memory after running for 7 weeks. I found that the field authorizedCommittersByStage in OutputCommitCoordinator class cause the OOM. authorizedCommittersByStage is a map, key is StageId, value is Map[PartitionId, TaskAttemptId]. The OutputCommitCoordinator class has a method stageEnd which will remove stageId from authorizedCommittersByStage. But the method stageEnd is never called by DAGSchedule. And it cause the authorizedCommittersByStage's stage info never be cleaned, which cause OOM. It happens in my spark streaming program(1.3.1), I am not sure if it will appear in other spark components and other spark version. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org