Lijie Xu created SPARK-17325:
--------------------------------

             Summary: Inconsistent Spillable threshold and AppendOnlyMap growing threshold may trigger out-of-memory errors
                 Key: SPARK-17325
                 URL: https://issues.apache.org/jira/browse/SPARK-17325
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, Spark Core
    Affects Versions: 2.0.0, 1.6.2
            Reporter: Lijie Xu
While reading the shuffle source code, I came to suspect a potential out-of-memory error in ExternalSorter. The problem is that the memory usage of the AppendOnlyMap (i.e., PartitionedAppendOnlyMap in ExternalSorter) can greatly exceed its spillable threshold: `currentMemory` can reach 2 times `myMemoryThreshold` in `Spillable.maybeSpill()`. This means the task's current execution memory usage (the AppendOnlyMap) can greatly exceed its defined execution memory limit ((1 - spark.memory.storageFraction) * 1 / #taskNum), which can lead to out-of-memory errors.

Example: suppose the spillable threshold has grown to 250MB while the AppendOnlyMap occupies 200MB. An incoming key/value record then finds the AppendOnlyMap full and triggers a size expansion. After expansion, the AppendOnlyMap may occupy 400MB (or slightly less), which is far larger than both the spillable threshold and the execution memory limit.
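To make the timing concrete, here is a minimal, self-contained sketch of the interaction. The object name `SpillThresholdSketch` and the simplified `maybeSpill` below are mine, for illustration only; they mirror `Spillable.maybeSpill()` and AppendOnlyMap's doubling growth but are not the actual Spark code paths.

```scala
// Sketch of the race between the spill check and the map's doubling growth.
// Deliberately simplified; not the actual Spark implementation.
object SpillThresholdSketch {

  def main(args: Array[String]): Unit = {
    val mb = 1024L * 1024
    val myMemoryThreshold = 250 * mb // spillable threshold, already grown to 250MB
    var mapSize           = 200 * mb // estimated size of the AppendOnlyMap

    // The spill check compares the map's current size against the threshold.
    def maybeSpill(currentMemory: Long): Boolean = currentMemory >= myMemoryThreshold

    // (1) Check before inserting the next record: 200MB < 250MB, so no spill.
    assert(!maybeSpill(mapSize))

    // (2) The incoming record finds the hash table full, so the map doubles
    //     its backing array, roughly doubling its memory footprint.
    mapSize *= 2 // now ~400MB

    // (3) The map now holds ~400MB against a 250MB threshold. The overshoot
    //     happens between spill checks, which is the OOM window described above.
    println(s"map = ${mapSize / mb}MB, threshold = ${myMemoryThreshold / mb}MB")
    assert(mapSize > myMemoryThreshold)
  }
}
```

The key point the sketch shows is that the spill decision is made against the map's size before the insert, while the growth that overshoots the threshold happens during the insert, so no check guards the doubled allocation.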