Lijie Xu created SPARK-17325:
--------------------------------

             Summary: Inconsistent Spillable threshold and AppendOnlyMap 
growing threshold may trigger out-of-memory errors
                 Key: SPARK-17325
                 URL: https://issues.apache.org/jira/browse/SPARK-17325
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, Spark Core
    Affects Versions: 2.0.0, 1.6.2
            Reporter: Lijie Xu


While reading the shuffle source code, I noticed what looks like a potential
out-of-memory error in ExternalSorter.

The problem is that the memory usage of the AppendOnlyMap (i.e., the
PartitionedAppendOnlyMap in ExternalSorter) can greatly exceed its spillable
threshold: `currentMemory` can reach about 2 times `myMemoryThreshold` in
`Spillable.maybeSpill()`, because the map grows by doubling its underlying
array and the spill check only sees the new size after that allocation has
already happened. As a result, the task's execution memory usage (dominated by
the AppendOnlyMap) can greatly exceed its per-task execution memory limit
(roughly (1 - spark.memory.storageFraction) * usable memory / #taskNum), which
may lead to out-of-memory errors.
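
For reference, here is a simplified paraphrase of the relevant check in
`Spillable.maybeSpill()` (field names follow the 1.6/2.0 source; logging and
other bookkeeping are omitted, so this is a sketch rather than the exact code).
The point is that the check runs against the size estimate passed in by the
caller, so any growth that happens while inserting a record is observed only
after the memory has already been allocated:

```scala
// Simplified paraphrase of Spillable.maybeSpill() (Spark 1.6/2.0).
// Logging and other bookkeeping are omitted; names follow the source.
protected def maybeSpill(collection: C, currentMemory: Long): Boolean = {
  var shouldSpill = false
  if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {
    // Try to claim enough execution memory to double the current estimated size.
    val amountToRequest = 2 * currentMemory - myMemoryThreshold
    val granted = acquireMemory(amountToRequest)
    myMemoryThreshold += granted
    // Spill only if the estimate still exceeds the (possibly enlarged) threshold.
    shouldSpill = currentMemory >= myMemoryThreshold
  }
  if (shouldSpill) {
    spill(collection)
    releaseMemory()
  }
  shouldSpill
}
```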


Example: Suppose the current spillable threshold has grown to 250 MB while the
AppendOnlyMap occupies 200 MB. At this point an incoming key/value record finds
the AppendOnlyMap full and triggers its size expansion. After the expansion,
the AppendOnlyMap may occupy about 400 MB (or slightly less), which is far
larger than both the spillable threshold and the execution memory limit.
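
A minimal, self-contained sketch of the arithmetic above (the numbers and
object/variable names are illustrative, not taken from a real run):

```scala
// Illustrative sketch of the overshoot: the map doubles its backing array
// before any spill check can react to the new size.
object OvershootExample {
  def main(args: Array[String]): Unit = {
    val mb = 1024L * 1024
    val myMemoryThreshold = 250 * mb   // spillable threshold already granted to this task
    var mapSize           = 200 * mb   // estimated size of the PartitionedAppendOnlyMap

    // mapSize < myMemoryThreshold, so maybeSpill() does nothing at this point.
    assert(mapSize < myMemoryThreshold)

    // The next record finds the map full; its expansion doubles the backing
    // array in one step, so ~400 MB is allocated on the heap immediately.
    mapSize *= 2

    // Only now can a spill check observe the new size, which is far beyond
    // both the threshold and the task's execution memory limit.
    println(s"map = ${mapSize / mb} MB, threshold = ${myMemoryThreshold / mb} MB, " +
      s"overshoot = ${(mapSize - myMemoryThreshold) / mb} MB")
  }
}
```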





