L. C. Hsieh created SPARK-30379:
-----------------------------------

             Summary: Avoid OOM when using collection accumulator
                 Key: SPARK-30379
                 URL: https://issues.apache.org/jira/browse/SPARK-30379
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: L. C. Hsieh
            Assignee: L. C. Hsieh


One Spark job on our cluster uses a collection accumulator to collect values and has hit an exception like:

```
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at java.lang.StringBuilder.append(StringBuilder.java:131)
    at java.util.AbstractCollection.toString(AbstractCollection.java:462)
    at java.util.Collections$UnmodifiableCollection.toString(Collections.java:1035)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2$$anonfun$apply$3.apply(LiveEntity.scala:596)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2$$anonfun$apply$3.apply(LiveEntity.scala:596)
    at scala.Option.map(Option.scala:146)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2.apply(LiveEntity.scala:596)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2.apply(LiveEntity.scala:591)
```
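
For context, here is a minimal sketch (an assumed reproduction, not the actual job) of the usage pattern that can trigger this: every element added on the executors is shipped back and kept in a `java.util.List` on the driver, and the live UI code later stringifies that entire list.

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.util.CollectionAccumulator

// Hypothetical reproduction: a collection accumulator that gathers one Long
// per record. The accumulated values live in a java.util.List on the driver,
// and the live UI later calls toString on that entire list.
object CollectionAccumulatorRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("collection-acc-repro").getOrCreate()
    val sc = spark.sparkContext

    val acc: CollectionAccumulator[Long] = sc.collectionAccumulator[Long]("ids")

    // With enough records, acc.value (and especially its string form) no
    // longer fits in a 6g driver heap.
    sc.parallelize(1L to 100000000L, numSlices = 200)
      .foreach(acc.add(_))

    println(s"collected ${acc.value.size()} elements")
    spark.stop()
  }
}
```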

`LiveEntityHelpers.newAccumulatorInfos` converts each `AccumulableInfo` to a 
`v1.AccumulableInfo` by calling `toString` on the accumulator's value. For a 
collection accumulator, the string representation can take far more memory 
than the collection itself, e.g. a collection accumulator of long values, and 
can trigger an OOM on the driver (in this job, the driver memory is 6g).
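
Since the problem is the unbounded `toString`, one possible mitigation (a sketch only, not necessarily the committed fix) is to cap the string form of a collection value before it is stored in the UI-facing `v1.AccumulableInfo`. `truncatedToString` and `maxLen` below are hypothetical names:

```
// Sketch of a guarded conversion: bound the string representation of an
// accumulator value so stringifying a huge collection cannot exhaust the
// driver heap. `truncatedToString` and `maxLen` are hypothetical names.
def truncatedToString(value: Any, maxLen: Int = 128): String = value match {
  case c: java.util.Collection[_] =>
    // Append at most ~maxLen characters instead of stringifying every element.
    val it = c.iterator()
    val sb = new StringBuilder("[")
    while (it.hasNext && sb.length < maxLen) {
      sb.append(it.next())
      if (it.hasNext) sb.append(", ")
    }
    if (it.hasNext) sb.append("...")
    sb.append("]").toString()
  case null  => "null"
  case other => other.toString
}
```

In `newAccumulatorInfos`, such a helper would wrap the `toString` call shown in the stack trace; an alternative is to skip or summarize collection accumulators entirely when building the UI data.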


