[ https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075117#comment-14075117 ]
Apache Spark commented on SPARK-2696:
-------------------------------------

User 'falaki' has created a pull request for this issue:
https://github.com/apache/spark/pull/1595

> Reduce default spark.serializer.objectStreamReset
> --------------------------------------------------
>
>                 Key: SPARK-2696
>                 URL: https://issues.apache.org/jira/browse/SPARK-2696
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Hossein Falaki
>              Labels: configuration
>             Fix For: 1.1.0
>
>
> The current default value of spark.serializer.objectStreamReset is 10,000.
> When re-partitioning a large file (e.g., to 64 partitions) whose records are
> about 1MB each, the serializer can cache up to 10000 x 1MB x 64 = 640 GB,
> which will cause it to run out of memory.
> We think 100 would be a more reasonable default value for this configuration
> parameter.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
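For context, the 640 GB figure in the report is the product of the reset interval, the record size, and the partition count: `spark.serializer.objectStreamReset` controls how many objects the JavaSerializer writes before calling `ObjectOutputStream.reset()`, and until that reset each stream retains back-references to every record it has written. A minimal sketch of that arithmetic (illustrative only; the function name below is ours, not part of Spark):

```python
def worst_case_cached_mb(reset_interval: int, record_mb: float, partitions: int) -> float:
    """Rough upper bound, in MB, on records pinned across all partition
    streams before ObjectOutputStream.reset() releases them."""
    return reset_interval * record_mb * partitions

# Old default (10,000 objects between resets), 1 MB records, 64 partitions:
print(worst_case_cached_mb(10_000, 1, 64) / 1000)  # 640.0 GB

# Proposed default of 100 shrinks the bound by two orders of magnitude:
print(worst_case_cached_mb(100, 1, 64) / 1000)     # 6.4 GB
```

Until the default changes, the value can also be overridden per job, e.g. via `spark-submit --conf spark.serializer.objectStreamReset=100`.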