[ https://issues.apache.org/jira/browse/SPARK-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matei Zaharia resolved SPARK-2538.
----------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 1.0.1)
                       (was: 1.0.0)
                   1.1.0

> External aggregation in Python
> ------------------------------
>
>                 Key: SPARK-2538
>                 URL: https://issues.apache.org/jira/browse/SPARK-2538
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 1.0.0, 1.0.1
>            Reporter: Davies Liu
>            Assignee: Davies Liu
>            Priority: Critical
>              Labels: pyspark
>             Fix For: 1.1.0
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> For huge reduce tasks, users will get an out-of-memory exception when the
> data cannot all fit in memory.
> PySpark should spill some of the data to disk and then merge it back together,
> just as we do in Scala.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
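
For readers unfamiliar with the idea, the approach the issue describes (spill part of the data to disk, then merge it back) can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the code that went into PySpark: the function name `external_aggregate` and the `max_keys` and `partitions` parameters are hypothetical, and a key count stands in for real memory accounting.

```python
import os
import pickle
import tempfile


def external_aggregate(pairs, combine, max_keys=1000, partitions=4):
    """Combine values by key, spilling partial results to disk.

    Hypothetical sketch: max_keys and partitions are illustrative knobs,
    not Spark settings. Yields (key, combined_value) pairs.
    """
    current = {}   # in-memory partial aggregates
    spills = []    # one list of per-partition file paths for each spill

    with tempfile.TemporaryDirectory() as tmp:

        def spill():
            # Split the in-memory map by key hash, so that each partition
            # can later be merged on its own with bounded memory.
            buckets = [{} for _ in range(partitions)]
            for k, v in current.items():
                buckets[hash(k) % partitions][k] = v
            paths = []
            for i, bucket in enumerate(buckets):
                path = os.path.join(tmp, "spill-%d-%d" % (len(spills), i))
                with open(path, "wb") as f:
                    pickle.dump(bucket, f)
                paths.append(path)
            spills.append(paths)
            current.clear()

        for k, v in pairs:
            current[k] = combine(current[k], v) if k in current else v
            if len(current) >= max_keys:
                spill()

        if not spills:
            # Everything fit in memory; no merge step is needed.
            yield from current.items()
            return

        spill()  # flush the remainder so that all partial results are on disk

        # Merge one partition at a time: a given key always hashes to the
        # same partition, so each key is emitted exactly once.
        for i in range(partitions):
            merged = {}
            for paths in spills:
                with open(paths[i], "rb") as f:
                    for k, v in pickle.load(f).items():
                        merged[k] = combine(merged[k], v) if k in merged else v
            yield from merged.items()
```

For example, `sorted(external_aggregate([("a", 1), ("b", 1), ("a", 1)], lambda x, y: x + y, max_keys=1))` returns `[("a", 2), ("b", 1)]` even though every record is spilled before the merge. Partitioning the spill files by key hash is one possible design choice; merging sorted runs would be another.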