[ https://issues.apache.org/jira/browse/SPARK-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matei Zaharia updated SPARK-2538:
---------------------------------
    Priority: Critical  (was: Major)

> External aggregation in Python
> ------------------------------
>
>                 Key: SPARK-2538
>                 URL: https://issues.apache.org/jira/browse/SPARK-2538
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 1.0.0, 1.0.1
>            Reporter: Davies Liu
>            Assignee: Davies Liu
>            Priority: Critical
>              Labels: pyspark
>             Fix For: 1.0.0, 1.0.1
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> For huge reduce tasks, users will get an out-of-memory exception when the
> data cannot all fit in memory.
> The aggregation should spill some of the data to disk and then merge it
> back, just as we do in Scala.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
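The spill-and-merge idea the issue describes can be sketched in plain Python. This is an illustrative sketch only, not Spark's actual implementation: the function name `external_aggregate`, the entry-count `memory_limit` trigger (a real implementation would track memory usage), and the pickle-based spill files are all assumptions for the example.

```python
import os
import pickle
import tempfile

def external_aggregate(pairs, merge, memory_limit=1000):
    """Aggregate (key, value) pairs with merge(), spilling partial
    results to disk whenever the in-memory map grows too large.
    Hypothetical sketch of external aggregation, not Spark's code."""
    spills = []   # paths of spilled partial-aggregate files
    current = {}  # in-memory partial aggregates
    for k, v in pairs:
        current[k] = merge(current[k], v) if k in current else v
        if len(current) >= memory_limit:
            # Spill the current partial aggregates to a temp file
            # and start over with an empty map.
            fd, path = tempfile.mkstemp()
            with os.fdopen(fd, "wb") as f:
                pickle.dump(current, f)
            spills.append(path)
            current = {}
    # Merge every spilled map back into the in-memory remainder.
    result = current
    for path in spills:
        with open(path, "rb") as f:
            spilled = pickle.load(f)
        os.remove(path)
        for k, v in spilled.items():
            result[k] = merge(result[k], v) if k in result else v
    return result

# Example: word-count-style aggregation with a tiny limit to force spills.
counts = external_aggregate(
    ((i % 7, 1) for i in range(10000)),
    lambda a, b: a + b,
    memory_limit=3,
)
```

The merge step here reloads each spilled map whole; a closer analogue of the Scala path would partition spill files by key hash and merge one partition at a time, so no single merge exceeds the memory budget.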