[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-10-25 Thread Jerry Lam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973531#comment-14973531 ] Jerry Lam commented on SPARK-8597: FYI ... The solution described here solves the problem of memory

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-30 Thread Vlad Ionescu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608791#comment-14608791 ] Vlad Ionescu commented on SPARK-8597: I did some stress tests, the main purpose was

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608806#comment-14608806 ] Reynold Xin commented on SPARK-8597: We are implementing a Tungsten version of

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-29 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606561#comment-14606561 ] Matt Cheah commented on SPARK-8597: I'm also concerned about the possibility that using

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-29 Thread Vlad Ionescu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606554#comment-14606554 ] Vlad Ionescu commented on SPARK-8597: Actually I've used an

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-29 Thread Vlad Ionescu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606199#comment-14606199 ] Vlad Ionescu commented on SPARK-8597: Hi, Regarding the third suggestion, the

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606248#comment-14606248 ] Michael Armbrust commented on SPARK-8597: We could use Spark's external sort

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-26 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603816#comment-14603816 ] Matt Cheah commented on SPARK-8597: Cool, a coworker and I think we have something

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-25 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601687#comment-14601687 ] Matt Cheah commented on SPARK-8597: I did some more digging. The memory space is taken

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602009#comment-14602009 ] Michael Armbrust commented on SPARK-8597: Parquet allocates fairly large buffers

[jira] [Commented] (SPARK-8597) DataFrame partitionBy memory pressure scales extremely poorly

2015-06-24 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14599977#comment-14599977 ] Matt Cheah commented on SPARK-8597: I've attached the CSV file used in the test.