[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14948943#comment-14948943 ] Glenn Strycker commented on SPARK-11004: True, fixing the 2GB will go a long way. However, this

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14948920#comment-14948920 ] Sean Owen commented on SPARK-11004: Spark has had a sort-based shuffle for a while, which is a lot of

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949060#comment-14949060 ] Sean Owen commented on SPARK-11004: Literally run a Mapper and Reducer on Spark? I think it would be

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949076#comment-14949076 ] Glenn Strycker commented on SPARK-11004: Currently we could do the following from within a Linux

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949171#comment-14949171 ] Glenn Strycker commented on SPARK-11004: So maybe we can simplify this idea down to forcing

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949176#comment-14949176 ] Sean Owen commented on SPARK-11004: I suppose I'd be surprised if using disk over memory helped, but

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Glenn Strycker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949187#comment-14949187 ] Glenn Strycker commented on SPARK-11004: Awesome -- thanks, I'll try that out. Is there a way to

[jira] [Commented] (SPARK-11004) MapReduce Hive-like join operations for RDDs

2015-10-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949235#comment-14949235 ] Sean Owen commented on SPARK-11004: Per job, no, I don't think so. It's a setting on the Spark