[jira] [Commented] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2015-09-30 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938243#comment-14938243
 ] 

Reynold Xin commented on SPARK-3862:


David,

Thanks. Let's chat there. Since I created the ticket, I have new thoughts on 
how we can make something better with codegen, rather than writing specialized 
operators.


> MultiWayBroadcastInnerHashJoin
> --
>
> Key: SPARK-3862
> URL: https://issues.apache.org/jira/browse/SPARK-3862
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Reynold Xin
>
> It is common to have a single fact table inner join many small dimension 
> tables.  We can exploit this fact and create a MultiWayBroadcastInnerHashJoin 
> (or maybe just MultiwayDimensionJoin) operator that optimizes for this 
> pattern.
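
To make the pattern in the description concrete, here is a minimal sketch in plain Python (not Spark code, and not the operator from either pull request). It assumes every dimension table is small enough to hold in memory as a hash table, and it probes all of them in a single pass over the fact table. The names `build_hash_table` and `multiway_inner_join` and the sample tables are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]


def build_hash_table(dim_rows: Iterable[Row], key: str) -> Dict[Any, List[Row]]:
    """Index a small dimension table by its join key (None keys never match)."""
    table: Dict[Any, List[Row]] = {}
    for row in dim_rows:
        if row[key] is not None:
            table.setdefault(row[key], []).append(row)
    return table


def multiway_inner_join(
    fact_rows: Iterable[Row],
    dims: List[Tuple[str, str, Iterable[Row]]],
) -> Iterator[Row]:
    """Inner join one fact table with several dimension tables in one pass.

    Each entry of `dims` is (fact_key, dim_key, dim_rows). On a column-name
    clash, the value from the later table overwrites the earlier one.
    """
    # Build phase: one hash table per dimension table, built exactly once.
    tables = [
        (fact_key, build_hash_table(rows, dim_key))
        for fact_key, dim_key, rows in dims
    ]
    # Probe phase: stream the fact table and probe every hash table per row.
    for fact in fact_rows:
        partial = [dict(fact)]
        for fact_key, table in tables:
            matches = table.get(fact[fact_key])
            if not matches:
                partial = []  # inner join: a miss in any dimension drops the row
                break
            partial = [{**left, **right} for left in partial for right in matches]
        yield from partial


# Hypothetical sample data: one fact table and two small dimension tables.
facts = [
    {"sale_id": 1, "product_id": 10, "store_id": 100},
    {"sale_id": 2, "product_id": 11, "store_id": 100},
    {"sale_id": 3, "product_id": 10, "store_id": 999},  # no matching store
]
products = [{"pid": 10, "product": "widget"}, {"pid": 11, "product": "gadget"}]
stores = [{"sid": 100, "city": "Berkeley"}]

joined = list(
    multiway_inner_join(
        facts,
        [("product_id", "pid", products), ("store_id", "sid", stores)],
    )
)
print([row["sale_id"] for row in joined])  # [1, 2]
```

The point of the sketch is that the fact table is scanned once, however many dimension tables take part, instead of once per two-way join.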



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2015-09-29 Thread David Sabater (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935054#comment-14935054
 ] 

David Sabater commented on SPARK-3862:
--

Thanks, Reynold. I am really interested in this feature and happy to contribute 
in whatever form is useful. I see there is already a related JIRA task open:
https://issues.apache.org/jira/browse/SPARK-3863

Please let me know how I can contribute. I will be attending Spark Summit EU, 
so we may be able to meet there to talk about potential use cases and ways to 
collaborate.


Regards. 







[jira] [Commented] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2015-07-29 Thread Reynold Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646376#comment-14646376
 ] 

Reynold Xin commented on SPARK-3862:


I don't think that's extreme at all -- very plausible candidate for 1.6!








[jira] [Commented] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2015-07-29 Thread David Sabater (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645915#comment-14645915
 ] 

David Sabater commented on SPARK-3862:
--

This may sound too extreme, but it would be great to have an option in SparkSQL 
to broadcast these dimension tables before the queries are even run, which I 
think would speed up the actual query execution massively (other SQL MPP 
engines already do this).
It would be a call similar to CACHE, but replicating all partitions across all 
nodes.
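
As a rough illustration of the idea, the following plain-Python sketch pays the hash-table build cost once, up front, and then reuses the table for every later query. It is not a Spark API: the class `BroadcastRegistry`, its methods, and the sample tables are all hypothetical, and the sketch runs in a single process, so it does not model the replication across nodes.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


class BroadcastRegistry:
    """Hypothetical stand-in for a 'broadcast this table ahead of time' call."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[Any, List[Row]]] = {}
        self.builds = 0  # number of hash tables built so far

    def broadcast(self, name: str, rows: Iterable[Row], key: str) -> None:
        """Build and keep a hash table for a dimension table, before any query."""
        table: Dict[Any, List[Row]] = {}
        for row in rows:
            table.setdefault(row[key], []).append(row)
        self._tables[name] = table
        self.builds += 1

    def join(self, fact_rows: Iterable[Row], fact_key: str, name: str) -> Iterator[Row]:
        """Inner join fact rows against an already-built table (no rebuild)."""
        table = self._tables[name]
        for fact in fact_rows:
            for match in table.get(fact[fact_key], []):
                yield {**fact, **match}


registry = BroadcastRegistry()
# Done once, ahead of time (hypothetical sample dimension table).
registry.broadcast("stores", [{"sid": 100, "city": "Berkeley"}], key="sid")

# Two separate "queries" reuse the same pre-built table.
q1 = list(registry.join([{"sale_id": 1, "store_id": 100}], "store_id", "stores"))
q2 = list(registry.join([{"sale_id": 2, "store_id": 100}], "store_id", "stores"))
print(registry.builds)  # 1
```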







[jira] [Commented] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2015-04-02 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392214#comment-14392214
 ] 

Apache Spark commented on SPARK-3862:
-

User 'chenghao-intel' has created a pull request for this issue:
https://github.com/apache/spark/pull/5326







[jira] [Commented] (SPARK-3862) MultiWayBroadcastInnerHashJoin

2014-10-28 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187759#comment-14187759
 ] 

Apache Spark commented on SPARK-3862:
-

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/2985



