[ https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486080#comment-15486080 ]
Gunther Hagleitner commented on HIVE-14731:
-------------------------------------------

[~pxiong] cross product is different on the data movement layer. In a shuffle join we partition both sides into n partitions and create the pairs (p1,p'1), ..., (pn,p'n). In both the partitioned and unpartitioned cases we chop the reduce output from both tables into n parts and then form the pairs (p1,p'1) ... (p1,p'n) ... (pn,p'1) ... (pn,p'n): all combinations, not just those where the index matches. The difference between the two is that in the partitioned case we will partition by join key and have the ability to filter out pairs before routing the data. I think the cases are different enough to deserve a new edge type.

> Use Tez cartesian product edge in Hive (unpartitioned case only)
> ----------------------------------------------------------------
>
>                 Key: HIVE-14731
>                 URL: https://issues.apache.org/jira/browse/HIVE-14731
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: HIVE-14731.1.patch, HIVE-14731.2.patch
>
>
> Given cartesian product edge is available in Tez now (see TEZ-3230), let's
> integrate it into Hive on Tez. This allows us to have more than one reducer
> in cross product queries.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
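
To make the pairing described in the comment concrete, here is a minimal Python sketch of which partition pairs each scheme produces. It is only an illustration of the index arithmetic, not Hive or Tez code: the function names and the `may_match` predicate are hypothetical, and the predicate merely stands in for whatever join-key-based filtering the partitioned case would apply before routing.

```python
from itertools import product


def shuffle_join_pairs(n):
    """Shuffle join: partition i of one side meets only partition i of the other."""
    return [(i, i) for i in range(1, n + 1)]


def cross_product_pairs(n, may_match=None):
    """Cross product: every partition of one side meets every partition of the other.

    may_match is a hypothetical predicate (an assumption of this sketch) that
    models the partitioned case, where pairs that cannot produce output are
    dropped before any data is routed. With may_match=None this is the
    unpartitioned case and all n * n pairs are kept.
    """
    pairs = product(range(1, n + 1), repeat=2)
    if may_match is None:
        return list(pairs)
    return [(i, j) for i, j in pairs if may_match(i, j)]


if __name__ == "__main__":
    n = 3
    print("shuffle join:", shuffle_join_pairs(n))
    print("unpartitioned cross product:", cross_product_pairs(n))
    # Illustrative filter only: keep pairs whose left index does not exceed the right.
    print("partitioned cross product:", cross_product_pairs(n, lambda i, j: i <= j))
```

With n partitions per side, the shuffle join yields n pairs while the unfiltered cross product yields n * n, which is the data-movement difference the comment gives as the reason for a separate edge type.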