[ 
https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220885#comment-16220885
 ] 

Zhiyuan Yang commented on HIVE-14731:
-------------------------------------

[~kgyrtkirk] The main reason is mapjoin decide parallelism according to 
#splits, which is not good enough for cross product. The cost of xprod is 
mostly determined by #records instead of #split.

> Use Tez cartesian product edge in Hive (unpartitioned case only)
> ----------------------------------------------------------------
>
>                 Key: HIVE-14731
>                 URL: https://issues.apache.org/jira/browse/HIVE-14731
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, 
> HIVE-14731.11.patch, HIVE-14731.12.patch, HIVE-14731.13.patch, 
> HIVE-14731.14.patch, HIVE-14731.15.patch, HIVE-14731.16.patch, 
> HIVE-14731.17.patch, HIVE-14731.18.patch, HIVE-14731.19.patch, 
> HIVE-14731.2.patch, HIVE-14731.20.patch, HIVE-14731.21.patch, 
> HIVE-14731.22.patch, HIVE-14731.23.patch, HIVE-14731.3.patch, 
> HIVE-14731.4.patch, HIVE-14731.5.patch, HIVE-14731.6.patch, 
> HIVE-14731.7.patch, HIVE-14731.8.patch, HIVE-14731.9.patch, 
> HIVE-14731.addendum.patch
>
>
> Given cartesian product edge is available in Tez now (see TEZ-3230), let's 
> integrate it into Hive on Tez. This allows us to have more than one reducer 
> in cross product queries.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to