[ 
https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041053#comment-14041053
 ] 

Laljo John Pullokkaran commented on HIVE-5775:
----------------------------------------------

The following may help reduce the confusion:

1. In the design doc, the cost formula is for choosing the Join Algorithm. The 
cost formula as described in the doc assumes Tez execution.

2. However, the current work on CBO doesn’t include Join Algorithm selection. 
Instead, it reorders Joins based on Join cardinality & NDV (number of distinct 
values). In other words, Join reordering does not depend on the Physical 
Execution Layer (Tez or MR).

3. When we decide to do Join Algorithm Selection, we can fit in a cost formula 
for both a) MR and b) Tez. This way, based on the physical execution layer, we 
can select the best Join Algorithm/Order.

4. The cost formula for Join Algorithm selection is not that different between 
MR & Tez (except for intermediate HDFS writes). So we can assume that CBO can 
support both execution layers rather easily.

5. The CBO framework allows you to plug and play any cost model. There is no 
hard coupling.
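
To make the split in points 2-5 concrete, here is a rough sketch, not the 
actual Hive code. All class and method names and the per-row weights are made 
up for illustration. The cardinality estimate is the textbook equi-join 
formula, which may differ from what the patch uses. The only difference 
modelled between the MR and Tez cost models is the intermediate HDFS write.

```java
/** Illustrative sketch only; none of these names come from Hive. */
class CboCostSketch {

    /** Engine-independent estimate of the kind join reordering relies on (point 2). */
    static double estimateJoinRows(double leftRows, double rightRows,
                                   double leftKeyNdv, double rightKeyNdv) {
        // Textbook equi-join estimate: |L| * |R| / max(NDV(L.key), NDV(R.key)).
        return leftRows * rightRows / Math.max(leftKeyNdv, rightKeyNdv);
    }

    /** Hypothetical plug-in point for a cost model (point 5). */
    interface JoinCostModel {
        double joinCost(double leftRows, double rightRows, double outputRows);
    }

    // Made-up per-row weights, chosen only to make the example concrete.
    static final double PROCESS_PER_ROW = 1.0;
    static final double HDFS_WRITE_PER_ROW = 4.0;

    /** Assumed Tez model: no intermediate HDFS write is charged. */
    static final class TezCostModel implements JoinCostModel {
        @Override
        public double joinCost(double leftRows, double rightRows, double outputRows) {
            return PROCESS_PER_ROW * (leftRows + rightRows);
        }
    }

    /** Assumed MR model: same formula plus the cost of writing the join output to HDFS. */
    static final class MrCostModel implements JoinCostModel {
        @Override
        public double joinCost(double leftRows, double rightRows, double outputRows) {
            return PROCESS_PER_ROW * (leftRows + rightRows)
                    + HDFS_WRITE_PER_ROW * outputRows;
        }
    }

    /** Picks a cost model by execution layer; anything other than "mr" falls back to Tez. */
    static JoinCostModel forEngine(String engine) {
        return "mr".equalsIgnoreCase(engine) ? new MrCostModel() : new TezCostModel();
    }

    public static void main(String[] args) {
        double out = estimateJoinRows(1000, 100, 100, 50);
        System.out.println("estimated join rows: " + out);
        System.out.println("tez cost: " + forEngine("tez").joinCost(1000, 100, out));
        System.out.println("mr cost:  " + forEngine("mr").joinCost(1000, 100, out));
    }
}
```

In this sketch the reordering input (estimateJoinRows) never looks at the 
execution layer, while the algorithm cost is obtained through the 
JoinCostModel interface, so swapping MR for Tez only changes which 
implementation is plugged in.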


> Introduce Cost Based Optimizer to Hive
> --------------------------------------
>
>                 Key: HIVE-5775
>                 URL: https://issues.apache.org/jira/browse/HIVE-5775
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Laljo John Pullokkaran
>            Assignee: Laljo John Pullokkaran
>         Attachments: CBO-2.pdf, HIVE-5775.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
