[ 
https://issues.apache.org/jira/browse/HIVE-13254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205238#comment-15205238
 ] 

Prasanth Jayachandran commented on HIVE-13254:
----------------------------------------------

[~jcamachorodriguez] To give more context on this issue: most TPCDS queries 
join a fact table with multiple dimension tables, followed by some 
aggregation. The query in the description is one such case. The plans 
generated for the query, with and without hive.transpose.aggr.join enabled, 
are attached. The execution time for all such queries is significantly slower 
(on the order of 4x-6x) with the option enabled. For this specific query on 
1TB, execution took 17.9s without the optimization and 98.24s with 
hive.transpose.aggr.join enabled.

My initial understanding of this optimization was that only the map-side GBY 
would be pushed through the join, which would reduce the number of rows 
broadcast to the fact table. But looking at the plans, the GBY is pushed at 
the logical level, which then gets compiled to a map-side GBY and a 
reduce-side GBY followed by the JOIN. This shuffles approximately 600M rows 
before joining, which I think is adding to the overall execution time. I 
filed this issue thinking that the map-side GBY would broadcast its output to 
the fact table for the join. Per [~ashutoshc]'s comment below, it seems we 
don't support that. 

> GBY cardinality estimation is wrong when a partition column is involved
> -----------------------------------------------------------------------
>
>                 Key: HIVE-13254
>                 URL: https://issues.apache.org/jira/browse/HIVE-13254
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: q3.svg, q3_ef_transpose_aggr.svg
>
>
> When running the following query on TPCDS-1000 scale, setting 
> hive.transpose.aggr.join=true is expected to generate an optimal plan, but 
> it does not. 
> {code:title=Query}
> SELECT `date_dim`.`d_day_name` AS `d_day_name`, 
>        `item`.`i_category`     AS `i_category` 
> FROM   `store_sales` `store_sales` 
>        INNER JOIN `item` `item` 
>                ON ( `store_sales`.`ss_item_sk` = `item`.`i_item_sk` ) 
>        INNER JOIN `date_dim` `date_dim` 
>                ON ( `store_sales`.`ss_sold_date_sk` = `date_dim`.`d_date_sk` 
> ) 
> GROUP  BY `d_day_name`, 
>           `i_category`;
> {code}
> The reason is that the stats annotation rule for GROUP BY does not take the 
> partition column into account. The plan generated for the above query is 
> attached. As we can see from the plan, the GBY is pushed to the fact table 
> (store_sales), but the output of the GBY is shuffled to perform the join 
> instead of being converted to a MapJoin. This is because of a wrong 
> estimate of the cardinality/data size of the GBY on store_sales (Map 1). 
> What happens internally is that the GBY computes an estimated cardinality, 
> which in this case is NDV(ss_item_sk) * NDV(ss_sold_date_sk) = 338901 * 
> 1823 ~= 617M. This estimate is wrong because ss_sold_date_sk is a partition 
> column and the estimator assumes it is a non-partition column. Not every 
> task reads data from all partitions, so we need to take the estimated task 
> parallelism into account. For example, if the task parallelism is 
> determined to be 100, the estimate from the GBY should be ~6M, which would 
> convert this vertex into a map-join vertex. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
