[ 
https://issues.apache.org/jira/browse/HIVE-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-20366:
-------------------------------
    Status: Patch Available  (was: Open)

> TPC-DS query78 stats estimates are off for is null filter
> ---------------------------------------------------------
>
>                 Key: HIVE-20366
>                 URL: https://issues.apache.org/jira/browse/HIVE-20366
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
>         Attachments: HIVE-20366.1.patch, HIVE-20366.2.patch, 
> HIVE-20366.3.patch, HIVE-20366.4.patch
>
>
> In Query 78, there are left outer joins between fact table pairs: 
> store_sales LOJ store_returns, catalog_sales LOJ catalog_returns, and 
> web_sales LOJ web_returns. The is null filter applied on top of each of 
> these joins is estimated at only a single row, so the result is BROADCAST, 
> which causes hash table memory errors.
> {code}
> |         Reducer 12                                 |
> |             Execution mode: vectorized, llap       |
> |             Reduce Operator Tree:                  |
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> |               Map Join Operator                    |
> |                 condition map:                     |
> |                      Left Outer Join 0 to 1        |
> |                 keys:                              |
> |                   0 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
> |                   1 KEY.reducesinkkey0 (type: bigint), KEY.reducesinkkey1 (type: bigint) |
> |                 outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col6, _col8 |
> |                 input vertices:                    |
> |                   1 Map 14                         |
> |                 Statistics: Num rows: 10282477384 Data size: 534184867432 Basic stats: COMPLETE Column stats: COMPLETE |
> |                 Filter Operator                    |
> |                   predicate: _col8 is null (type: boolean) |
> |                  * Statistics: Num rows: 1* Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE |
> {code}
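
One plausible reading of the plan above, sketched under stated assumptions: if the estimator derives the selectivity of an is null filter from the column's null count, and that count is carried over unchanged from the base table, then a column with no NULLs in the base table yields an estimate of zero rows, floored to 1. A left outer join, however, pads every unmatched left row with NULLs on the right side, so the null count after the join can be far larger. The function names, the floor of 1, and the 90% unmatched ratio below are all hypothetical and are not taken from Hive's code or from the attached patches; only the join row count comes from the plan.

```python
# Hypothetical sketch, NOT Hive's actual estimator code.

def estimate_is_null_rows(num_rows: int, num_nulls: int) -> int:
    """Estimate rows passing `col IS NULL` from the column's null count.

    Assumption: the estimate is the null count, capped at num_rows and
    floored at 1 for a non-empty input.
    """
    if num_rows <= 0:
        return 0
    return max(1, min(num_rows, num_nulls))


def nulls_after_left_outer_join(right_col_nulls: int, unmatched_left_rows: int) -> int:
    """Null count of a right-side column in a left outer join's output.

    Simplification: every unmatched left row contributes one NULL, and the
    right side's existing NULLs are carried through unchanged.
    """
    return right_col_nulls + unmatched_left_rows


# "Num rows" of the Map Join Operator in the plan above.
join_rows = 10_282_477_384

# Null count taken from the base table only (assumed to be 0 here).
naive = estimate_is_null_rows(join_rows, num_nulls=0)

# Assumed for illustration: 90% of the left rows find no match.
unmatched = join_rows * 9 // 10
adjusted = estimate_is_null_rows(
    join_rows, nulls_after_left_outer_join(0, unmatched)
)

print("naive estimate:   ", naive)
print("adjusted estimate:", adjusted)
```

With the base-table null count alone the sketch reproduces the single-row estimate shown on the Filter Operator; once the NULLs introduced by the outer join are counted, the estimate is in the billions and the output would no longer look small enough to broadcast.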



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
