[ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Laljo John Pullokkaran updated HIVE-11110:
------------------------------------------
    Summary: Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation  (was: Enable HiveJoinAddNotNullRule in CBO)

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11110
>                 URL: https://issues.apache.org/jira/browse/HIVE-11110
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Laljo John Pullokkaran
>         Attachments: HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select count(*)
>  from store_sales
>      ,store_returns
>      ,date_dim d1
>      ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>    and d1.d_date_sk = ss_sold_date_sk
>    and ss_customer_sk = sr_customer_sk
>    and ss_item_sk = sr_item_sk
>    and ss_ticket_number = sr_ticket_number
>    and sr_returned_date_sk = d2.d_date_sk
>    and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used in a join clause. The join clause should add a filter "filterExpr: ss_sold_date_sk is not null", which should get pushed to the MetaStore when fetching the stats. Currently this is not done in CBO planning, so the stats from __HIVE_DEFAULT_PARTITION__ are fetched and considered in the optimization phase. In particular, this inflates the NDV for the join columns and may lead to a wrong plan.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
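
The reason a "not null" filter can safely be derived from the join clause is that an inner equi-join never matches a NULL key, so adding the filter cannot change the result. The sketch below illustrates this with Python's built-in sqlite3 module standing in for Hive. It is only an illustration under stated assumptions: the two tiny tables, their rows, and the helper names build_demo and count_rows are made up for this example, and nothing here exercises Hive's CBO or HiveJoinAddNotNullRule.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory stand-in for store_sales and date_dim (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table store_sales (ss_sold_date_sk integer, ss_item_sk integer)")
    conn.execute("create table date_dim (d_date_sk integer, d_quarter_name text)")
    # Two rows with a NULL date key: in Hive these would sit in
    # __HIVE_DEFAULT_PARTITION__ when the table is partitioned on that column.
    conn.executemany(
        "insert into store_sales values (?, ?)",
        [(1, 10), (2, 11), (None, 12), (None, 13)],
    )
    conn.executemany(
        "insert into date_dim values (?, ?)",
        [(1, "2000Q1"), (2, "2000Q2"), (None, "unknown")],
    )
    return conn


def count_rows(conn, extra_predicate=""):
    """Count inner-join rows, optionally with one more predicate appended."""
    sql = (
        "select count(*) from store_sales, date_dim d1 "
        "where d1.d_date_sk = ss_sold_date_sk" + extra_predicate
    )
    return conn.execute(sql).fetchone()[0]


conn = build_demo()
without_filter = count_rows(conn)
with_filter = count_rows(conn, " and ss_sold_date_sk is not null")
null_key_rows = conn.execute(
    "select count(*) from store_sales where ss_sold_date_sk is null"
).fetchone()[0]

# NULL = NULL is not true in SQL, so the NULL-keyed rows never join.
print(without_filter, with_filter, null_key_rows)
```

Both counts agree even though some store_sales rows carry a NULL key, which is why the planner can add the filter before fetching partition stats and leave the default partition out of its estimates.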