[ 
https://issues.apache.org/jira/browse/HIVE-28910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28910:
----------------------------------
    Labels: pull-request-available  (was: )

> Remove redundant IS NOT NULL predicates when expanding SEARCH
> -------------------------------------------------------------
>
>                 Key: HIVE-28910
>                 URL: https://issues.apache.org/jira/browse/HIVE-28910
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>
> During various CBO transformations (especially during simplifications) the 
> internal SEARCH (CALCITE-4173) operator is introduced in the plan. The SEARCH 
> operator cannot be executed directly and must be expanded (using 
> SearchTransformer) to an equivalent form for further processing.
> Currently the expansion is conservative and does not take into account the 
> context for the evaluation of UNKNOWN/NULL values which in some cases can 
> lead to redundant IS [NOT] NULL predicates in the expanded form.
> +Example+
> {code:sql}
> explain cbo
> select d_date_sk
> from date_dim
> where d_dom IN (1, 2, 3)
>   and d_dom IS NOT NULL
> {code}
> The intermediate plan before SEARCH expansion is shown below:
> {noformat}
> HiveProject(d_date_sk=[$0])
>   HiveFilter(condition=[SEARCH($9, Sarg[1, 2, 3; NULL AS FALSE])])
>     HiveTableScan(table=[[default, date_dim]], table:alias=[date_dim])
> {noformat}
> The final plan  after SEARCH expansion is show below:
> {noformat}
> HiveProject(d_date_sk=[$0])
>   HiveFilter(condition=[AND(IN($9, 1, 2, 3), IS NOT NULL($9))])
>     HiveTableScan(table=[[default, date_dim]], table:alias=[date_dim])
> {noformat}
> Observe that the IS NOT NULL predicate is redundant both in the SQL query and 
> in the plan however the optimizer is not able to remove it. 
> The IS NOT NULL predicate is introduced cause the Sarg indicates that we 
> should handle NULL AS FALSE. However, inside a Filter operator (WHERE clause) 
> nulls are always treated as false so we can skip the addition of the 
> predicate.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to