Stamatis Zampetakis created HIVE-28911:
------------------------------------------

             Summary: Improve SEARCH expansion to exploit <> operator 
                 Key: HIVE-28911
                 URL: https://issues.apache.org/jira/browse/HIVE-28911
             Project: Hive
          Issue Type: Improvement
          Components: CBO
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


During various CBO transformations (especially during simplifications) the 
internal SEARCH (CALCITE-4173) operator is introduced in the plan. The SEARCH 
operator cannot be executed directly and must be expanded (using 
SearchTransformer) to an equivalent form for further processing.

The SEARCH operator can be used to represent many types of range predicates 
including the inequality operator (<>).

+Example+
{code:sql}
explain cbo
select d_date_sk
from date_dim
where d_dom <> 10 and d_dom <> 20;
{code}

The intermediate plan before SEARCH expansion is shown below.
{noformat}
HiveProject(d_date_sk=[$0])
  HiveFilter(condition=[SEARCH($9, Sarg[(-∞..10), (10..20), (20..+∞)])])
    HiveTableScan(table=[[default, date_dim]], table:alias=[date_dim])
{noformat}
The two inequalities were converted to a Sarg with three ranges.
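
To see where the three ranges come from, here is a minimal sketch that removes a set of points from the full domain and prints the remaining open ranges. The class and method names are hypothetical and exist only for illustration; this is not how Calcite builds a Sarg internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Hive or Calcite.
final class ExcludedPointsToRanges {
  private static final String INF = "\u221E"; // infinity sign, as printed in the plan

  // Excluding n points from the full domain leaves n + 1 open ranges.
  static List<String> complementRanges(TreeSet<Integer> excluded) {
    List<String> ranges = new ArrayList<>();
    String lower = "-" + INF;
    for (int point : excluded) { // TreeSet iterates in ascending order
      ranges.add("(" + lower + ".." + point + ")");
      lower = Integer.toString(point);
    }
    ranges.add("(" + lower + "..+" + INF + ")");
    return ranges;
  }

  public static void main(String[] args) {
    // d_dom <> 10 and d_dom <> 20
    System.out.println(complementRanges(new TreeSet<>(Arrays.asList(10, 20))));
  }
}
```

For the points 10 and 20 this yields the same three ranges that appear in the Sarg above.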

The final plan after SEARCH expansion is shown below.
{noformat}
HiveProject(d_date_sk=[$0])
  HiveFilter(condition=[OR(<($9, 10), >($9, 20), AND(>($9, 10), <($9, 20)))])
    HiveTableScan(table=[[default, date_dim]], table:alias=[date_dim])
{noformat}

The conversion to ranges/Sarg is useful because it allows the optimizer to 
perform much more powerful simplifications, especially for complex predicates. 
However, the expanded expression for this simple range is sub-optimal: it uses 
four comparisons combined with OR/AND where two inequalities would suffice.

Ideally, the final filter condition after expansion should be the following:
{noformat}
AND(<>($9, 10), <>($9, 20))
{noformat}
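
As a sanity check that the two forms are interchangeable, the sketch below evaluates both conditions over a range of non-null integer values. It is a brute-force illustration with hypothetical names, not a proof and not Hive code. For NULL input both forms evaluate to NULL under SQL three-valued logic, which this check does not model.

```java
// Hypothetical check, not part of Hive or Calcite.
final class ExpansionEquivalenceCheck {

  // Current expansion: OR(<($9, 10), >($9, 20), AND(>($9, 10), <($9, 20)))
  static boolean currentForm(int x) {
    return x < 10 || x > 20 || (x > 10 && x < 20);
  }

  // Desired expansion: AND(<>($9, 10), <>($9, 20))
  static boolean desiredForm(int x) {
    return x != 10 && x != 20;
  }

  // True if both forms give the same answer for every value in [from, to].
  static boolean agreeOn(int from, int to) {
    for (int x = from; x <= to; x++) {
      if (currentForm(x) != desiredForm(x)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(agreeOn(-100, 100));
  }
}
```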

The goal of this ticket is to be able to exploit the inequality operator when 
expanding ranges to generate simpler and slightly more efficient expressions.
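
One possible way to recognize this case is sketched below. It assumes the ranges are sorted, disjoint, and all open, and it ignores NULL handling. OpenRange, excludedPoints, and expand are hypothetical names introduced for illustration; Calcite's Sarg and Hive's SearchTransformer expose different APIs, and the actual change may take a different approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only, not Hive or Calcite code.
final class NotEqualsExpansionSketch {

  /** An open integer range; a null bound means unbounded on that side. */
  static final class OpenRange {
    final Integer lower;
    final Integer upper;

    OpenRange(Integer lower, Integer upper) {
      this.lower = lower;
      this.upper = upper;
    }
  }

  /**
   * Returns the excluded points if the ranges cover everything except a
   * finite set of points, or null if the ranges have any other shape.
   * Assumes the ranges are sorted, disjoint, and open on both ends.
   */
  static List<Integer> excludedPoints(List<OpenRange> ranges) {
    if (ranges.size() < 2
        || ranges.get(0).lower != null
        || ranges.get(ranges.size() - 1).upper != null) {
      return null;
    }
    List<Integer> points = new ArrayList<>();
    for (int i = 0; i + 1 < ranges.size(); i++) {
      Integer upper = ranges.get(i).upper;
      Integer nextLower = ranges.get(i + 1).lower;
      // Adjacent open ranges must meet at exactly one excluded value.
      if (upper == null || !upper.equals(nextLower)) {
        return null;
      }
      points.add(upper);
    }
    return points;
  }

  /** Renders the <> form, or returns null so the caller can fall back to the generic expansion. */
  static String expand(String ref, List<OpenRange> ranges) {
    List<Integer> points = excludedPoints(ranges);
    if (points == null) {
      return null;
    }
    List<String> terms = new ArrayList<>();
    for (Integer p : points) {
      terms.add("<>(" + ref + ", " + p + ")");
    }
    return terms.size() == 1 ? terms.get(0) : "AND(" + String.join(", ", terms) + ")";
  }

  public static void main(String[] args) {
    List<OpenRange> sarg = Arrays.asList(
        new OpenRange(null, 10), new OpenRange(10, 20), new OpenRange(20, null));
    System.out.println(expand("$9", sarg));
  }
}
```

Returning null for any other shape keeps the sketch conservative: only ranges that are exactly the complement of a few points take the new path, and everything else would still go through the existing expansion.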



