[
https://issues.apache.org/jira/browse/HIVE-29657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18088199#comment-18088199
]
Stamatis Zampetakis commented on HIVE-29657:
--------------------------------------------
The idea came up while working on the PR for HIVE-28911:
https://github.com/apache/hive/pull/6503#discussion_r3356514833
> Improve SEARCH expansion by considering negation
> ------------------------------------------------
>
> Key: HIVE-29657
> URL: https://issues.apache.org/jira/browse/HIVE-29657
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Reporter: Stamatis Zampetakis
> Priority: Major
>
> SEARCH is an internal logical operator that is used by CBO to represent many
> types of range predicates. Currently, it doesn't have a physical equivalent so
> we have to expand it to existing primitive operators (via
> [SearchTransformer|https://github.com/apache/hive/blob/45bdea4f8bb62873ab2b2f4a8f7afc4c780d4aa3/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/SearchTransformer.java]).
> A single SEARCH expression can be expanded in different ways with potentially
> different performance implications.
> +Example:+
> {noformat}
> SEARCH($0, Sarg[(-∞..10), (10..20), (20..30], [50..+∞)])
> {noformat}
> +Expansion A+
> {code:sql}
> ... WHERE x < 10 OR (x > 10 AND x < 20) OR (x > 20 AND x <= 30) OR (x >= 50)
> {code}
> However, negating the SEARCH can give us simpler and potentially more
> efficient expressions. The following SEARCH expression is equivalent to the
> one above.
> +Example:+
> {noformat}
> NOT SEARCH($0, Sarg[10, 20, (30..50)])
> {noformat}
> +Expansion B+
> {code:sql}
> ... WHERE NOT ( x = 10 OR x = 20 OR (x > 30 AND x < 50))
> ... WHERE x <> 10 AND x <> 20 AND (x <= 30 OR x >= 50)
> ... WHERE x NOT IN (10, 20) AND (x <= 30 OR x >= 50)
> {code}
> Expansions in category B are simpler (four comparisons instead of the six in
> expansion A) and should be cheaper in terms of memory/CPU consumption.
> The goal of this ticket is to consider both the positive and negative SEARCH
> expression during the expansion (e.g., using
> org.apache.calcite.util.Sarg#negate) and pick the one that is simpler and
> more efficient.
> We have to be careful though not to introduce an overly complex and expensive
> expansion check, because that would defeat the entire purpose of this
> transformation; compilation should remain fast. If that is not possible then
> we should close this ticket as Won't Fix.
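
To make the proposal concrete, here is a minimal, self-contained sketch of the "expand both and pick the cheaper" idea. It is not the code in SearchTransformer and it does not use Calcite's Sarg; the class SearchExpansionSketch, its Interval type, and the cost function (one comparison per point, one per finite bound) are all hypothetical and chosen only for illustration. It assumes the ranges are sorted and disjoint, and it ignores NULL semantics, which a real implementation would have to handle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not Hive's SearchTransformer and not Calcite's Sarg. */
final class SearchExpansionSketch {

  /** One range of a Sarg-like set; a null bound means unbounded on that side. */
  static final class Interval {
    final Integer lo, hi;
    final boolean loClosed, hiClosed;

    Interval(Integer lo, boolean loClosed, Integer hi, boolean hiClosed) {
      this.lo = lo; this.loClosed = loClosed; this.hi = hi; this.hiClosed = hiClosed;
    }

    boolean isPoint() {
      return lo != null && lo.equals(hi) && loClosed && hiClosed;
    }

    boolean isEmpty() {
      if (lo == null || hi == null) return false;
      return lo > hi || (lo.equals(hi) && !(loClosed && hiClosed));
    }

    boolean contains(int x) {
      boolean aboveLo = lo == null || (loClosed ? x >= lo : x > lo);
      boolean belowHi = hi == null || (hiClosed ? x <= hi : x < hi);
      return aboveLo && belowHi;
    }

    /** Assumed cost model: 1 comparison for a point, else 1 per finite bound. */
    int comparisons() {
      if (isPoint()) return 1;
      return (lo != null ? 1 : 0) + (hi != null ? 1 : 0);
    }
  }

  /** Complement of a sorted, disjoint list of ranges, in one linear pass. */
  static List<Interval> negate(List<Interval> sorted) {
    List<Interval> out = new ArrayList<>();
    Integer gapLo = null;       // lower bound of the current gap (null = -infinity)
    boolean gapLoClosed = false;
    for (Interval r : sorted) {
      if (r.lo != null) {
        // The gap ends where r starts; closedness flips at the boundary.
        Interval gap = new Interval(gapLo, gapLoClosed, r.lo, !r.loClosed);
        if (!gap.isEmpty()) out.add(gap);
      }
      if (r.hi == null) return out; // r is unbounded above, nothing is left
      gapLo = r.hi;
      gapLoClosed = !r.hiClosed;
    }
    out.add(new Interval(gapLo, gapLoClosed, null, false));
    return out;
  }

  static int cost(List<Interval> ranges) {
    int total = 0;
    for (Interval r : ranges) total += r.comparisons();
    return total;
  }

  static boolean contains(List<Interval> ranges, int x) {
    for (Interval r : ranges) if (r.contains(x)) return true;
    return false;
  }

  /** True if expanding NOT SEARCH(negated ranges) needs fewer comparisons. */
  static boolean preferNegated(List<Interval> ranges) {
    return cost(negate(ranges)) < cost(ranges);
  }

  /** The Sarg from the ticket: (-inf..10), (10..20), (20..30], [50..+inf). */
  static List<Interval> example() {
    return List.of(
        new Interval(null, false, 10, false),
        new Interval(10, false, 20, false),
        new Interval(20, false, 30, true),
        new Interval(50, true, null, false));
  }

  public static void main(String[] args) {
    List<Interval> positive = example();
    List<Interval> negated = negate(positive);
    System.out.println("positive=" + cost(positive) + " negated=" + cost(negated)
        + " preferNegated=" + preferNegated(positive));
  }
}
```

For the example in the description, the sketch negates the four ranges into 10, 20, and (30..50), and counts six comparisons for the positive form against four for the negated one. Both the negation and the cost check are a single pass over the ranges, which is the kind of cheap check the ticket asks for; whether such a simple cost model is good enough in practice is an open question.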