Alessandro Solimando created HIVE-25852:
-------------------------------------------

             Summary: Introduce IN clauses at the very end of query planning
                 Key: HIVE-25852
                 URL: https://issues.apache.org/jira/browse/HIVE-25852
             Project: Hive
          Issue Type: Bug
          Components: CBO
    Affects Versions: 4.0.0
            Reporter: Alessandro Solimando


Calcite "explodes" IN clauses into the equivalent OR form, and therefore it 
does not handle such clauses in most of the codebase (notably in _RexSimplify_).
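
To make the "explosion" concrete, here is a minimal sketch of the IN-to-OR expansion on plain strings. The class _InToOrSketch_ and the helper _expandIn_ are hypothetical names used only for illustration; this is not Calcite's implementation and it does not use any Calcite API.

```java
import java.util.List;
import java.util.stream.Collectors;

public class InToOrSketch {
    // Hypothetical helper (not a Calcite API): renders "column IN (v1, v2, ...)"
    // as the equivalent disjunction of equality predicates.
    static String expandIn(String column, List<String> literals) {
        if (literals.isEmpty()) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        return literals.stream()
                .map(v -> column + " = " + v)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        // prints "a = 1 OR a = 2 OR a = 3"
        System.out.println(expandIn("a", List.of("1", "2", "3")));
    }
}
```

Once the predicate is in this form, code such as _RexSimplify_ only ever sees ORs of equalities and never has to reason about IN itself.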

In Hive, the same happens, but _HivePointLookupOptimizerRule_ re-introduces IN 
clauses. It does so in the _applyPreJoinOrderingTransforms_ phase, which runs 
fairly early and mixes in several other rules that might not fully support 
IN (notably _HiveReduceExpressionsRule_, which is based on _RexSimplify_).

The problem will become even harder in later versions of Calcite (current is 
1.25), which are based on SARG; SARG does not support IN clauses.

IN clauses can be converted into efficient runtime operators, so we want to 
keep them in the final plan. Intuitively, we just want this translation to 
happen in a later step, so that the rest of the codebase (Hive and Calcite) 
remains unaware of IN clauses.

The goal of the ticket is as follows:
# re-convert the output expression of _HivePointLookupOptimizerRule_ into the 
OR form (keep the logic as-is to benefit from the rule)
# add a rule, in the last step of the planning process, that only converts 
eligible OR expressions into IN clauses
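
The second step could look roughly like the sketch below, which works on a toy model rather than on Calcite's _RexNode_ tree. Everything here is an assumption made for illustration: the class _OrToInSketch_, the _Eq_ type, the _toInIfEligible_ method, and the notion of "eligible" (all disjuncts are equalities on the same column, and there are at least _minDisjuncts_ of them). The actual eligibility criteria would be those already encoded in _HivePointLookupOptimizerRule_.

```java
import java.util.ArrayList;
import java.util.List;

public class OrToInSketch {
    /** Toy disjunct of the form "column = literal" (not a Calcite RexNode). */
    static final class Eq {
        final String column;
        final String literal;

        Eq(String column, String literal) {
            this.column = column;
            this.literal = literal;
        }
    }

    // Hypothetical final-step conversion: rewrite the OR into an IN clause only
    // when every disjunct targets the same column and the OR is large enough;
    // otherwise render the OR form unchanged.
    static String toInIfEligible(List<Eq> disjuncts, int minDisjuncts) {
        if (disjuncts.isEmpty()) {
            throw new IllegalArgumentException("OR must have at least one disjunct");
        }
        String column = disjuncts.get(0).column;
        boolean sameColumn = disjuncts.stream().allMatch(e -> e.column.equals(column));

        if (sameColumn && disjuncts.size() >= minDisjuncts) {
            // Collect literals in order, dropping duplicates.
            List<String> values = new ArrayList<>();
            for (Eq e : disjuncts) {
                if (!values.contains(e.literal)) {
                    values.add(e.literal);
                }
            }
            return column + " IN (" + String.join(", ", values) + ")";
        }

        // Not eligible: keep the OR form as-is.
        List<String> parts = new ArrayList<>();
        for (Eq e : disjuncts) {
            parts.add(e.column + " = " + e.literal);
        }
        return String.join(" OR ", parts);
    }

    public static void main(String[] args) {
        List<Eq> sameCol = List.of(new Eq("a", "1"), new Eq("a", "2"), new Eq("a", "3"));
        List<Eq> mixed = List.of(new Eq("a", "1"), new Eq("b", "2"));
        System.out.println(toInIfEligible(sameCol, 2)); // prints "a IN (1, 2, 3)"
        System.out.println(toInIfEligible(mixed, 2));   // prints "a = 1 OR b = 2"
    }
}
```

Running the conversion only at the end of planning means that every earlier rule sees just the OR form, which is the point of the proposal.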



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
