Jesus Camacho Rodriguez created HIVE-15539:
----------------------------------------------
Summary: Optimize complex multi-insert queries in Calcite
Key: HIVE-15539
URL: https://issues.apache.org/jira/browse/HIVE-15539
Project: Hive
Issue Type: Improvement
Components: Parser
Affects Versions: 2.2.0
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Currently, multi-insert queries are not optimized by Calcite. Proper integration
with Calcite would involve creating a _spool_ operator whose output is reused
by every _insert_ statement; however, the _spool_ operator has not yet been
added to Calcite (CALCITE-481).
In the meantime, since the complex logic of a multi-insert query is in its FROM
clause, we can optimize the FROM clause with Calcite and connect the optimized
result to the original query.
Initially, we will recognize three different cases:
- FROM clause is trivial (e.g., a table reference) or not supported. No need to
optimize with Calcite.
- FROM clause is a subquery. Optimize the subquery with Calcite.
- FROM clause is a join. Rewrite the join as a subquery and optimize it with
Calcite. Change column references in the INSERT statements to refer to the
subquery columns.
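
The three cases above can be sketched as a small dispatch over the shape of the
FROM clause. This is only an illustrative sketch, not Hive code: the classes
TableRef, Subquery and Join, the functions choose_strategy and
rewrite_column_ref, the default alias "subq" and the "<table alias>_<column>"
naming scheme are all hypothetical and stand in for Hive's actual AST handling.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical, simplified FROM-clause shapes (not Hive's real AST classes).
@dataclass
class TableRef:
    name: str


@dataclass
class Subquery:
    sql: str
    alias: str


@dataclass
class Join:
    left_alias: str
    right_alias: str
    condition: str


FromClause = Union[TableRef, Subquery, Join]


def choose_strategy(from_clause: FromClause) -> str:
    """Pick one of the three cases from the issue description."""
    if isinstance(from_clause, Subquery):
        return "optimize subquery"
    if isinstance(from_clause, Join):
        return "rewrite join as subquery, then optimize"
    # Trivial (plain table reference) or unsupported: leave untouched.
    return "skip"


def rewrite_column_ref(ref: str, subquery_alias: str = "subq") -> str:
    """Redirect a join-side reference such as 'o.amount' to the subquery.

    The '<table alias>_<column>' naming is an assumption made here to keep
    columns coming from the two join inputs distinct.
    """
    table_alias, column = ref.split(".", 1)
    return f"{subquery_alias}.{table_alias}_{column}"


if __name__ == "__main__":
    join = Join("o", "c", "o.cust_id = c.id")
    print(choose_strategy(join))
    print(rewrite_column_ref("o.amount"))
```

Under these assumptions, an INSERT branch that filters on o.amount would filter
on subq.o_amount once the join is wrapped in a subquery aliased subq.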
This should also benefit MERGE statement processing, since Hive treats MERGE
statements as multi-insert queries.