Artem Velykorodnyi created HIVE-22031:
-----------------------------------------
Summary: HiveRelDecorrelator fails with IndexOutOfBoundsException
if the query contains several "constant" columns
Key: HIVE-22031
URL: https://issues.apache.org/jira/browse/HIVE-22031
Project: Hive
Issue Type: Bug
Components: CBO
Affects Versions: 2.3.5
Reporter: Artem Velykorodnyi
Assignee: Artem Velykorodnyi
Steps to reproduce:
{code}
-- 1. Create table orders
create table orders (ORD_NUM INT, CUST_CODE STRING);

-- 2. Create table customers
create table customers (CUST_CODE STRING);

-- 3. Make select with constants and with a subquery:
select DISTINCT(CUST_CODE), '777' as ANY, ORD_NUM, '888' as CONSTANT
from orders
WHERE not exists
  (select 1
   from customers
   WHERE CUST_CODE=orders.CUST_CODE
  );
{code}
The query fails with an IndexOutOfBoundsException:
{code}
Exception in thread "main" java.lang.AssertionError: Internal error: While invoking method 'public org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$Frame org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject) throws org.apache.hadoop.hive.ql.parse.SemanticException'
at org.apache.calcite.util.Util.newInternal(Util.java:792)
at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:534)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.getInvoke(HiveRelDecorrelator.java:660)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelate(HiveRelDecorrelator.java:252)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateQuery(HiveRelDecorrelator.java:218)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1347)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1261)
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1069)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1085)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:364)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
... 32 more
Caused by: java.lang.AssertionError: Internal error: While invoking method 'public org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$Frame org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate) throws org.apache.hadoop.hive.ql.parse.SemanticException'
at org.apache.calcite.util.Util.newInternal(Util.java:792)
at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:534)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.getInvoke(HiveRelDecorrelator.java:660)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(HiveRelDecorrelator.java:854)
... 37 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
... 39 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 2
at java.util.ArrayList.rangeCheckForAdd(ArrayList.java:665)
at java.util.ArrayList.add(ArrayList.java:477)
at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(HiveRelDecorrelator.java:833)
... 44 more
{code}
HiveRelDecorrelator looks for omitted constants and puts them into a TreeMap
whose keys are the column positions in the top-level select query.
For the example query, the TreeMap contains:
{code}
0 = {TreeMap$Entry@8389} "1" -> "_UTF-16LE'777'"
1 = {TreeMap$Entry@8390} "3" -> "_UTF-16LE'888'"
{code}
After that, there is a step where the List of fields is combined with the
constants from the TreeMap:
{code}
if (!omittedConstants.isEmpty()) {
  final List<RexNode> postProjects = new ArrayList<>(relBuilder.fields());
  for (Map.Entry<Integer, RexLiteral> entry
      : omittedConstants.descendingMap().entrySet()) {
    postProjects.add(entry.getKey() + frame.corDefOutputs.size(),
        entry.getValue());
  }
  relBuilder.project(postProjects);
}
{code}
However, the TreeMap is iterated in descending order, so the constant columns
with the highest positions come first, and those positions can be greater than
the current size of the target List.
(For the example query, there is an attempt to add an element to the List at
index 3, but the size of the List is only 2.)
If we iterate over the TreeMap without descendingMap(), everything works as
expected. Also, there is no difference between the descending and the ascending
map in terms of the result, because the List is filled by explicit index and
not by sequential position.
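
The failure can be illustrated outside Hive with a minimal sketch. It is not the HiveRelDecorrelator code: the class name OmittedConstantsDemo, the helper insertConstants, the use of plain strings instead of RexNode/RexLiteral, and the offset of 0 standing in for frame.corDefOutputs.size() are all assumptions made for illustration (the offset is inferred from the reported "Index: 3, Size: 2").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone demo; names and types are illustrative only.
class OmittedConstantsDemo {

    // Inserts each constant at (key + offset), in the iteration order of the given map.
    static List<String> insertConstants(List<String> fields,
                                        Map<Integer, String> constants,
                                        int offset) {
        List<String> postProjects = new ArrayList<>(fields);
        for (Map.Entry<Integer, String> entry : constants.entrySet()) {
            // ArrayList.add(index, element) throws if index > size()
            postProjects.add(entry.getKey() + offset, entry.getValue());
        }
        return postProjects;
    }

    public static void main(String[] args) {
        // Two non-constant fields remain after the constants are omitted.
        List<String> fields = Arrays.asList("CUST_CODE", "ORD_NUM");

        // Column position -> omitted constant, as in the example query.
        TreeMap<Integer, String> omitted = new TreeMap<>();
        omitted.put(1, "'777'");
        omitted.put(3, "'888'");

        // Ascending order: index 1 is filled first, so index 3 is valid afterwards.
        System.out.println("ascending: " + insertConstants(fields, omitted, 0));

        // Descending order: index 3 is attempted first while the list has only 2 elements.
        try {
            insertConstants(fields, omitted.descendingMap(), 0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("descending failed: " + e.getMessage());
        }
    }
}
```

In this sketch the ascending iteration restores the original column order of the select list, while the descending iteration hits the same kind of IndexOutOfBoundsException as in the stack trace above.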
"Q file" with the query from the example works fine but fails on the Hive 2.3.5.