Hankó Gergely created HIVE-26846:
------------------------------------
Summary: Exception at non-vectorized map join execution
Key: HIVE-26846
URL: https://issues.apache.org/jira/browse/HIVE-26846
Project: Hive
Issue Type: Bug
Reporter: Hankó Gergely
How to reproduce (the CSV files are attached to HIVE-26653):
{code:java}
set hive.auto.convert.join=true;
set hive.vectorized.execution.enabled=false;
CREATE TABLE table_a (`aid` string) PARTITIONED BY (`p_dt` string)
  row format delimited fields terminated by ',' stored as textfile;
CREATE TABLE table_b (`bid` string) PARTITIONED BY (`p_dt` string)
  row format delimited fields terminated by ',' stored as textfile;
load data local inpath 'table_a.csv' into table table_a;
load data local inpath 'table_b.csv' into table table_b;
SELECT a.p_dt
FROM ((SELECT p_dt FROM table_b GROUP BY p_dt) a
      JOIN (SELECT p_dt FROM table_a) b ON a.p_dt = b.p_dt)
WHERE a.p_dt = translate(cast(to_date(date_sub('2022-08-01', 1)) AS string), '-', '');
{code}
Result:
{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception from MapJoinOperator : Index: 0, Size: 0
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:595)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:173)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:155)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
    ... 19 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:659)
    at java.util.ArrayList.get(ArrayList.java:435)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:918)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:1013)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:582)
    ... 23 more
{code}
Expected:
{code:java}
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
20220731
{code}
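The expected value 20220731 follows from the WHERE constant: the day before 2022-08-01, with the hyphens removed. As a quick cross-check of that arithmetic, here is a small Java sketch that uses java.time instead of Hive's UDFs; the class and method names are made up for illustration and are not part of Hive.

```java
import java.time.LocalDate;

class FilterConstant {
    // Illustrative equivalent of
    // translate(cast(to_date(date_sub(isoDate, 1)) AS string), '-', '')
    // for an ISO yyyy-MM-dd input; this is not Hive code.
    static String compute(String isoDate) {
        LocalDate previousDay = LocalDate.parse(isoDate).minusDays(1);
        // LocalDate.toString() yields yyyy-MM-dd, so dropping '-' gives yyyyMMdd.
        return previousDay.toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(compute("2022-08-01")); // prints 20220731
    }
}
```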
Additional info:
The exception occurs because _col1 is pruned at
ColumnPrunerProcFactory:1152. If it is not pruned, the query runs fine.
In vectorized mode the same query returns all NULLs (HIVE-26653), and keeping
_col1 does not fix that, so I'm not entirely sure that this is the same issue
as HIVE-26653, but the root causes are probably similar.
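The "Index: 0, Size: 0" in the trace is what ArrayList.get(0) reports on an empty list. The sketch below only illustrates that failure mode in isolation. It is not Hive's code, and treating the pruned column as leaving an empty row list is an assumption made for the example; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class EmptyRowAccess {
    // Hypothetical stand-in for reading the first column of a join row.
    // Returns true when the read fails because the row has no columns.
    static boolean readFirstColumnFails(List<Object> row) {
        try {
            row.get(0);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> prunedRow = new ArrayList<>();   // assumed: every column pruned away
        List<Object> intactRow = new ArrayList<>();
        intactRow.add("20220731");                    // column kept

        System.out.println(readFirstColumnFails(prunedRow)); // true
        System.out.println(readFirstColumnFails(intactRow)); // false
    }
}
```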