Nicholas Brenwald created HIVE-11603:
----------------------------------------
Summary: IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables
Key: HIVE-11603
URL: https://issues.apache.org/jira/browse/HIVE-11603
Project: Hive
Issue Type: Bug
Affects Versions: 1.3.0
Environment: Hadoop 2.6
Reporter: Nicholas Brenwald
Priority: Minor
Fix For: 2.0.0
Create two empty tables, t1 and t2:
{code}
CREATE TABLE t1(c1 STRING);
CREATE TABLE t2(c1 STRING, c2 INT);
{code}
Create a view on these two tables:
{code}
CREATE VIEW v1 AS
SELECT c1, c2
FROM (
SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
UNION ALL
SELECT c1, c2 FROM t2
) x;
{code}
Then run:
{code}
SELECT COUNT(*) FROM v1
WHERE c2 = 0;
{code}
We expect a result of zero, but instead the query fails with the following stack trace:
{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
    ... 22 more
{code}
Workarounds include disabling predicate pushdown (PPD):
{code}
set hive.optimize.ppd=false;
{code}
Or changing the view so that the NULL placeholder for column c2 is cast to DOUBLE instead of INT:
{code}
CREATE VIEW v1_workaround AS
SELECT c1, c2
FROM (
SELECT c1, CAST(NULL AS DOUBLE) AS c2 FROM t1
UNION ALL
SELECT c1, c2 FROM t2
) x;
{code}
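As a rough sketch of how either workaround might be applied in a single session (assuming the tables and views defined above already exist), the queries could look like the following. Restoring hive.optimize.ppd to true afterwards assumes the session was using the default value; adjust if your configuration differs.
{code}
-- Workaround 1: disable predicate pushdown for this session only.
SET hive.optimize.ppd=false;
SELECT COUNT(*) FROM v1 WHERE c2 = 0;
-- Assumption: the session previously used the default (true), so restore it.
SET hive.optimize.ppd=true;

-- Workaround 2: query the view that casts NULL to DOUBLE; no setting change needed.
SELECT COUNT(*) FROM v1_workaround WHERE c2 = 0;
{code}
With both tables empty, each query should return zero rather than fail.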
The problem seems to occur in branch-1.1, branch-1.2, and branch-1, but appears to be
resolved in master (2.0.0).