Steve Carlin created HIVE-26737:
-----------------------------------
Summary: Subquery returning wrong results when database has
materialized views
Key: HIVE-26737
URL: https://issues.apache.org/jira/browse/HIVE-26737
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Steve Carlin
When HS2 has materialized views in its registry, subqueries with correlated
variables may return wrong results.
An example of this:
{code:java}
CREATE TABLE t_test1(
id int,
int_col int,
year int,
month int
);
CREATE TABLE t_test2(
id int,
int_col int,
year int,
month int
);
CREATE TABLE dummy (
id int
) stored as orc TBLPROPERTIES ('transactional'='true');
CREATE MATERIALIZED VIEW need_a_mat_view_in_registry AS
SELECT * FROM dummy where id > 5;
INSERT INTO t_test1 VALUES (1, 1, 2009, 1), (10,0, 2009, 1);
INSERT INTO t_test2 VALUES (1, 1, 2009, 1);
select id, int_col, year, month from t_test1 s where s.int_col = (select
count(*) from t_test2 t where s.id = t.id) order by id;
{code}
The select statement should produce 2 rows, but it is only producing one.
The CBO plan produced has an inner join instead of a left join.
{code:java}
HiveSortLimit(sort0=[$0], dir0=[ASC])
HiveProject(id=[$0], int_col=[$1], year=[$2], month=[$3])
HiveJoin(condition=[AND(=($0, $5), =($4, $6))], joinType=[inner],
algorithm=[none], cost=[not available])
HiveProject(id=[$0], int_col=[$1], year=[$2], month=[$3],
CAST=[CAST($1):BIGINT])
HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT
NULL(CAST($1):BIGINT))])
HiveTableScan(table=[[default, t_test1]], table:alias=[s])
HiveProject(id=[$0], $f1=[$1])
HiveFilter(condition=[IS NOT NULL($1)])
HiveAggregate(group=[{0}], agg#0=[count()])
HiveFilter(condition=[IS NOT NULL($0)])
HiveTableScan(table=[[default, t_test2]], table:alias=[t]){code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)