GuangMing Lu created HIVE-26018:
-----------------------------------
Summary: The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR
Key: HIVE-26018
URL: https://issues.apache.org/jira/browse/HIVE-26018
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 3.1.0, 4.0.0
Reporter: GuangMing Lu
Attachments: image-2022-03-09-21-08-17-835.png
The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and
the Tez result is not correct. For example:
CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;
INSERT INTO T1_n1x VALUES ('aaa', '111'), ('bbb', '222'), ('ccc', '333');
INSERT INTO T2_n1x VALUES ('aaa', '111'), ('ddd', '444'), ('ccc', '333');
SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);
Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+
Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
+--------+--------+
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+
Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| ccc    | ccc    |
+--------+--------+
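To compare the two engines in one session, the queries above can be rerun after switching hive.execution.engine. This is only a minimal sketch, assuming both Tez and MR are configured on the cluster. The report itself does not show how the engines were switched, so the SET statements are an assumption about the reproduction steps.

```sql
-- Assumption: both execution engines are available on the cluster.
-- The original report does not include these SET statements.
SET hive.execution.engine=tez;
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

-- Rerun the identical query on MapReduce and compare the row sets.
SET hive.execution.engine=mr;
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
```

Running the same statement under both settings keeps the tables and data identical, so any difference in the returned rows comes from the engine.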