GuangMing Lu created HIVE-26018:
-----------------------------------

             Summary: The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR
                 Key: HIVE-26018
                 URL: https://issues.apache.org/jira/browse/HIVE-26018
             Project: Hive
          Issue Type: Bug
          Components: Tez
    Affects Versions: 3.1.0, 4.0.0
            Reporter: GuangMing Lu
         Attachments: image-2022-03-09-21-08-17-835.png

The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and the Tez result is not correct. For example:

CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;

insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');

SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);

Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+

Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
+--------+--------+

SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+

Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| ccc    | ccc    |
+--------+--------+



--
This message was sent by Atlassian Jira
(v8.20.1#820001)