[ https://issues.apache.org/jira/browse/HIVE-26018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
GuangMing Lu updated HIVE-26018:
--------------------------------
    Description:
The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and the Tez result is not correct. For example:

CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;
insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');

SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);

Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+

Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
+--------+--------+

SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

Hive on Tez result: wrong
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| bbb    | NULL   |
| ccc    | ccc    |
| NULL   | ddd    |
+--------+--------+

Hive on MR result: right
+--------+--------+
| a.key  | b.key  |
+--------+--------+
| aaa    | aaa    |
| ccc    | ccc    |
+--------+--------+

> The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR
> -----------------------------------------------------------------------
>
>                 Key: HIVE-26018
>                 URL: https://issues.apache.org/jira/browse/HIVE-26018
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: GuangMing Lu
>            Priority: Major
>         Attachments: image-2022-03-09-21-08-17-835.png

--
This message was sent by Atlassian Jira
(v8.20.1#820001)