[ 
https://issues.apache.org/jira/browse/HIVE-26018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GuangMing Lu updated HIVE-26018:
--------------------------------
    Description: 
The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and 
the Tez result is not correct. For example:

CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;

insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');

SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);

Hive on Tez result: wrong
+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|bbb    |NULL   |
|ccc    |ccc    |
|NULL   |ddd    |
+-------+-------+

Hive on MR result: right
+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|bbb    |NULL   |
|ccc    |ccc    |
+-------+-------+

SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

Hive on Tez result: wrong
+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|bbb    |NULL   |
|ccc    |ccc    |
|NULL   |ddd    |
+-------+-------+

Hive on MR result: right
+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|ccc    |ccc    |
+-------+-------+
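
To check the discrepancy in a single session, the same query can be run under each engine in turn. The following is a minimal sketch, assuming the cluster has both engines configured and that the session is allowed to change hive.execution.engine (neither is stated in this report):

```sql
-- Run the reproduction on MapReduce.
SET hive.execution.engine=mr;
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

-- Run the same query on Tez and compare the returned rows.
SET hive.execution.engine=tez;
SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);
```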

 

 

  was:
The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR, and 
the Tez result is not correct. For example:

CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc;
CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc;

insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333');
insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333');

SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key);

Hive on Tez result: wrong

+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|bbb    |NULL   |
|ccc    |ccc    |
|NULL   |ddd    |
+-------+-------+

Hive on MR result: right

+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|bbb    |NULL   |
|ccc    |ccc    |
+-------+-------+

SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key);

Hive on Tez result: wrong

+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|bbb    |NULL   |
|ccc    |ccc    |
|NULL   |ddd    |
+-------+-------+

Hive on MR result: right

+-------+-------+
|a.key  |b.key  |
+-------+-------+
|aaa    |aaa    |
|ccc    |ccc    |
+-------+-------+

 


> The result of UNIQUEJOIN on Hive on Tez is inconsistent with that of MR
> -----------------------------------------------------------------------
>
>                 Key: HIVE-26018
>                 URL: https://issues.apache.org/jira/browse/HIVE-26018
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: GuangMing Lu
>            Priority: Major
>         Attachments: image-2022-03-09-21-08-17-835.png



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
