[jira] [Updated] (HIVE-26135) Invalid Anti join conversion may cause missing results

2022-04-25 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-26135:

Description: 
right now I think the following is needed to trigger the issue:
* left outer join
* only select left hand side columns
* conditional which is using some udf
* the nullness of the udf is checked

repro sql; in case the conversion happens the row with 'a' will be missing
{code}
drop table if exists t;
drop table if exists n;

create table t(a string) stored as orc;
create table n(a string) stored as orc;

insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);


explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a 
is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
float) is null;


select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;

{code}

resultset with hive.auto.convert.anti.join enabled:
{code}
+--+
| n.a  |
+--+
| b|
| 3|
+--+
{code}

correct resultset with hive.auto.convert.anti.join disabled:
{code}
+---+
|  n.a  |
+---+
| a |
| b |
| 3 |
| NULL  |
+---+
{code}


workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}


  was:
right now I think the following is needed to trigger the issue:
* left outer join
* only select left hand side columns
* conditional which is using some udf
* the nullness of the udf is checked

repro sql; in case the conversion happens the row with 'a' will be missing
{code}
drop table if exists t;
drop table if exists n;

create table t(a string) stored as orc;
create table n(a string) stored as orc;

insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);


explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a 
is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
float) is null;


select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;

{code}



workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}



> Invalid Anti join conversion may cause missing results
> --
>
> Key: HIVE-26135
> URL: https://issues.apache.org/jira/browse/HIVE-26135
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> right now I think the following is needed to trigger the issue:
> * left outer join
> * only select left hand side columns
> * conditional which is using some udf
> * the nullness of the udf is checked
> repro sql; in case the conversion happens the row with 'a' will be missing
> {code}
> drop table if exists t;
> drop table if exists n;
> create table t(a string) stored as orc;
> create table n(a string) stored as orc;
> insert into t values ('a'),('1'),('2'),(null);
> insert into n values ('a'),('b'),('1'),('3'),(null);
> explain select n.* from n left outer join t on (n.a=t.a) where 
> assert_true(t.a is null) is null;
> explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
> float) is null;
> select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
> null;
> set hive.auto.convert.anti.join=false;
> select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
> null;
> {code}
> resultset with hive.auto.convert.anti.join enabled:
> {code}
> +--+
> | n.a  |
> +--+
> | b|
> | 3|
> +--+
> {code}
> correct resultset with hive.auto.convert.anti.join disabled:
> {code}
> +---+
> |  n.a  |
> +---+
> | a |
> | b |
> | 3 |
> | NULL  |
> +---+
> {code}
> workaround could be to disable the feature:
> {code}
> set hive.auto.convert.anti.join=false;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26135) Invalid Anti join conversion may cause missing results

2022-04-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26135:
--
Labels: pull-request-available  (was: )

> Invalid Anti join conversion may cause missing results
> --
>
> Key: HIVE-26135
> URL: https://issues.apache.org/jira/browse/HIVE-26135
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> right now I think the following is needed to trigger the issue:
> * left outer join
> * only select left hand side columns
> * conditional which is using some udf
> * the nullness of the udf is checked
> repro sql; in case the conversion happens the row with 'a' will be missing
> {code}
> drop table if exists t;
> drop table if exists n;
> create table t(a string) stored as orc;
> create table n(a string) stored as orc;
> insert into t values ('a'),('1'),('2'),(null);
> insert into n values ('a'),('b'),('1'),('3'),(null);
> explain select n.* from n left outer join t on (n.a=t.a) where 
> assert_true(t.a is null) is null;
> explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
> float) is null;
> select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
> null;
> set hive.auto.convert.anti.join=false;
> select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
> null;
> {code}
> workaround could be to disable the feature:
> {code}
> set hive.auto.convert.anti.join=false;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)