[ 
https://issues.apache.org/jira/browse/HIVE-29175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18017704#comment-18017704
 ] 

Thomas Rebele commented on HIVE-29175:
--------------------------------------

I think this is because of the {{t2.col2=t1.col2}} condition when joining t3. 
The same condition holds when joining t1 and t2. Except if t2 could not be 
joined. In that case, t2.col2 is NULL, therefore t3 will not be joined either.

The plain without antijoin:
{code:java}
HiveProject(col1=[$0], col2=[$1])
  HiveFilter(condition=[IS NULL($4)])
    HiveJoin(condition=[AND(=($4, $0), =($3, $1))], joinType=[left], 
algorithm=[none], cost=[not available])
      HiveJoin(condition=[AND(=($2, $0), =($3, $1))], joinType=[left], 
algorithm=[none], cost=[not available])
        HiveProject(col1=[$0], col2=[$1])
          HiveTableScan(table=[[default, tab1]], table:alias=[t1])
        HiveProject(col1=[$0], col2=[$1])
          HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
            HiveTableScan(table=[[default, tab2]], table:alias=[t2])
      HiveProject(col1=[$0])
        HiveFilter(condition=[IS NOT NULL($0)])
          HiveTableScan(table=[[default, tab1]], table:alias=[t3]) {code}
and with antijoin:
{code:java}
HiveProject(col1=[$0], col2=[$1])
  HiveAntiJoin(condition=[AND(=($4, $0), =($3, $1))], joinType=[anti])
    HiveJoin(condition=[AND(=($2, $0), =($3, $1))], joinType=[left], 
algorithm=[none], cost=[not available])
      HiveProject(col1=[$0], col2=[$1])
        HiveTableScan(table=[[default, tab1]], table:alias=[t1])
      HiveProject(col1=[$0], col2=[$1])
        HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
          HiveTableScan(table=[[default, tab2]], table:alias=[t2])
    HiveProject(col1=[$0])
      HiveFilter(condition=[IS NOT NULL($0)])
        HiveTableScan(table=[[default, tab1]], table:alias=[t3]){code}
Maybe the antijoin condition needs to be adapted.

> Wrong Anti join conversion leading to missing results
> -----------------------------------------------------
>
>                 Key: HIVE-29175
>                 URL: https://issues.apache.org/jira/browse/HIVE-29175
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Dayakar M
>            Assignee: Dayakar M
>            Priority: Major
>
> Wrong Anti join conversion leading to missing results.
> Reproduction steps:
> {noformat}
> create table tab1 (col1 int, col2 int);
> create table tab2 (col1 int, col2 int);
> select t1.col1, t1.col2 from tab1 t1
> left join tab2 t2
> on t2.col1=t1.col1 AND t2.col2=t1.col2
> left join tab1 t3
> on t3.col1=t1.col1 AND t2.col2=t1.col2
> where t3.col1 is null;{noformat}
> When 
> *set hive.auto.convert.anti.join=false;* 
> then above select query returning below rows
> {noformat}
> 123   5000{noformat}
> but when
> *set hive.auto.convert.anti.join=true;*
> then its returning empty rows. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to