[ 
https://issues.apache.org/jira/browse/IMPALA-9162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha resolved IMPALA-9162.
--------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

> Incorrect redundant predicate applied to outer join
> ---------------------------------------------------
>
>                 Key: IMPALA-9162
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9162
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: Impala 3.4.0
>
>         Attachments: create.sql.txt
>
>
> Run the attached create.sql script to create the tables and view.  The 
> following query shows an incorrect redundant predicate applied to the outer 
> join.  This seems another variant of past issues such as IMPALA-7957 and 
> IMPALA-8386.  
> {noformat}
> // Has a redundant predicate as 'Other predicates' on Outer Join
> Query: explain select x.* from (select v1.c3, v1.max_c3 from v.t2 left join 
> v.v1 on  t2.c2=v1.c3) as x
>                                                                               
>     
>  06:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]
>    hash predicates: c3 = t2.c2
>    other predicates: c3 = max(c3) .    <-- WRONG
>    runtime filters: RF000 <- t2.c2
>    row-size=20B cardinality=397
>                    
>  --13:EXCHANGE [HASH(t2.c2)]                          
>                                                                     
>    00:SCAN HDFS [v.t2]                                            
>       HDFS partitions=1/1 files=1 size=639B
>       row-size=4B cardinality=397                               
>                                                                               
>        
>  12:EXCHANGE [HASH(c3)]                                                       
>        
>                                                                               
>        
>  05:HASH JOIN [INNER JOIN, BROADCAST]                                         
>        
>    hash predicates: c3 = max(c3)                                              
>        
>    runtime filters: RF002 <- max(c3)                                          
>        
>    row-size=16B cardinality=207       
> {noformat}
>          
> By comparison, the following query which does not have the v1.max_c3 column 
> in the SELECT list produces the correct plan:
> {noformat}
> // Does not have the redundant predicate
> Query: explain select x.* from (select v1.c3 from v.t2 left join v.v1 on  
> t2.c2=v1.c3) as x
>  06:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]
>    hash predicates: c3 = t2.c2
>    runtime filters: RF000 <- t2.c2
>    row-size=20B cardinality=397
>  --13:EXCHANGE [HASH(t2.c2)]
>    00:SCAN HDFS [v.t2]
>       HDFS partitions=1/1 files=1 size=639B
>       row-size=4B cardinality=397
>  12:EXCHANGE [HASH(c3)]
>  05:HASH JOIN [INNER JOIN, BROADCAST]
>    hash predicates: c3 = max(c3)
>    runtime filters: RF002 <- max(c3)
>    row-size=16B cardinality=207
> {noformat}
> Due the redundant predicate, the first query produces wrong results. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to