[ 
https://issues.apache.org/jira/browse/IMPALA-9162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976161#comment-16976161
 ] 

Aman Sinha commented on IMPALA-9162:
------------------------------------

I dumped the relevant log entries below. I believe the equivalence analysis 
for the predicates is still missing something: we end up creating the same 
predicate twice, and one copy remains unassigned and ends up at the outer 
join.  

Note: x.c3 = x.max_c3  should be equivalent to v1.c3 = v1.max_c3 

{noformat}
Analyzer.java:2009] 8549552c4e5139c9:7de47ac400000000] Created inferred 
predicate: BinaryPredicate{op==, SlotRef{label=x.c3, type=INT, id=15} 
SlotRef{label=x.max_c3, type=INT, id=16}, isInferred=true}

Analyzer.java:2009] 8549552c4e5139c9:7de47ac400000000] Created inferred 
predicate: BinaryPredicate{op==, SlotRef{label=v.v1.c3, type=INT, id=12} 
SlotRef{label=v.v1.max_c3, type=INT, id=13}, isInferred=true}
{noformat}
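
To illustrate what "equivalent" would mean for the two log entries above, here is a minimal sketch, not Impala's actual Analyzer logic. It assumes that the inline view slots map onto the underlying view slots (id 15 to id 12, id 16 to id 13), which is my reading of the note above rather than something the log confirms. The names SlotEquivalences, canonical_eq and dedupe are hypothetical helpers. With those slot equivalences registered, both inferred predicates reduce to the same canonical form, so only one of them would need to be kept and assigned.

```python
class SlotEquivalences:
    """Union-find over integer slot ids (hypothetical helper, not Impala code)."""

    def __init__(self):
        self._parent = {}

    def find(self, slot_id):
        # Register unseen slots as their own class representative.
        self._parent.setdefault(slot_id, slot_id)
        root = slot_id
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited slot directly at the root.
        while self._parent[slot_id] != root:
            self._parent[slot_id], slot_id = root, self._parent[slot_id]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Use the smaller id as representative so results are deterministic.
            self._parent[max(ra, rb)] = min(ra, rb)


def canonical_eq(equiv, lhs, rhs):
    """Canonical form of `lhs = rhs`: the unordered pair of class representatives."""
    return frozenset((equiv.find(lhs), equiv.find(rhs)))


def dedupe(equiv, predicates):
    """Keep the first predicate of each canonical form, drop equivalent repeats."""
    seen = set()
    kept = []
    for lhs, rhs in predicates:
        key = canonical_eq(equiv, lhs, rhs)
        if key not in seen:
            seen.add(key)
            kept.append((lhs, rhs))
    return kept


equiv = SlotEquivalences()
equiv.union(15, 12)  # assumed: x.c3 (id=15) maps to v.v1.c3 (id=12)
equiv.union(16, 13)  # assumed: x.max_c3 (id=16) maps to v.v1.max_c3 (id=13)

# The two inferred predicates from the log, written as (lhs slot id, rhs slot id).
inferred = [(15, 16), (12, 13)]
remaining = dedupe(equiv, inferred)
```

Under these assumptions, remaining holds a single predicate, so there is no second copy left over to land at the outer join.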


> Incorrect redundant predicate applied to outer join
> ---------------------------------------------------
>
>                 Key: IMPALA-9162
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9162
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>         Attachments: create.sql.txt
>
>
> Run the attached create.sql script to create the tables and view.  The 
> following query shows an incorrect redundant predicate applied to the outer 
> join.  This seems to be another variant of past issues such as IMPALA-7957 
> and IMPALA-8386.  
> {noformat}
> // Has a redundant predicate as 'Other predicates' on Outer Join
> Query: explain select x.* from (select v1.c3, v1.max_c3 from v.t2 left join 
> v.v1 on  t2.c2=v1.c3) as x
>  06:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]
>    hash predicates: c3 = t2.c2
>    other predicates: c3 = max(c3)
>    runtime filters: RF000 <- t2.c2
>    row-size=20B cardinality=397
>  --13:EXCHANGE [HASH(t2.c2)]
>    00:SCAN HDFS [v.t2]
>       HDFS partitions=1/1 files=1 size=639B
>       row-size=4B cardinality=397
>  12:EXCHANGE [HASH(c3)]
>  05:HASH JOIN [INNER JOIN, BROADCAST]
>    hash predicates: c3 = max(c3)
>    runtime filters: RF002 <- max(c3)
>    row-size=16B cardinality=207
> {noformat}
>          
> By comparison, the following query, which does not have the v1.max_c3 
> column in the SELECT list, produces the correct plan:
> {noformat}
> // Does not have the redundant predicate
> Query: explain select x.* from (select v1.c3 from v.t2 left join v.v1 on  
> t2.c2=v1.c3) as x
>  06:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED]
>    hash predicates: c3 = t2.c2
>    runtime filters: RF000 <- t2.c2
>    row-size=20B cardinality=397
>  --13:EXCHANGE [HASH(t2.c2)]
>    00:SCAN HDFS [v.t2]
>       HDFS partitions=1/1 files=1 size=639B
>       row-size=4B cardinality=397
>  12:EXCHANGE [HASH(c3)]
>  05:HASH JOIN [INNER JOIN, BROADCAST]
>    hash predicates: c3 = max(c3)
>    runtime filters: RF002 <- max(c3)
>    row-size=16B cardinality=207
> {noformat}
> Due to the redundant predicate, the first query produces wrong results. 



