Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16622 )

Change subject: IMPALA-10252: fix invalid runtime filters for outer joins
......................................................................


Patch Set 4:

(6 comments)

Looks good to me!

http://gerrit.cloudera.org:8080/#/c/16622/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16622/4//COMMIT_MSG@13
PS4, Line 13: is
nit: if


http://gerrit.cloudera.org:8080/#/c/16622/4//COMMIT_MSG@17
PS4, Line 17: x = isnull(y, 1) can return true even if y is NULL.
I wonder if the root cause is that the null rows from the inner do not 
participate in the filtering during run-time, as otherwise, the result should 
be correct.

In the test below, the values to be sent to the outer is isnull(id, 1).

Query: explain SELECT s.id, s.int_col % 2, v.id, zeroifnull(v.id + 1)
FROM functional_parquet.alltypessmall s LEFT OUTER JOIN (
    SELECT id, int_col
    FROM functional_parquet.alltypestiny t) v ON s.id = v.id
WHERE
s.int_col  = isnull(v.id, 1)

ORDER BY s.id ASC
LIMIT 10
+------------------------------------------------------------------------------------+
| Explain String                                                                
     |
+------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=3.96MB Threads=6                    
     |
| Per-Host Resource Estimates: Memory=52MB                                      
     |
| WARNING: The following tables are missing relevant table and/or column 
statistics. |
| functional_parquet.alltypessmall, functional_parquet.alltypestiny             
     |
|                                                                               
     |
| PLAN-ROOT SINK                                                                
     |
| |                                                                             
     |
| 06:MERGING-EXCHANGE [UNPARTITIONED]                                           
     |
| |  order by: id ASC                                                           
     |
| |  limit: 10                                                                  
     |
| |                                                                             
     |
| 03:TOP-N [LIMIT=10]                                                           
     |
| |  order by: id ASC                                                           
     |
| |  row-size=12B cardinality=10                                                
     |
| |                                                                             
     |
| 02:HASH JOIN [LEFT OUTER JOIN, PARTITIONED]                                   
     |
| |  hash predicates: s.id = id                                                 
     |
| |  other predicates: s.int_col = isnull(id, 1)                                
     |
| |  runtime filters: RF000 <- isnull(id, 1)                                    
     |
| |  row-size=12B cardinality=939                                               
     |
| |                                                                             
     |
| |--05:EXCHANGE [HASH(id)]                                                     
     |
| |  |                                                                          
     |
| |  01:SCAN HDFS [functional_parquet.alltypestiny t]                           
     |
| |     HDFS partitions=4/4 files=4 size=11.92KB                                
     |
| |     row-size=4B cardinality=758                                             
     |
| |                                                                             
     |
| 04:EXCHANGE [HASH(s.id)]                                                      
     |
| |                                                                             
     |
| 00:SCAN HDFS [functional_parquet.alltypessmall s]                             
     |
|    HDFS partitions=4/4 files=4 size=14.76KB                                   
     |
|    runtime filters: RF000 -> s.int_col                                        
     |
|    row-size=8B cardinality=939                                                
     |
+------------------------------------------------------------------------------------+
Fetched 33 row(s) in 0.02s


http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2292
PS4, Line 2292: slots
nit: with all NULL slots into a literal.


http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2293
PS4, Line 2293: the expression
nit: the literal.


http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@373
PS4, Line 373: filterSrcNode.getJoinOp().isLeftOuterJoin() ||
             :               filterSrcNode.getJoinOp().isFullOuterJoin())
I wonder if we need to add the similar guard for the right outer join too. Not 
sure if the right outer join has been transformed into left one here already. 
If so, maybe add a comment.


http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@388
PS4, Line 388: and and
nit: duplication.



--
To view, visit http://gerrit.cloudera.org:8080/16622
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I507af1cc8df15bca21e0d8555019997812087261
Gerrit-Change-Number: 16622
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 26 Oct 2020 15:30:09 +0000
Gerrit-HasComments: Yes

Reply via email to