Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16622 )
Change subject: IMPALA-10252: fix invalid runtime filters for outer joins ...................................................................... Patch Set 4: (6 comments) Looks good to me! http://gerrit.cloudera.org:8080/#/c/16622/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16622/4//COMMIT_MSG@13 PS4, Line 13: is nit: if http://gerrit.cloudera.org:8080/#/c/16622/4//COMMIT_MSG@17 PS4, Line 17: x = isnull(y, 1) can return true even if y is NULL. I wonder if the root cause is that the null rows from the inner do not participate in the filtering during run-time, as otherwise, the result should be correct. In the test below, the values to be sent to the outer is isnull(id, 1). Query: explain SELECT s.id, s.int_col % 2, v.id, zeroifnull(v.id + 1) FROM functional_parquet.alltypessmall s LEFT OUTER JOIN ( SELECT id, int_col FROM functional_parquet.alltypestiny t) v ON s.id = v.id WHERE s.int_col = isnull(v.id, 1) ORDER BY s.id ASC LIMIT 10 +------------------------------------------------------------------------------------+ | Explain String | +------------------------------------------------------------------------------------+ | Max Per-Host Resource Reservation: Memory=3.96MB Threads=6 | | Per-Host Resource Estimates: Memory=52MB | | WARNING: The following tables are missing relevant table and/or column statistics. | | functional_parquet.alltypessmall, functional_parquet.alltypestiny | | | | PLAN-ROOT SINK | | | | | 06:MERGING-EXCHANGE [UNPARTITIONED] | | | order by: id ASC | | | limit: 10 | | | | | 03:TOP-N [LIMIT=10] | | | order by: id ASC | | | row-size=12B cardinality=10 | | | | | 02:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] | | | hash predicates: s.id = id | | | other predicates: s.int_col = isnull(id, 1) | | | runtime filters: RF000 <- isnull(id, 1) | | | row-size=12B cardinality=939 | | | | | |--05:EXCHANGE [HASH(id)] | | | | | | | 01:SCAN HDFS [functional_parquet.alltypestiny t] | | | HDFS partitions=4/4 files=4 size=11.92KB | | | row-size=4B cardinality=758 | | | | | 04:EXCHANGE [HASH(s.id)] | | | | | 00:SCAN HDFS [functional_parquet.alltypessmall s] | | HDFS partitions=4/4 files=4 size=14.76KB | | runtime filters: RF000 -> s.int_col | | row-size=8B cardinality=939 | +------------------------------------------------------------------------------------+ Fetched 33 row(s) in 0.02s http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2292 PS4, Line 2292: slots nit: with all NULL slots into a literal. http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2293 PS4, Line 2293: the expression nit: the literal. http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@373 PS4, Line 373: filterSrcNode.getJoinOp().isLeftOuterJoin() || : filterSrcNode.getJoinOp().isFullOuterJoin()) I wonder if we need to add the similar guard for the right outer join too. Not sure if the right outer join has been transformed into left one here already. If so, maybe add a comment. http://gerrit.cloudera.org:8080/#/c/16622/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@388 PS4, Line 388: and and nit: duplication. -- To view, visit http://gerrit.cloudera.org:8080/16622 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I507af1cc8df15bca21e0d8555019997812087261 Gerrit-Change-Number: 16622 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Comment-Date: Mon, 26 Oct 2020 15:30:09 +0000 Gerrit-HasComments: Yes