[ https://issues.apache.org/jira/browse/IMPALA-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155798#comment-17155798 ]
Tim Armstrong commented on IMPALA-4373: --------------------------------------- IMPALA-9924 does a variant of the LEFT OUTER JOIN rewrite. > Wrong results with correlated WHERE-clause subquery inside a NULL-checking > conditional function. > ------------------------------------------------------------------------------------------------ > > Key: IMPALA-4373 > URL: https://issues.apache.org/jira/browse/IMPALA-4373 > Project: IMPALA > Issue Type: Bug > Components: Frontend > Affects Versions: Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0, > Impala 2.8.0, Impala 2.9.0 > Reporter: Alexander Behm > Priority: Critical > Labels: correctness > > Impala may generate an incorrect plan for queries that have a correlated > scalar subquery as a parameter to a NULL-checking conditional function like > ISNULL(). > Example query and incorrect plan: > {code} > select t1.int_col > from functional.alltypessmall as t1 > where t1.int_col >= isnull > ( > ( > SELECT > MAX(t2.bigint_col) > FROM > functional.alltypestiny AS t2 > WHERE > t1.id = t2.id + 10000 > ), > 0 > ) > Fetched 0 row(s) in 1.09s > Single-node plan: > +-----------------------------------------------------------------------+ > | Explain String | > +-----------------------------------------------------------------------+ > | Estimated Per-Host Requirements: Memory=0B VCores=0 | > | | > | PLAN-ROOT SINK | > | | | > | 03:HASH JOIN [LEFT SEMI JOIN] | > | | hash predicates: t1.id = t2.id + 10000 | > | | other join predicates: t1.int_col >= isnull(max(t2.bigint_col), 0) | > | | runtime filters: RF000 <- t2.id + 10000 | > | | | > | |--02:AGGREGATE [FINALIZE] | > | | | output: max(t2.bigint_col) | > | | | group by: t2.id | > | | | | > | | 01:SCAN HDFS [functional.alltypestiny t2] | > | | partitions=4/4 files=4 size=460B | > | | | > | 00:SCAN HDFS [functional.alltypessmall t1] | > | partitions=4/4 files=4 size=6.32KB | > | runtime filters: RF000 -> t1.id | > +-----------------------------------------------------------------------+ > {code} > The query returns an empty result set but instead should return all rows from > t1 because all invocations of the subquery return NULL, and all rows from t1 > satisfy "t1.int_col >= 0". -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org