[ 
https://issues.apache.org/jira/browse/IMPALA-11323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542726#comment-17542726
 ] 

Joe McDonnell commented on IMPALA-11323:
----------------------------------------

If I change the query so that the nulls have different types (smallint vs int), 
then the query works:
{noformat}
[localhost:21050] default> with t as (select 1 a), v as (select distinct a, 
cast(null as smallint)b, cast(null as int)c from t) select distinct a,b,c from 
v union all select distinct a,b,c from v;

+---+------+------+
| a | b    | c    |
+---+------+------+
| 1 | NULL | NULL |
| 1 | NULL | NULL |
+---+------+------+
Fetched 2 row(s) in 0.15s

explain with t as (select 1 a), v as (select distinct a, cast(null as 
smallint)b, cast(null as int)c from t) select distinct a,b,c from v union all 
select distinct a,b,c from v
+--------------------------------------------------------------+
| Explain String                                               |
+--------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=7.88MB Threads=1   |
| Per-Host Resource Estimates: Memory=24MB                     |
| Codegen disabled by planner                                  |
|                                                              |
| PLAN-ROOT SINK                                               |
| |                                                            |
| 00:UNION                                                     |
| |  pass-through-operands: all                                |
| |  row-size=7B cardinality=2                                 |
| |                                                            |
| |--06:AGGREGATE [FINALIZE]                                   |
| |  |  group by: a, CAST(NULL AS SMALLINT), CAST(NULL AS INT) |
| |  |  row-size=7B cardinality=1                              |
| |  |                                                         |
| |  05:AGGREGATE [FINALIZE]                                   |
| |  |  group by: a, CAST(NULL AS SMALLINT), CAST(NULL AS INT) |
| |  |  row-size=7B cardinality=1                              |
| |  |                                                         |
| |  04:UNION                                                  |
| |     constant-operands=1                                    |
| |     row-size=1B cardinality=1                              |
| |                                                            |
| 03:AGGREGATE [FINALIZE]                                      |
| |  group by: a, CAST(NULL AS SMALLINT), CAST(NULL AS INT)    |
| |  row-size=7B cardinality=1                                 |
| |                                                            |
| 02:AGGREGATE [FINALIZE]                                      |
| |  group by: a, CAST(NULL AS SMALLINT), CAST(NULL AS INT)    |
| |  row-size=7B cardinality=1                                 |
| |                                                            |
| 01:UNION                                                     |
|    constant-operands=1                                       |
|    row-size=1B cardinality=1                                 |
+--------------------------------------------------------------+
 {noformat}
Similarly, if I use 1 instead of null for both b and c, then it works, but it 
still has the b = c predicate. If I use 1 for b and 2 for c, then it doesn't 
have the b=c predicate (and it works).

So, this seems to allow there to be an equivalence based on two nulls, but when 
we actually evaluate null = null, it is false.

> Invalid inferred predicates based on null values being equivalent
> -----------------------------------------------------------------
>
>                 Key: IMPALA-11323
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11323
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.2.0
>            Reporter: Joe McDonnell
>            Priority: Blocker
>              Labels: correctness
>
> This query should return two rows, but instead it returns nothing:
> {noformat}
> [localhost:21050] default> with t as (select 1 a), v as (select distinct a, 
> cast(null as smallint)b, cast(null as smallint)c from t) select distinct 
> a,b,c from v union all select distinct a,b,c from v;
> Fetched 0 row(s) in 0.14s{noformat}
> Without the union all, each side is producing a single row, so it stands to 
> reason that the union all would produce two rows:
> {noformat}
> [localhost:21050] default> with t as (select 1 a), v as (select distinct a, 
> cast(null as smallint)b, cast(null as smallint)c from t) select distinct 
> a,b,c from v;
> +---+------+------+
> | a | b    | c    |
> +---+------+------+
> | 1 | NULL | NULL |
> +---+------+------+
> Fetched 1 row(s) in 0.14s
> {noformat}
> Both b and c are hard coded to null in the definition of v. The plan shows 
> that it is producing a b = c predicate:
> {noformat}
> +-------------------------------------------------------------------+
> | Explain String                                                    |
> +-------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=7.88MB Threads=1        |
> | Per-Host Resource Estimates: Memory=24MB                          |
> | Codegen disabled by planner                                       |
> |                                                                   |
> | PLAN-ROOT SINK                                                    |
> | |                                                                 |
> | 00:UNION                                                          |
> | |  pass-through-operands: all                                     |
> | |  row-size=5B cardinality=2                                      |
> | |                                                                 |
> | |--08:SELECT                                                      |
> | |  |  predicates: b = c                                           |
> | |  |  row-size=5B cardinality=1                                   |
> | |  |                                                              |
> | |  07:AGGREGATE [FINALIZE]                                        |
> | |  |  group by: a, CAST(NULL AS SMALLINT), CAST(NULL AS SMALLINT) |
> | |  |  row-size=5B cardinality=1                                   |
> | |  |                                                              |
> | |  06:AGGREGATE [FINALIZE]                                        |
> | |  |  group by: a, CAST(NULL AS SMALLINT)                         |
> | |  |  row-size=3B cardinality=1                                   |
> | |  |                                                              |
> | |  05:UNION                                                       |
> | |     constant-operands=1                                         |
> | |     row-size=1B cardinality=1                                   |
> | |                                                                 |
> | 04:SELECT                                                         |
> | |  predicates: b = c                                              |
> | |  row-size=5B cardinality=1                                      |
> | |                                                                 |
> | 03:AGGREGATE [FINALIZE]                                           |
> | |  group by: a, CAST(NULL AS SMALLINT), CAST(NULL AS SMALLINT)    |
> | |  row-size=5B cardinality=1                                      |
> | |                                                                 |
> | 02:AGGREGATE [FINALIZE]                                           |
> | |  group by: a, CAST(NULL AS SMALLINT)                            |
> | |  row-size=3B cardinality=1                                      |
> | |                                                                 |
> | 01:UNION                                                          |
> |    constant-operands=1                                            |
> |    row-size=1B cardinality=1                                      |
> +-------------------------------------------------------------------+
> {noformat}
> I added tracing to PlanNode's addConjunctsToNode() and the b=c conjunct is 
> generated by Analyzer::createEquivConjuncts(). It survives the inferred 
> identity predicate checks, because it is referring to two different slots.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to