[ 
https://issues.apache.org/jira/browse/SPARK-43760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-43760:
-----------------------------------

    Assignee: Andrey Gubichev

> Incorrect attribute nullability after RewriteCorrelatedScalarSubquery leads 
> to incorrect query results
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-43760
>                 URL: https://issues.apache.org/jira/browse/SPARK-43760
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Andrey Gubichev
>            Assignee: Andrey Gubichev
>            Priority: Major
>             Fix For: 3.5.0
>
>
> The following query:
>  
> {code:sql}
> select * from (
>  select t1.id c1, (
>   select t2.id c from range (1, 2) t2
>   where t1.id = t2.id  ) c2
>  from range (1, 3) t1 ) t
> where t.c2 is not null
> -- !query schema
> struct<c1:bigint,c2:bigint>
> -- !query output
> 1     1
> 2     NULL
>  {code}
>  
> should return one row, because the IsNotNull predicate is supposed to remove
> the second row. However, nullability is propagated incorrectly after subquery
> decorrelation: the output of the subquery is declared non-nullable, so the
> optimizer constant-folds the predicate to true and the row is never filtered.
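
The mechanism described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not Spark's code: `Attr`, `left_outer_join_output`, `fold_is_not_null`, and `run_query` are hypothetical names invented for this example. It assumes the correlated scalar subquery is decorrelated into a left outer join, and that the optimizer folds `IS NOT NULL` to true whenever the attribute is declared non-nullable.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Attr:
    """Hypothetical stand-in for a plan attribute with a nullability flag."""
    name: str
    nullable: bool


def left_outer_join_output(left: List[Attr], right: List[Attr],
                           fix_nullability: bool) -> List[Attr]:
    """Output schema of a left outer join in this toy model.

    Unmatched left rows get NULLs on the right side, so right-side
    attributes must become nullable. fix_nullability=False models the bug:
    the right-side attribute keeps its original (non-nullable) flag.
    """
    if fix_nullability:
        right = [replace(a, nullable=True) for a in right]
    return list(left) + list(right)


def fold_is_not_null(attr: Attr) -> Optional[bool]:
    """Return True if `attr IS NOT NULL` may be folded to a literal true,
    or None if the predicate has to be evaluated per row."""
    return True if not attr.nullable else None


def run_query(fix_nullability: bool) -> List[Tuple[int, Optional[int]]]:
    t1 = [1, 2]  # range(1, 3)
    t2 = [1]     # range(1, 2)

    # range() ids are non-nullable on both sides before the join.
    schema = left_outer_join_output([Attr("c1", False)],
                                    [Attr("c2", False)],
                                    fix_nullability)
    c2 = schema[1]

    # Left outer join on t1.id = t2.id: unmatched rows carry None for c2.
    rows = [(i, i if i in t2 else None) for i in t1]

    # WHERE c2 IS NOT NULL, possibly constant-folded away.
    if fold_is_not_null(c2) is True:
        return rows  # predicate folded to true: nothing is filtered
    return [r for r in rows if r[1] is not None]


buggy = run_query(fix_nullability=False)
fixed = run_query(fix_nullability=True)
```

With the stale non-nullable flag, `buggy` keeps the row whose `c2` is NULL, matching the incorrect output reported above. Once the right side of the join is marked nullable, `fixed` contains only the single expected row.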



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
