[ 
https://issues.apache.org/jira/browse/SPARK-49350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

GANHONGNAN updated SPARK-49350:
-------------------------------
    Description: 
{code:sql}
SELECT  cast(-1 AS BIGINT) AS ele1
FROM    (
    SELECT  array(1, 5, 3, 123, 255, 546, 64, 23) AS t
) LATERAL VIEW explode(t) tmp AS ele
WHERE   ele = -1
{code}
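This query returns an empty result, as expected, because no element of the
array equals -1. However, the following query, which wraps the same subquery
in count(DISTINCT ...), returns 1 instead of 0. This result seems wrong.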
{code:sql}
SELECT  count(DISTINCT ele1)
FROM    (
    SELECT  cast(-1 AS BIGINT) AS ele1
    FROM    (
        SELECT  array(1, 5, 3, 123, 255, 546, 64, 23) AS t
    ) LATERAL VIEW explode(t) tmp AS ele
    WHERE   ele = -1
)
{code}
From the plan change log, I found that the FoldablePropagation and
ConstantFolding rules rewrite the Aggregate expression to
`Aggregate [cast(count(distinct -1) as string) AS count(DISTINCT ele)#7]`.
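
As a sanity check on the expected answer, the two queries above can be modelled
in plain Python. This is only a sketch of the SQL semantics, not Spark code:
the helper names `inner_rows` and `count_distinct` are hypothetical and exist
only for this illustration. It shows that the filter removes every exploded
row, so a distinct count over the remaining rows should be 0.

```python
# Plain-Python model of the two queries (an illustration, not Spark code).
arr = [1, 5, 3, 123, 255, 546, 64, 23]


def inner_rows(values):
    """Model of the inner query: explode(t), keep rows where ele = -1,
    and project the constant -1 as ele1 for each surviving row."""
    return [-1 for ele in values if ele == -1]


def count_distinct(rows):
    """Model of count(DISTINCT ele1): number of distinct non-NULL values."""
    return len({r for r in rows if r is not None})


rows = inner_rows(arr)
print(rows)                  # [] : no array element equals -1
print(count_distinct(rows))  # 0  : a distinct count over no rows
```

The constant -1 is only produced per surviving input row, so replacing `ele1`
with the literal -1 inside the aggregate must not change the number of rows
the aggregate sees.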

 

I think it needs to be fixed.

 



> FoldablePropagation rule and ConstantFolding rule leads to wrong aggregated 
> result
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-49350
>                 URL: https://issues.apache.org/jira/browse/SPARK-49350
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: GANHONGNAN
>            Priority: Blocker


