[ https://issues.apache.org/jira/browse/SPARK-49350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
GANHONGNAN updated SPARK-49350:
-------------------------------
    Description:
{code:java}
SELECT cast(-1 AS BIGINT) AS ele1
FROM (
  SELECT array(1, 5, 3, 123, 255, 546, 64, 23) AS t
) LATERAL VIEW explode(t) tmp AS ele
WHERE ele = -1
{code}
This query returns an empty result, as expected, because no array element equals -1. However, the following query, which counts the distinct values of that same empty result, returns 1 instead of 0. This result seems wrong.
{code:java}
SELECT count(DISTINCT ele1)
FROM (
  SELECT cast(-1 AS BIGINT) AS ele1
  FROM (
    SELECT array(1, 5, 3, 123, 255, 546, 64, 23) AS t
  ) LATERAL VIEW explode(t) tmp AS ele
  WHERE ele = -1
)
{code}
The plan change log shows that the FoldablePropagation and ConstantFolding rules rewrite the Aggregate expression to `Aggregate [cast(count(distinct -1) as string) AS count(DISTINCT ele)#7]`.

I think this needs to be fixed.
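For comparison, the expected semantics of the two queries can be sketched outside Spark. This is only an illustration, assuming SQLite (through Python's standard `sqlite3` module) in place of Spark SQL, and a hypothetical table `elems` standing in for `LATERAL VIEW explode(t)`. It does not reproduce the optimizer bug; it only shows the result a count over the empty filtered input should give.

```python
import sqlite3

# Hypothetical stand-in for LATERAL VIEW explode(t): one row per array element.
ELEMENTS = [1, 5, 3, 123, 255, 546, 64, 23]


def run_queries():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE elems (ele INTEGER)")
    conn.executemany("INSERT INTO elems VALUES (?)", [(e,) for e in ELEMENTS])

    # First query from the report: no element equals -1, so no rows come back.
    rows = conn.execute(
        "SELECT CAST(-1 AS BIGINT) AS ele1 FROM elems WHERE ele = -1"
    ).fetchall()

    # Second query: a global aggregate over that empty input yields one row,
    # and COUNT over zero input rows is 0.
    distinct_count = conn.execute(
        "SELECT count(DISTINCT ele1) FROM ("
        "  SELECT CAST(-1 AS BIGINT) AS ele1 FROM elems WHERE ele = -1"
        ")"
    ).fetchone()[0]

    conn.close()
    return rows, distinct_count


if __name__ == "__main__":
    rows, distinct_count = run_queries()
    print(rows)
    print(distinct_count)
```

Under these assumptions the inner query yields no rows and the distinct count is 0, which is the value the Spark query in the report would be expected to return.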
> FoldablePropagation rule and ConstantFolding rule leads to wrong aggregated result
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-49350
>                 URL: https://issues.apache.org/jira/browse/SPARK-49350
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.1
>            Reporter: GANHONGNAN
>            Priority: Blocker