Tanel Kiis created SPARK-34882:
----------------------------------

Summary: RewriteDistinctAggregates can cause a bug if the aggregator does not ignore NULLs
Key: SPARK-34882
URL: https://issues.apache.org/jira/browse/SPARK-34882
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.2.0
Reporter: Tanel Kiis

{code:title=group-by.sql}
SELECT first(DISTINCT a), last(DISTINCT a), first(DISTINCT b), last(DISTINCT b)
FROM testData WHERE a IS NOT NULL AND b IS NOT NULL;
{code}

{code:title=group-by.sql.out}
-- !query
SELECT first(DISTINCT a), last(DISTINCT a), first(DISTINCT b), last(DISTINCT b)
FROM testData WHERE a IS NOT NULL AND b IS NOT NULL
-- !query schema
struct<first(DISTINCT a):int,last(DISTINCT a):int,first(DISTINCT b):int,last(DISTINCT b):int>
-- !query output
1 3 NULL NULL
{code}

Neither first(DISTINCT b) nor last(DISTINCT b) should be NULL, because the WHERE clause filters out every row in which a or b is NULL.
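
A toy model may help show why an aggregator that does not ignore NULLs is at risk here. It assumes, going beyond what this report states, that the rewrite produces one copy of each input row per distinct column and pads the other column with NULL. The row layout, the `gid` tag, and the helper `first` are all hypothetical; this is plain Python rather than Spark code, and it does not attempt to reproduce the exact output above.

```python
# Toy model only, not Spark code. None stands in for SQL NULL.
rows = [(1, 1), (2, 1), (3, 2)]  # hypothetical (a, b) input with no NULLs

# Assumed layout after the rewrite: one copy of each row per distinct
# column, tagged with a group id, with the other column padded with None.
expanded = [(1, a, None) for a, _ in rows] + [(2, None, b) for _, b in rows]

def first(values):
    """first() that does NOT skip None, like an aggregator that keeps NULLs."""
    values = list(values)
    return values[0] if values else None

# Unguarded: the aggregate over b also sees the padding rows of group 1.
first_b_unguarded = first(b for _, _, b in expanded)

# Guarded: only rows that belong to b's own group (gid == 2) are fed in.
first_b_guarded = first(b for gid, _, b in expanded if gid == 2)

print(first_b_unguarded, first_b_guarded)  # None 1
```

In this sketch the padding value leaks into the result even though the input contains no NULLs, and restricting each aggregate to the rows of its own group avoids it. Whether Spark's eventual fix takes that form is not stated in this report.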