Martin Rueckl created SPARK-47397:
-------------------------------------

             Summary: count_distinct ignores null values
                 Key: SPARK-47397
                 URL: https://issues.apache.org/jira/browse/SPARK-47397
             Project: Spark
          Issue Type: Bug
          Components: Documentation, Spark Core
    Affects Versions: 3.4.1
            Reporter: Martin Rueckl
         Attachments: image-2024-03-14-16-12-35-267.png

The documentation states that in group by and count statements, null values are not ignored: they form their own groups.

!image-2024-03-14-16-09-13-065.png|width=757,height=138!

!image-2024-03-14-16-09-20-045.png|width=441,height=327!

However, count_distinct does not count null values. Either the documentation or the implementation is wrong here.

!image-2024-03-14-16-11-37-714.png!
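
Since the screenshots may not render everywhere, here is a minimal sketch of the discrepancy in text form. It is an assumption-laden stand-in: it does not run Spark, but uses Python's built-in sqlite3 module to show the usual SQL semantics, where GROUP BY gives NULL keys their own group while COUNT(DISTINCT ...) skips NULLs. The table name `t`, the column name `k`, and the sample rows are made up for illustration. Whether Spark 3.4.1 behaves the same way would need to be confirmed with the equivalent `groupBy` and `count_distinct` calls.

```python
import sqlite3

# Stand-in for a Spark DataFrame: one nullable column with two NULL rows.
# Table name, column name, and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k TEXT)")
conn.executemany(
    "INSERT INTO t (k) VALUES (?)",
    [("a",), ("a",), ("b",), (None,), (None,)],
)

# GROUP BY: NULL keys are not dropped, they form their own group.
group_counts = dict(conn.execute("SELECT k, COUNT(*) FROM t GROUP BY k").fetchall())

# COUNT(DISTINCT k): NULLs are skipped, so only 'a' and 'b' are counted.
distinct_count = conn.execute("SELECT COUNT(DISTINCT k) FROM t").fetchone()[0]

print("groups:", len(group_counts))        # number of groups, including the NULL group
print("count distinct:", distinct_count)   # number of distinct non-NULL values

conn.close()
```

With this data the number of groups is one higher than the distinct count, which is the mismatch reported here: counting the groups and counting distinct values do not agree when the column contains nulls.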