Martin Rueckl created SPARK-47397:
-------------------------------------

             Summary: count_distinct ignores null values
                 Key: SPARK-47397
                 URL: https://issues.apache.org/jira/browse/SPARK-47397
             Project: Spark
          Issue Type: Bug
          Components: Documentation, Spark Core
    Affects Versions: 3.4.1
            Reporter: Martin Rueckl
         Attachments: image-2024-03-14-16-12-35-267.png

The documentation states that in GROUP BY and count statements, null values 
are not ignored and instead form their own group.
!image-2024-03-14-16-09-13-065.png|width=757,height=138!
!image-2024-03-14-16-09-20-045.png|width=441,height=327!
However, count_distinct does not count null values. 
Either the documentation or the implementation is wrong here.

!image-2024-03-14-16-11-37-714.png!
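
The contrast described above can be reproduced in plain SQL. The sketch below is an illustration only: it runs against SQLite through Python's standard library, not against Spark, and the table name `t` and column name `x` are hypothetical. It shows the general SQL behavior in which GROUP BY puts NULLs into a group of their own while COUNT(DISTINCT ...) skips them. Whether Spark 3.4.1 behaves identically in every case is for the issue itself to establish.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
# This runs on SQLite, not Spark.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")
conn.executemany(
    "INSERT INTO t (x) VALUES (?)",
    [("a",), ("a",), ("b",), (None,), (None,)],
)

# GROUP BY collects all NULLs into a single group of their own.
groups = conn.execute("SELECT x, COUNT(*) FROM t GROUP BY x").fetchall()

# COUNT(DISTINCT x) only counts non-NULL values.
distinct_count = conn.execute("SELECT COUNT(DISTINCT x) FROM t").fetchone()[0]

print(len(groups))     # number of groups, including the NULL group
print(distinct_count)  # number of distinct non-NULL values
```

With this data the number of groups is one higher than the distinct count, and the difference is exactly the NULL group.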

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
