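Nulls are excluded by count(distinct col), as in
spark.sql("SELECT count(distinct col) FROM Table").show(),
whereas a row count such as df.count() still counts a row that holds NULL.
I think this is ANSI SQL behaviour. The two examples below show the difference: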

scala> spark.sql("select distinct count(null)").show(false)
+-----------+
|count(NULL)|
+-----------+
|0          |
+-----------+

scala> spark.sql("select distinct null").count
res1: Long = 1
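
To see the same thing on a column rather than on a literal NULL, here is a minimal sketch. It assumes a local session; the sample values and the view name sample_table are made up for illustration and are not from your data. If the extra row in your case really is a NULL, dropping nulls before counting should bring the two numbers back in line.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; in spark-shell, getOrCreate
// simply returns the session that already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("distinct-null-sketch")
  .getOrCreate()

// Hypothetical sample data: two distinct non-null values plus two NULLs.
val df = spark.sql(
  "SELECT * FROM VALUES ('a'), ('b'), ('a'), (NULL), (NULL) AS v(col)")
df.createOrReplaceTempView("sample_table")

// count(DISTINCT col) skips NULLs: 'a' and 'b' only.
val sqlCount = spark.sql("SELECT count(DISTINCT col) FROM sample_table")
  .first().getLong(0)

// distinct() keeps NULL as one distinct row, so the row count is one higher.
val dfCount = df.select("col").distinct().count()

// Dropping NULL rows first makes the DataFrame count agree with the SQL count.
val nonNullCount = df.select("col").na.drop().distinct().count()

println(s"sql=$sqlCount df=$dfCount df without nulls=$nonNullCount")
// prints: sql=2 df=3 df without nulls=2
```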

Regards,
Hemanth

From: Mohamed Nadjib Mami <mohamed.nadjib.m...@gmail.com>
Date: Thursday, 6 April 2017 at 20.29
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: df.count() returns one more count than SELECT COUNT()

spark.sql("SELECT count(distinct col) FROM Table").show()
