a little heavy in terms of performance, so if you
just want to count, you should probably use approxCountDistinct.
Assaf.
From: Devi P.V [mailto:devip2...@gmail.com]
Sent: Thursday, December 08, 2016 10:38 AM
To: user @spark
Subject: How to find unique values after groupBy() in spark dataf
Hi all,
I have a dataframe like the following:
+---------+----------+
|client_id|Date      |
+---------+----------+
|a        |2016-11-23|
|b        |2016-11-18|
|a        |2016-11-23|
|a        |2016-11-23|
|a        |2016-11-24|
+---------+----------+
I want to find the unique dates for each client_id.
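
For reference, here is a minimal Scala sketch of one way to approach this, not necessarily what the reply at the top of the thread proposed. It assumes Spark 2.1 or later, where approx_count_distinct is the newer name for the approxCountDistinct function mentioned in the reply. The object name, app name, local master setting, and output column names are made up for illustration. The sample rows are the ones from the table above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{approx_count_distinct, collect_set, countDistinct}

// Hypothetical object name; a local session is used only so the sketch runs standalone.
object UniqueDatesPerClient {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("unique-dates-per-client")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq(
      ("a", "2016-11-23"),
      ("b", "2016-11-18"),
      ("a", "2016-11-23"),
      ("a", "2016-11-23"),
      ("a", "2016-11-24")
    ).toDF("client_id", "Date")

    // One row per client_id with an array of its distinct dates.
    // The order of elements inside the array is not guaranteed.
    val uniqueDates = df
      .groupBy("client_id")
      .agg(collect_set("Date").as("unique_dates"))

    // If only the number of distinct dates is needed, count instead of collecting.
    // countDistinct is exact; approx_count_distinct trades accuracy for speed.
    val counts = df
      .groupBy("client_id")
      .agg(
        countDistinct("Date").as("exact_count"),
        approx_count_distinct("Date").as("approx_count")
      )

    uniqueDates.show(false)
    counts.show()

    spark.stop()
  }
}
```

If one row per (client_id, Date) pair is preferred over an array column, df.select("client_id", "Date").distinct() should give the same information in long form.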