On Fri, Aug 12, 2016 at 11:55 AM, Lee Becker wrote:
> val df = sc.parallelize(Array(("a", "a"), ("b", "c"), ("c", "a"))).toDF("x", "y")
> val grouped = df.groupBy($"x").agg(countDistinct($"y"), collect_set($"y"))
>
This workaround executes with no exceptions:
val
Hi everyone,
I've started experimenting with my codebase to see how much work it will
take to port it from 1.6.1 to 2.0.0. While regression-testing some of my
DataFrame transforms, I've discovered that I can no longer pair a
countDistinct with a collect_set in the same aggregation.
Consider:
val df =