Re: Reply: GroupBy in Spark / Scala without Agg functions

2018-05-29 Thread Chetan Khatri
I see. Thank you for the explanation, Linyuxin.

On Wed, May 30, 2018 at 6:21 AM, Linyuxin wrote:

> Hi,
>
> Why not group by first then join?
>
> BTW, I don’t think there is any difference between ‘distinct’ and ‘group by’.
>
> Source code of 2.1:
>
>     def distinct(): Dataset[T] = dropDuplicates()

Reply: GroupBy in Spark / Scala without Agg functions

2018-05-29 Thread Linyuxin
Hi,

Why not group by first then join?

BTW, I don’t think there is any difference between ‘distinct’ and ‘group by’: in Spark 2.1, distinct() delegates to dropDuplicates(), which builds an Aggregate node in the logical plan.

Source code of 2.1:

    def distinct(): Dataset[T] = dropDuplicates()
    …
    def dropDuplicates(colNames: Seq[String]): Dataset[T] = withTypedPlan {
      …
      Aggregate(groupCols, aggCols, logicalPlan)
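
A minimal sketch of the point above, assuming a local SparkSession and a toy DataFrame (the object name and the column names `key` and `value` are made up for illustration). It compares distinct() with a group-by over all columns and prints both physical plans, which should each contain an aggregate step:

```scala
import org.apache.spark.sql.SparkSession

object DistinctVsGroupBy {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, only for trying this out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("distinct-vs-groupby")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with one duplicated row.
    val df = Seq(("a", 1), ("a", 1), ("b", 2)).toDF("key", "value")

    // distinct() is dropDuplicates() over all columns.
    val viaDistinct = df.distinct()

    // Group by every column, then discard the count that count() adds.
    val viaGroupBy = df.groupBy("key", "value").count().drop("count")

    // Print the physical plans so they can be compared by eye.
    viaDistinct.explain()
    viaGroupBy.explain()

    // Both results are already de-duplicated, so an empty difference
    // in each direction means they hold the same rows.
    val sameRows =
      viaDistinct.except(viaGroupBy).count() == 0 &&
      viaGroupBy.except(viaDistinct).count() == 0
    println(s"same rows: $sameRows")

    spark.stop()
  }
}
```

The same idea applies to the "group by first, then join" suggestion: de-duplicating the join key on one side before joining keeps the join from multiplying rows, though whether that fits depends on the actual tables involved.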