Re: Help taking last value in each group (aggregates)

2017-08-28 Thread Everett Anderson
I'm still unclear on whether orderBy/groupBy + aggregates is a viable approach, or when one can rely on the last or first aggregate functions, but a working alternative is to use window functions with row_number and a filter, roughly like this: import spark.implicits._ val reverseOrdering = Seq("a",
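The quoted snippet is cut off in the archive, so here is a minimal self-contained sketch of the window-function approach it describes: rank rows within each group with row_number over a descending ordering, keep only rank 1 ("last" row), and compute the group sum with a second window. The table and column names (group_id, row_id, col_to_sum) are taken from the question below; the sample data is made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number, sum}

object LastInGroup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("last-in-group")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data matching the question's column names.
    val df = Seq(
      ("g1", 1L, 10), ("g1", 2L, 20),
      ("g2", 3L, 5),  ("g2", 4L, 15)
    ).toDF("group_id", "row_id", "col_to_sum")

    // Order descending so the "last" row in each group gets row_number 1.
    val byGroupDesc = Window.partitionBy("group_id").orderBy(col("row_id").desc)

    val result = df
      .withColumn("rn", row_number().over(byGroupDesc))
      // Group total via an unordered window over the same partition.
      .withColumn("total", sum("col_to_sum").over(Window.partitionBy("group_id")))
      .filter(col("rn") === 1)
      .select(col("group_id"), col("total"), col("row_id").as("last_row_id"))

    result.show()
    spark.stop()
  }
}
```

Unlike last() after a groupBy, row_number over an explicitly ordered window is deterministic, because the ordering is part of the window specification itself.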

Help taking last value in each group (aggregates)

2017-08-28 Thread Everett Anderson
Hi, I'm struggling a little with some unintuitive behavior in the Scala API (Spark 2.0.2). I wrote something like df.orderBy("a", "b").groupBy("group_id").agg(sum("col_to_sum").as("total"), last("row_id").as("last_row_id")) and expected a result with a unique group_id column, a
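The message is truncated in the archive, but the pattern it quotes can be reproduced as below. The data and the "a"/"b" sort columns are hypothetical, invented to make the snippet self-contained. Note that the Spark documentation marks last() as non-deterministic: the sort order from orderBy is not guaranteed to survive the shuffle that groupBy introduces, which is the likely source of the unintuitive results.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{last, sum}

object OrderByGroupBy {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("orderby-groupby")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical data with sort columns "a" and "b" plus the question's columns.
    val df = Seq(
      ("x", "p", "g1", 1L, 10), ("x", "q", "g1", 2L, 20),
      ("y", "p", "g2", 3L, 5),  ("y", "q", "g2", 4L, 15)
    ).toDF("a", "b", "group_id", "row_id", "col_to_sum")

    // The pattern from the question. The sums are correct, but last_row_id
    // depends on row order after the shuffle, so it may vary between runs.
    val result = df
      .orderBy("a", "b")
      .groupBy("group_id")
      .agg(sum("col_to_sum").as("total"), last("row_id").as("last_row_id"))

    result.show()
    spark.stop()
  }
}
```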