Hi,

I would like to express the following custom aggregation query in Spark SQL:

1. Group the table by the value of Name.
2. For each group, pick the tuple with the maximum value of Age (the ages are distinct within each name).
I am wondering what the best way to do this in Spark SQL is. Should I use a UDAF? Previously I was doing something like the following with the RDD API:

    personRDD.map(t => (t.name, t))
             .reduceByKey((a, b) => if (a.age > b.age) a else b)

Thank you!

Best,
Wenlei
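For context, the reduceByKey call above keeps, for each name, the record with the larger age. A minimal plain-Python sketch of that per-key argmax logic (hypothetical sample data, no Spark involved) would be:

```python
from functools import reduce
from collections import defaultdict

# Hypothetical (name, age) records standing in for personRDD.
records = [("alice", 30), ("bob", 25), ("alice", 41), ("bob", 19)]

# Emulate map(t => (t.name, t)) followed by reduceByKey:
# bucket the tuples by name, then within each bucket keep
# the tuple with the larger age.
grouped = defaultdict(list)
for name, age in records:
    grouped[name].append((name, age))

argmax_by_name = {
    name: reduce(lambda a, b: a if a[1] > b[1] else b, tuples)
    for name, tuples in grouped.items()
}

print(argmax_by_name)  # → {'alice': ('alice', 41), 'bob': ('bob', 25)}
```

The reduce function is associative and commutative here (ages are distinct per name), which is exactly the property reduceByKey relies on to combine partial results across partitions.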