df.groupBy($"col1").agg(count($"col1").as("c")).show
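A fuller sketch of the same idea, aliasing the aggregate and then sorting by that alias. This assumes a local session built with the modern SparkSession entry point and uses hypothetical sample data standing in for the poster's df:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

// Minimal local session for illustration; in spark-shell one already exists.
val spark = SparkSession.builder().master("local[*]").appName("agg-alias").getOrCreate()
import spark.implicits._

// Hypothetical sample data in place of the original df.
val df = Seq("aaa", "aaa", "aaa", "bbb", "ccc").toDF("col1")

// Alias the count so it has a stable name instead of COUNT(col1#347).
val counts = df.groupBy($"col1").agg(count($"col1").as("c"))

// The alias can now be used for sorting.
counts.orderBy($"c".desc).show()
```

Once the aggregate column has an explicit alias, there is no need to guess the generated column name in advance.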
On Thu, May 21, 2015 at 3:09 AM, SLiZn Liu <sliznmail...@gmail.com> wrote:
> Hi Spark Users Group,
>
> I’m doing a groupBy operation on my DataFrame *df* as follows, to get the
> count for each value of col1:
>
>     df.groupBy("col1").agg("col1" -> "count").show // I don't know if I should write it like this.
>
>     col1 COUNT(col1#347)
>     aaa  2
>     bbb  4
>     ccc  4
>     ... and more ...
>
> I’d like to sort by the resulting count with .sort("COUNT(col1#347)"),
> but the column name of the count result obviously cannot be known in
> advance. Intuitively one might try to retrieve the column by its index,
> as in R's data.frame, but Spark doesn't support that. I have Googled *spark
> agg alias* and so forth, and checked DataFrame.as in the Spark API, but
> neither helped with this. Am I the only one who has gotten stuck on this
> issue, or is there something I have missed?
>
> REGARDS,
> Todd Leo