However, this returns only the single count column c, without showing the original col1.
On Thu, May 21, 2015 at 11:25 PM Ram Sriharsha <sriharsha....@gmail.com> wrote:

> df.groupBy($"col1").agg(count($"col1").as("c")).show
>
> On Thu, May 21, 2015 at 3:09 AM, SLiZn Liu <sliznmail...@gmail.com> wrote:
>
>> Hi Spark Users Group,
>>
>> I'm doing groupBy operations on my DataFrame *df* as follows, to get a
>> count for each value of col1:
>>
>>> df.groupBy("col1").agg("col1" -> "count").show // I don't know if I
>>> should write it like this.
>>
>> col1 COUNT(col1#347)
>> aaa  2
>> bbb  4
>> ccc  4
>> ...
>> and more...
>>
>> I'd like to sort by the resulting count with .sort("COUNT(col1#347)"),
>> but the column name of the count result obviously cannot be known in
>> advance. Intuitively one might try to retrieve the column name by column
>> index, as in R's data.frame, except Spark doesn't support that. I have
>> Googled *spark agg alias* and so forth, and checked DataFrame.as in the
>> Spark API; neither helped with this. Am I the only one who has ever
>> gotten stuck on this issue, or is there something I have missed?
>>
>> REGARDS,
>> Todd Leo
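For what it's worth, a minimal sketch of the aliased-count approach in Scala, combining the suggestions in this thread (the column names col1 and c are taken from the thread; whether the grouping column is retained in the aggregated output depends on the Spark version, e.g. the spark.sql.retainGroupColumns setting in later releases, so on older versions you may need to select it explicitly):

```scala
import org.apache.spark.sql.functions.count

// Group by col1, count rows per group, and alias the count as "c"
// so the result can be sorted without guessing the generated name
// (e.g. COUNT(col1#347)). Then sort descending by the alias.
val counts = df.groupBy($"col1")
  .agg(count($"col1").as("c"))
  .orderBy($"c".desc)

counts.show()
```

The key point is that .as("c") fixes the aggregate's column name at construction time, so .orderBy (or .sort) can refer to it by a stable name instead of the auto-generated one.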