Hi Spark Users Group,

I’m doing a groupBy operation on my DataFrame *df* as follows, to get a
count for each value of col1:

> df.groupBy("col1").agg("col1" -> "count").show // not sure this is the idiomatic way to write it
col1   COUNT(col1#347)
aaa    2
bbb    4
ccc    4
...

I’d like to sort by the resulting count, e.g. with .sort("COUNT(col1#347)"),
but the name of the count column obviously cannot be known in advance.
Intuitively one might retrieve the column name by its index, in the fashion
of R’s data.frame, but Spark doesn’t support that. I have Googled *spark
agg alias* and so forth, and checked DataFrame.as in the Spark API, but
neither helped with this. Am I the only one who has ever got stuck on this
issue, or have I missed something?
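For reference, here is roughly what I am trying to achieve — a minimal sketch assuming the aggregate column can be given a stable alias (the name "cnt" below is just my own placeholder):

```scala
import org.apache.spark.sql.functions.count

// Sketch: alias the aggregated count so it has a predictable
// name ("cnt" is an assumed alias), then sort on that name.
val counted = df.groupBy("col1").agg(count("col1").as("cnt"))
counted.sort(counted("cnt").desc).show()
```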

Regards,
Todd Leo
