df.groupBy($"col1").agg(count($"col1").as("c")).show
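A fuller sketch of the same idea, aliasing the aggregate and then sorting by that alias. This assumes a local session built with the modern SparkSession entry point and uses hypothetical sample data standing in for the poster's df:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

// Minimal local session for illustration; in spark-shell one already exists.
val spark = SparkSession.builder().master("local[*]").appName("agg-alias").getOrCreate()
import spark.implicits._

// Hypothetical sample data in place of the original df.
val df = Seq("aaa", "aaa", "aaa", "bbb", "ccc").toDF("col1")

// Alias the count so it has a stable name instead of COUNT(col1#347).
val counts = df.groupBy($"col1").agg(count($"col1").as("c"))

// The alias can now be used for sorting.
counts.orderBy($"c".desc).show()
```

Once the aggregate column has an explicit alias, there is no need to guess the generated column name in advance.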
On Thu, May 21, 2015 at 3:09 AM, SLiZn Liu <sliznmail...@gmail.com> wrote:
> Hi Spark Users Group,
>
> I’m doing a groupBy operation on my DataFrame *df* as follows, to get the
> count for each value of col1:
>
>     df.groupBy("col1").agg("col1" -> "count").show // I don't know if I should write it like this.
>
>     col1 COUNT(col1#347)
>     aaa  2
>     bbb  4
>     ccc  4
>     ... and more ...
>
> I’d like to sort by the resulting count with .sort("COUNT(col1#347)"),
> but the column name of the count result obviously cannot be known in
> advance. Intuitively one might try to retrieve the column by its index,
> as in R's data.frame, but Spark doesn't support that. I have Googled *spark
> agg alias* and so forth, and checked DataFrame.as in the Spark API, but
> neither helped with this. Am I the only one who has gotten stuck on this
> issue, or is there something I have missed?
>
> REGARDS,
> Todd Leo