However, this returns only the single count column c, without showing the original col1.
On Thu, May 21, 2015 at 11:25 PM Ram Sriharsha <sriharsha....@gmail.com> wrote:

> df.groupBy($"col1").agg(count($"col1").as("c")).show
>
> On Thu, May 21, 2015 at 3:09 AM, SLiZn Liu <sliznmail...@gmail.com> wrote:
>
>> Hi Spark Users Group,
>>
>> I'm doing groupBy operations on my DataFrame *df* as follows, to get a
>> count for each value of col1:
>>
>>> df.groupBy("col1").agg("col1" -> "count").show // I don't know if I
>>> should write it like this.
>>
>> col1 COUNT(col1#347)
>> aaa  2
>> bbb  4
>> ccc  4
>> ...
>> and more...
>>
>> I'd like to sort by the resulting count with .sort("COUNT(col1#347)"),
>> but the column name of the count result obviously cannot be known in
>> advance. Intuitively one might try to retrieve the column name by column
>> index, as in R's data.frame, except Spark doesn't support that. I have
>> Googled *spark agg alias* and so forth, and checked DataFrame.as in the
>> Spark API; neither helped with this. Am I the only one who has ever
>> gotten stuck on this issue, or is there something I have missed?
>>
>> REGARDS,
>> Todd Leo
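For what it's worth, a minimal sketch of the aliased-count approach in Scala, combining the suggestions in this thread (the column names col1 and c are taken from the thread; whether the grouping column is retained in the aggregated output depends on the Spark version, e.g. the spark.sql.retainGroupColumns setting in later releases, so on older versions you may need to select it explicitly):

```scala
import org.apache.spark.sql.functions.count

// Group by col1, count rows per group, and alias the count as "c"
// so the result can be sorted without guessing the generated name
// (e.g. COUNT(col1#347)). Then sort descending by the alias.
val counts = df.groupBy($"col1")
  .agg(count($"col1").as("c"))
  .orderBy($"c".desc)

counts.show()
```

The key point is that .as("c") fixes the aggregate's column name at construction time, so .orderBy (or .sort) can refer to it by a stable name instead of the auto-generated one.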