Re: Dataframe Grouping - Sorting - Mapping

2016-09-30 Thread Kevin Mellott
When you perform a .groupBy, you need to perform an aggregate immediately
afterwards — .groupBy alone returns a GroupedData object, not a DataFrame,
so you cannot sort or map it directly.

For example:

import org.apache.spark.sql.functions.sum

val df1 = df.groupBy("colA").agg(sum(df("colB")))
df1.show()

More information and examples can be found in the documentation below.

http://spark.apache.org/docs/1.6.2/api/scala/index.html#org.apache.spark.sql.DataFrame
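For the sort-then-mapPartitions part of your question, one common pattern is to repartition by the grouping column and then sort within each partition, so that mapPartitions sees every row for a given key together and in timestamp order. A sketch (the column names "key" and "timestamp" are placeholders for your own; sortWithinPartitions is available from Spark 1.6 onward, and the per-partition logic here runs on the underlying RDD of Rows):

```scala
import org.apache.spark.sql.Row

// Co-locate all rows for each key in a single partition,
// then sort the rows inside each partition by key and timestamp.
val sorted = df
  .repartition(df("key"))
  .sortWithinPartitions("key", "timestamp")

// Apply per-partition logic on the underlying RDD of Rows.
// Each iterator yields one partition's rows, already in sorted order.
val result = sorted.rdd.mapPartitions { rows: Iterator[Row] =>
  rows.map { row =>
    // replace with your own per-row / per-group functions
    row
  }
}
```

Note this guarantees ordering within a partition, not a global ordering across partitions, which is usually what a per-group mapPartitions needs.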

Thanks,
Kevin

On Fri, Sep 30, 2016 at 5:46 AM, AJT wrote:

> I'm looking to do the following with my Spark dataframe
> (1) val df1 = df.groupBy()
> (2) val df2 = df1.sort()
> (3) val df3 = df2.mapPartitions()
>
> I can already groupBy the column (in this case a long timestamp) - but have
> no idea how to ensure the returned GroupedData is then sorted by the
> same timestamp and then mapped to my set of functions.
>
> Appreciate any help
> Thanks
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Dataframe-Grouping-Sorting-Mapping-tp27821.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>

