Raghavendra,

Thanks for the quick reply! I don’t think I included enough information in my 
question. I am hoping to get fields that are not directly part of the 
aggregation. Imagine a DataFrame representing website views with userID, 
datetime, and webpage address columns. How could I find the oldest or newest 
webpage address that a user visited? As I understand it, you can only access 
fields that are part of the aggregation itself.
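
To make the desired result concrete, here is a minimal plain-Python sketch of 
the per-key selection described above. It is not Spark code: the sample rows, 
the helper name reduce_by_key, and the combiners older and newer are all 
hypothetical, chosen only to illustrate keeping the whole row (including the 
webpage address) rather than just the aggregated timestamp.

```python
from datetime import datetime

# Hypothetical sample of page views: (userID, datetime, webpage address).
views = [
    ("u1", datetime(2015, 8, 20, 9, 0), "/home"),
    ("u1", datetime(2015, 8, 21, 10, 30), "/pricing"),
    ("u2", datetime(2015, 8, 19, 14, 15), "/docs"),
    ("u1", datetime(2015, 8, 19, 8, 45), "/blog"),
    ("u2", datetime(2015, 8, 21, 16, 0), "/home"),
]


def reduce_by_key(rows, key, combine):
    """Fold all rows sharing a key into a single row (reduceByKey-style)."""
    result = {}
    for row in rows:
        k = key(row)
        # First row seen for a key is kept as-is; later rows are combined in.
        result[k] = combine(result[k], row) if k in result else row
    return result


def older(a, b):
    """Keep whichever full row has the earlier timestamp."""
    return a if a[1] <= b[1] else b


def newer(a, b):
    """Keep whichever full row has the later timestamp."""
    return a if a[1] >= b[1] else b


oldest = reduce_by_key(views, key=lambda r: r[0], combine=older)
newest = reduce_by_key(views, key=lambda r: r[0], combine=newer)

for user in sorted(oldest):
    # Print the user's first and most recent webpage address.
    print(user, oldest[user][2], newest[user][2])
```

Because the combiner returns a complete row, every field of the winning row 
survives the reduction. In Spark, one possible route to the same behaviour is 
dropping to the underlying RDD (df.rdd) and calling reduceByKey with a similar 
combiner, at the cost of leaving the DataFrame API.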

Thanks,
Impact


> On Aug 21, 2015, at 11:11 AM, Raghavendra Pandey 
> <raghavendra.pan...@gmail.com> wrote:
> 
> Impact,
> You can group the data by key, then sort it by timestamp and take the min 
> to select the oldest value (or the max for the newest).
> 
> On Aug 21, 2015 11:15 PM, "Impact" <nat...@skone.org> wrote:
> I am also looking for a way to achieve the reduceByKey functionality on
> DataFrames. In my case I need to select one particular row (the oldest, based
> on a timestamp column value) by key.
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Aggregate-to-array-or-slice-by-key-with-DataFrames-tp23636p24399.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
