Re: Filter on Grouped Data

Raghavendra Pandey Fri, 03 Jul 2015 08:22:29 -0700

Why don't you apply the filter first, and then group the data and run
the aggregations?
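
To make the order of operations concrete, here is a minimal sketch in plain Python (no Spark session needed). The sample rows, the addresses, and the helper name top_sender are all made up for illustration; they are not from the original data. On a DataFrame the same shape would be a filter on "To", then groupBy("From"), then a count aggregation. The exact filter predicate depends on how the "To" column is stored, so I have left it out.

```python
from collections import Counter

# Hypothetical sample rows shaped like the quoted DataFrame: (From, To).
# The addresses are invented for this sketch.
rows = [
    ("alice@example.com", ["bob@example.com", "carol@example.com"]),
    ("alice@example.com", ["bob@example.com"]),
    ("dave@example.com", ["bob@example.com"]),
    ("dave@example.com", ["carol@example.com"]),
]


def top_sender(rows, receiver):
    """Return (sender, count) for the sender with the most mails to `receiver`."""
    # Step 1: filter first, keeping only mails addressed to the receiver.
    matching = [sender for sender, to_list in rows if receiver in to_list]
    # Step 2: group by sender and count the mails in each group.
    counts = Counter(matching)
    # Step 3: take the largest group; None if no mail matched the filter.
    return counts.most_common(1)[0] if counts else None


result = top_sender(rows, "bob@example.com")
print(result)
```

Filtering before grouping means only the matching mails are grouped and counted, so the aggregation never has to look inside the groups afterwards.
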
On Jul 3, 2015 1:29 PM, "Megha Sridhar- Cynepia" <megha.sridh...@cynepia.com>
wrote:


> Hi,
>
>
> I have a Spark DataFrame object which, when trimmed, looks like this:
>
>
>
> From                 To                               Subject            Message-ID
> karen....@xyz.com    ['vance.me...@enron.com',        SEC Inquiry        <19952575.1075858>
>                       'jeannie.mandel...@enron.com',
>                       'mary.cl...@enron.com',
>                       'sarah.pal...@enron.com']
>
> elyn.hug...@xyz.com  ['dennis.ve...@enron.com',       Revised documents  <33499184.1075858>
>                       'gina.tay...@enron.com',
>                       'kelly.kimbe...@enron.com']
> ...
>
>
> I have run a groupBy("From") on the above DataFrame and obtained a
> GroupedData object as a result. I need to apply a filter on the grouped
> data (for instance, to find the sender who sent the maximum number of mails
> addressed to a particular receiver in the "To" list).
> Is there a way to accomplish this by applying a filter on the grouped data?
>
>
> Thanks,
> Megha
