Why don't you apply the filter first, and then group the data and run the aggregations?

On Jul 3, 2015 1:29 PM, "Megha Sridhar- Cynepia" <megha.sridh...@cynepia.com> wrote:
> Hi,
>
> I have a Spark DataFrame object, which when trimmed, looks like,
>
> From                 To                               Subject            Message-ID
> karen....@xyz.com    ['vance.me...@enron.com',        SEC Inquiry        <19952575.1075858>
>                       'jeannie.mandel...@enron.com',
>                       'mary.cl...@enron.com',
>                       'sarah.pal...@enron.com']
>
> elyn.hug...@xyz.com  ['dennis.ve...@enron.com',       Revised documents  <33499184.1075858>
>                       'gina.tay...@enron.com',
>                       'kelly.kimbe...@enron.com']
> .
> .
> .
>
> I have run a groupBy("From") on the above dataFrame and obtained a
> GroupedData object as a result. I need to apply a filter on the grouped
> data (for instance, getting the sender who sent maximum number of the mails
> that were addressed to a particular receiver in the "To" list).
> Is there a way to accomplish this by applying filter on grouped data?
>
> Thanks,
> Megha
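
To make the suggested order of operations concrete (filter first, then group, then aggregate), here is a minimal plain-Python sketch. It does not use Spark: the rows, the addresses, and the helper name `top_sender_to` are all hypothetical stand-ins for the From/To columns in the quoted DataFrame. On a real DataFrame the same order would presumably be a filter on "To" followed by groupBy("From") and a count, but the exact column functions depend on the Spark version and are not shown here.

```python
from collections import Counter

# Hypothetical rows mirroring the From / To columns of the quoted DataFrame.
rows = [
    {"From": "alice@example.com", "To": ["bob@example.com", "carol@example.com"]},
    {"From": "alice@example.com", "To": ["bob@example.com"]},
    {"From": "dave@example.com", "To": ["bob@example.com"]},
    {"From": "dave@example.com", "To": ["carol@example.com"]},
    {"From": "dave@example.com", "To": ["carol@example.com"]},
]


def top_sender_to(rows, receiver):
    """Return (sender, mail_count) for the sender who mailed `receiver` most often.

    Hypothetical helper; returns None when no mail was addressed to `receiver`.
    """
    # Step 1: filter first, keeping only mails whose "To" list has the receiver.
    matching = [row for row in rows if receiver in row["To"]]
    # Step 2: group by sender and count the mails in each group.
    counts = Counter(row["From"] for row in matching)
    # Step 3: pick the group with the largest count.
    return counts.most_common(1)[0] if counts else None


print(top_sender_to(rows, "bob@example.com"))
```

Filtering before grouping means the aggregation only ever sees rows that matter for the question, so there is no need to filter the grouped object itself.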