Hi Kannan,

I am not sure I have understood your question exactly, but maybe 
reduceByKey or reduceByKeyLocally would better suit your needs.
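
To illustrate what reduceByKey does conceptually, here is a rough sketch in 
plain Python (not Spark code). The helper reduce_by_key, the edge list, and 
the two-way partition split are all made up for the example. The sketch only 
models the idea: values are combined per key inside each partition first, 
and the partial results are then merged.

```python
def reduce_by_key(partitions, func):
    """Toy model of reduceByKey over a list of partitions of (key, value) pairs."""
    # Step 1: combine values per key inside each partition (the "local" work).
    partials = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = func(local[key], value) if key in local else value
        partials.append(local)

    # Step 2: merge the per-partition results into one result per key.
    merged = {}
    for local in partials:
        for key, value in local.items():
            merged[key] = func(merged[key], value) if key in merged else value
    return merged


# Hypothetical edge list as (source vertex, destination vertex),
# split by hand into two partitions.
partitions = [
    [(1, 2), (1, 3), (2, 3)],
    [(1, 4), (3, 1)],
]

# "Mapper" step: emit (vertex, 1) per outgoing edge, so that the
# reduction yields each vertex's out-degree.
mapped = [[(src, 1) for src, _dst in part] for part in partitions]

out_degree = reduce_by_key(mapped, lambda a, b: a + b)
print(out_degree)
```

In Spark itself you would not write this helper. Assuming your edges are 
already in an RDD of (vertex, value) pairs, calling reduceByKey with the 
combining function returns a new distributed RDD, while reduceByKeyLocally 
returns the merged result to the driver as a map. Operations on an RDD are 
spread over its partitions automatically.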



Best,
Yifan LI





> On 17 Feb 2015, at 17:37, Vijayasarathy Kannan <kvi...@vt.edu> wrote:
> 
> Hi,
> 
> I am working on a Spark application that processes graphs and I am trying to 
> do the following.
> 
> - group the vertices (key - vertex, value - set of its outgoing edges)
> - distribute each key to separate processes and process them (like mapper)
> - reduce the results back at the main process
> 
> Does the "groupBy" functionality do the distribution by default?
> Do we have to explicitly use RDDs to enable automatic distribution?
> 
> It'd be great if you could help me understand these and how to go about 
> solving the problem.
> 
> Thanks.
