Hi, I have the following use case.
(1) I have an RDD of edges of a graph (say R). (2) do a groupBy on R (by say source vertex) and call a function F on each group. (3) collect the results from Fs and do some computation (4) repeat the above steps until some criteria is met In (2), the groups are always going to be the same (since R is grouped by source vertex). Question: Is R distributed every iteration (when in (2)) or is it distributed only once when it is created? A sample code snippet is below. while(true) { val res = R.groupBy[VertexId](G).flatMap(F) res.collect.foreach(func) if(criteria) break } Since the groups remain the same, what is the best way to go about implementing the above logic?