Re: Iterating on RDDs

Imran Rashid Thu, 26 Feb 2015 13:32:34 -0800

val grouped = R.groupBy[VertexId](G).persist(StorageLeve.MEMORY_ONLY_SER)
// or whatever persistence makes more sense for you ...
while(true) {
  val res = grouped.flatMap(F)
  res.collect.foreach(func)
  if(criteria)
     break
}


On Thu, Feb 26, 2015 at 10:56 AM, Vijayasarathy Kannan <kvi...@vt.edu>
wrote:

> Hi,
>
> I have the following use case.
>
> (1) I have an RDD of edges of a graph (say R).
> (2) do a groupBy on R (by say source vertex) and call a function F on each
> group.
> (3) collect the results from Fs and do some computation
> (4) repeat the above steps until some criteria is met
>
> In (2), the groups are always going to be the same (since R is grouped by
> source vertex).
>
> Question:
> Is R distributed every iteration (when in (2)) or is it distributed only
> once when it is created?
>
> A sample code snippet is below.
>
> while(true) {
>   val res = R.groupBy[VertexId](G).flatMap(F)
>   res.collect.foreach(func)
>   if(criteria)
>      break
> }
>
> Since the groups remain the same, what is the best way to go about
> implementing the above logic?
>

Re: Iterating on RDDs

Reply via email to