try mapPartitionsWithIndex .. below is an example I used earlier. myfunc
logic can be further modified as per your need.
val x = sc.parallelize(List(1,2,3,4,5,6,7,8,9), 3)
def myfunc(index: Int, iter: Iterator[Int]) : Iterator[String] = {
  iter.toList.map(x => index + "," + x).iterator
}
x.mapPartitionsWithIndex(myfunc).collect()
res10: Array[String] = Array(0,1, 0,2, 0,3, 1,4, 1,5, 1,6, 2,7, 2,8, 2,9)

On Tue, Jan 12, 2016 at 2:06 PM, Gokula Krishnan D <email2...@gmail.com>
wrote:

> Hello All -
>
> I'm just trying to understand aggregate() and in the meantime got an
> question.
>
> *Is there any way to view the RDD databased on the partition ?.*
>
> For the instance, the following RDD has 2 partitions
>
> val multi2s = List(2,4,6,8,10,12,14,16,18,20)
> val multi2s_RDD = sc.parallelize(multi2s,2)
>
> is there anyway to view the data based on the partitions (0,1).
>
>
> Thanks & Regards,
> Gokula Krishnan* (Gokul)*
>

Reply via email to