Unlike map(), where your function is called on one row at a time, with
mapPartitions() the function is passed the entire content of a partition as
an iterator, and it returns another iterator as the output. I don't write
Scala, but from what I understand of your code snippet, the iterator x can
yield every row in the partition. You are returning after consuming only the
first row (x.next), so the rest of each partition is never read. Hence you
see only 1, 4, 7 in your output: these are the first rows of each of your 3
partitions.
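
To make the difference concrete, here is a rough sketch in plain Scala that
runs without Spark. The partitions helper, the run helper and the three
function names are made up for illustration; the 1-3 / 4-6 / 7-10 split is
only inferred from the Array(1, 4, 7) result, so treat it as an assumption
rather than a guarantee about how Spark slices the range.

```scala
// Stand-ins for the three partitions of sc.parallelize(1 to 10, 3).
// Assumption: the split is 1-3 / 4-6 / 7-10 (consistent with 1, 4, 7 above).
// A def, so every call hands out fresh, unconsumed iterators.
def partitions: Seq[Iterator[Int]] =
  Seq((1 to 3).iterator, (4 to 6).iterator, (7 to 10).iterator)

// Same function as in the question: reads one row and ignores the rest.
val firstOnly: Iterator[Int] => Iterator[Int] = x => List(x.next()).iterator

// Walks the whole iterator lazily and transforms every row.
val doubled: Iterator[Int] => Iterator[Int] = x => x.map(_ * 2)

// Consumes the whole partition and emits a single value for it.
val partitionSum: Iterator[Int] => Iterator[Int] = x => Iterator(x.sum)

// Rough model of mapPartitions(f).collect: apply f once per partition,
// then concatenate the per-partition outputs in order.
def run(f: Iterator[Int] => Iterator[Int]): Seq[Int] =
  partitions.flatMap(p => f(p).toList)

println(run(firstOnly))    // List(1, 4, 7)
println(run(doubled))      // List(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
println(run(partitionSum)) // List(6, 15, 34)
```

In spark-shell the same functions should be usable directly, for example
parallel.mapPartitions(doubled).collect, since mapPartitions takes a
function from an iterator to an iterator.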

Regards
Sab
On 18-Mar-2015 10:50 pm, "ashish.usoni" <ashish.us...@gmail.com> wrote:

> I am trying to understand mapPartitions, but I am still not sure how it
> works.
>
> The example below creates three partitions:
> val parallel = sc.parallelize(1 to 10, 3)
>
> When we then run the following:
> parallel.mapPartitions( x => List(x.next).iterator).collect
>
> it prints:
> Array[Int] = Array(1, 4, 7)
>
> Can someone please explain why it prints only 1, 4, 7?
>
> Thanks,
