List(x.next).iterator gives you the first element from each partition,
which would be 1, 4 and 7 respectively.
On 3/18/15, 10:19 AM, ashish.usoni ashish.us...@gmail.com wrote:
I am trying to understand mapPartitions but I am still not sure how it
works. In the below example it create
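A minimal sketch of the behaviour described above (my own reconstruction, not the original poster's code), assuming the data is 1 to 9 spread over 3 partitions:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("mapPartitions-demo"))

// With 3 slices, the range 1 to 9 splits as (1,2,3), (4,5,6), (7,8,9)
val rdd = sc.parallelize(1 to 9, 3)

// mapPartitions runs the function once per partition; taking only x.next
// keeps just the first element of each partition's iterator
val firsts = rdd.mapPartitions(x => List(x.next).iterator).collect()

println(firsts.mkString(", "))  // 1, 4, 7
sc.stop()
```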
What's the best way to go from:
RDD[(A, B)] to (RDD[A], RDD[B])
If I do:
def separate[A, B](k: RDD[(A, B)]) = (k.map(_._1), k.map(_._2))
which is the obvious solution, but this runs two maps on the cluster. Can I do
some kind of fold instead:
def separate[A, B](l: List[(A, B)]) =
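For the List version, a single-pass fold is possible; this is one way the truncated definition above could be completed (a sketch, and note the standard library's unzip already does the same thing in one traversal):

```scala
// Split a list of pairs into two lists in a single pass over the input.
def separate[A, B](l: List[(A, B)]): (List[A], List[B]) =
  l.foldRight((List.empty[A], List.empty[B])) {
    case ((a, b), (as, bs)) => (a :: as, b :: bs)
  }

separate(List((1, "x"), (2, "y")))  // (List(1, 2), List("x", "y"))
```

For the RDD case there is no single-pass way to produce two separate RDDs; calling k.cache() before the two maps at least avoids recomputing the input twice.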
So the page that talks about settings,
http://spark.apache.org/docs/1.2.1/configuration.html, seems not to apply when
running local contexts. I have a shell script that starts my job:
export SPARK_MASTER_OPTS=-Dsun.io.serialization.extendedDebugInfo=true
export
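SPARK_MASTER_OPTS configures a standalone master process, which never runs in local mode; with a local[*] context the driver JVM is the application JVM itself, so the flag has to reach that JVM directly. A sketch of one way to do that via spark-submit (my-job.jar is a placeholder name, not from the thread):

```shell
# Assumption: the job is launched through spark-submit. In local mode the
# driver JVM is the application JVM, so pass the -D flag to it directly.
spark-submit \
  --master "local[*]" \
  --driver-java-options "-Dsun.io.serialization.extendedDebugInfo=true" \
  my-job.jar
```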