Thanks!
I was getting a little confused by this partitioner business. I thought
that by default a pairRDD would be partitioned by a HashPartitioner? Was
this possibly the case in 0.9.3 but not in 1.x?
In any case, I tried your suggestion and the shuffle was removed. Cheers.
One small question
Hi.
I have an RDD that I use repeatedly through many iterations of an
algorithm. To prevent recomputation, I persist the RDD (and incidentally I
also persist and checkpoint its parents).
val consCostConstraintMap = consCost.join(constraintMap).map {
case (cid, (costs,(mid1,_,mid2,_,_))) => {