Hi, if I have an RDD[MyClass] and want to partition it by the hash code of MyClass for performance reasons, is there any way to do this without converting it into a pair RDD, RDD[(K, V)], and calling partitionBy?
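For reference, the pair-RDD route described above might look roughly like the minimal sketch below. The fields of MyClass, the local master setting, the sample data, and the partition count of 4 are all placeholders chosen for illustration, not part of the actual job.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical stand-in for the real class; a case class gets a
// structural hashCode, which HashPartitioner uses to pick a partition.
case class MyClass(id: Int, name: String)

object PartitionByHashSketch {
  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("partition-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val caseClassRdd: RDD[MyClass] =
        sc.parallelize(Seq(MyClass(1, "a"), MyClass(2, "b"), MyClass(3, "c")))

      val partitioned: RDD[MyClass] = caseClassRdd
        .map(x => (x, ()))                   // wrap each element as a key with a Unit value
        .partitionBy(new HashPartitioner(4)) // shuffle by the key's hashCode into 4 partitions
        .keys                                // drop the Unit values again

      println(partitioned.getNumPartitions)
    } finally {
      sc.stop()
    }
  }
}
```

Pairing each element with a Unit value keeps the extra payload small, but it still allocates one Tuple2 per element, which is the overhead in question.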
Mapping it to a Tuple2 seems like a waste of space and computation. It looks like PairRDDFunctions.partitionBy() uses a ShuffledRDD[K, V, C], which requires the type parameters K, V, and C. Could I create a new ShuffledRDD[MyClass, MyClass, MyClass](caseClassRdd, new HashPartitioner) directly instead? Cheers, N