Hi, I am using a custom partitioner to partition my JavaPairRDD, where the key is a String.
I use the hashCode of a substring of the key to derive the partition index, but I have noticed that my partitions contain keys for which the partitioner returns a different partition index. Another issue I am facing is that when I sort the RDD further after partitioning, my partition has only keys which are equal.

My partitioner is as below:

    public class BlockPartitioner extends Partitioner {
        private int numPartitions = 8;

        @Override
        public int numPartitions() {
            return numPartitions;
        }

        @Override
        public int getPartition(Object key) {
            // Partition on the first 7 characters of the key.
            String dept = ((String) key).substring(0, 7);
            int partitionId = dept.hashCode();
            return partitionId % numPartitions;
        }
    }

I am using "foreachPartition" of the JavaPairRDD to verify my partitions.

Thanks
Ankur
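
One detail of the index computation above that may be worth checking: String.hashCode() can be negative, and Java's % operator keeps the sign of the dividend, so hashCode() % numPartitions can come out negative. Below is a minimal sketch of a computation that always lands in [0, numPartitions), assuming that is the intent. The class and method names are hypothetical, the handling of keys shorter than 7 characters is an assumption, and the class deliberately does not extend Spark's Partitioner so that it runs with the JDK alone.

```java
// Hypothetical helper, not part of Spark: shows only the index arithmetic.
final class BlockPartitionIndex {

    // Maps a key to an index in [0, numPartitions) based on its first 7 characters.
    static int partitionFor(String key, int numPartitions) {
        // Assumption: keys shorter than 7 characters are used whole.
        String dept = key.substring(0, Math.min(7, key.length()));
        // Math.floorMod returns a non-negative result for a positive modulus,
        // unlike the % operator, which can return a negative value.
        return Math.floorMod(dept.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Illustrative keys only; both share the same 7-character prefix.
        System.out.println(partitionFor("DEPT001-alice", 8));
        System.out.println(partitionFor("DEPT001-bob", 8));
    }
}
```

Inside a real partitioner, the body of getPartition could return the same floorMod expression after casting the key to String. Keys with the same 7-character prefix then map to the same index, which is what a foreachPartition check should observe.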