I have a use case where my RDD is set up like this:

    Partition 0:
    K1 -> [V1, V2]
    K2 -> [V2]

    Partition 1:
    K3 -> [V1]
    K4 -> [V3]

I want to invert this RDD, but only within each partition, so that the operation does not require a shuffle. It doesn't matter if the keys of the inverted RDD are not unique across partitions, for example:

    Partition 0:
    V1 -> [K1]
    V2 -> [K1, K2]

    Partition 1:
    V1 -> [K3]
    V3 -> [K4]

Is there a way to do a groupBy within each partition only, instead of shuffling the entire dataset?
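
To make the desired behavior concrete, here is a minimal sketch of the per-partition inversion. It assumes PySpark and assumes each record is a `(key, list_of_values)` tuple. The helper name `invert_partition` is hypothetical, and plain Python lists stand in for the two partitions so the snippet runs without a cluster. This is one possible approach, not necessarily the best one.

```python
from collections import defaultdict


def invert_partition(records):
    """Invert (key, [values]) pairs from ONE partition into (value, [keys]) pairs."""
    inverted = defaultdict(list)
    for key, values in records:
        for value in values:
            inverted[value].append(key)
    # Note: the grouped result for the whole partition is held in memory here.
    return iter(inverted.items())


# Plain-Python stand-in for the two partitions described in the question.
partitions = [
    [("K1", ["V1", "V2"]), ("K2", ["V2"])],
    [("K3", ["V1"]), ("K4", ["V3"])],
]
result = [dict(invert_partition(iter(p))) for p in partitions]

# With a real RDD, the same function would be applied once per partition:
#   inverted_rdd = rdd.mapPartitions(invert_partition)
```

Since `mapPartitions` is a narrow transformation, each output partition depends only on the matching input partition, so no shuffle should be triggered. The trade-off in this sketch is that every partition's grouped result has to fit in executor memory.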