You can use mapValues to ensure partitioning is not lost.

From: Brian London <brianmlon...@gmail.com>
Date: Monday, February 22, 2016 at 1:21 PM
To: user <user@spark.apache.org>
Subject: map operation clears custom partitioner

It appears that when a custom partitioner is applied in a groupBy operation, it is not propagated through subsequent non-shuffle operations. Is this intentional? Is there any way to carry custom partitioning through maps? I've uploaded a gist that exhibits the behavior:
https://gist.github.com/BrianLondon/c3c3355d1971971f3ec6
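
For anyone reading this thread later, here is a minimal sketch of the mapValues suggestion. It is not the code from the gist. ParityPartitioner, PartitionerDemo, and the local[2] master are hypothetical names and settings chosen for illustration. The idea is that map may change keys, so Spark drops the partitioner, while mapValues cannot touch keys and therefore keeps it. mapPartitions with preservesPartitioning = true is another option when the function needs to read the key, provided it really does leave the keys unchanged.

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical partitioner for illustration: routes Int keys by parity.
class ParityPartitioner extends Partitioner {
  override def numPartitions: Int = 2
  override def getPartition(key: Any): Int = key match {
    case k: Int => math.abs(k % 2)
    case _      => 0
  }
  // Value equality so Spark can recognise two instances as the same partitioning.
  override def equals(other: Any): Boolean = other.isInstanceOf[ParityPartitioner]
  override def hashCode: Int = 2
}

object PartitionerDemo {
  // Returns whether a partitioner is defined after:
  // (groupBy, map, mapValues, mapPartitions with preservesPartitioning = true)
  def run(): (Boolean, Boolean, Boolean, Boolean) = {
    // Assumption: a local master, just for the demo.
    val conf = new SparkConf().setAppName("partitioner-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // groupBy with an explicit partitioner, as in the original question.
      val grouped = sc.parallelize(1 to 10).groupBy((x: Int) => x % 4, new ParityPartitioner)

      // map could change the keys, so the partitioner is discarded.
      val viaMap = grouped.map { case (k, vs) => (k, vs.sum) }

      // mapValues cannot change the keys, so the partitioner is kept.
      val viaMapValues = grouped.mapValues(_.sum)

      // The caller promises here that the keys are left untouched.
      val viaMapPartitions = grouped.mapPartitions(
        _.map { case (k, vs) => (k, vs.sum + k) },
        preservesPartitioning = true)

      (grouped.partitioner.isDefined,
       viaMap.partitioner.isDefined,
       viaMapValues.partitioner.isDefined,
       viaMapPartitions.partitioner.isDefined)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (afterGroupBy, afterMap, afterMapValues, afterMapPartitions) = run()
    println(s"groupBy: $afterGroupBy")             // true
    println(s"map: $afterMap")                     // false
    println(s"mapValues: $afterMapValues")         // true
    println(s"mapPartitions: $afterMapPartitions") // true
  }
}
```

Note that preservesPartitioning = true is only a promise to Spark. If the function does change the keys, the RDD will report a partitioner that no longer matches where the records actually live, so mapValues is the safer choice whenever it is enough.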