Re: map operation clears custom partitioner

2016-02-22 Thread Silvio Fiorito
You can use mapValues to ensure partitioning is not lost. From: Brian London <brianmlon...@gmail.com> Date: Monday, February 22, 2016 at 1:21 PM To: user <user@spark.apache.org> Subject: map operation c
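
To illustrate why a values-only transformation is safe, here is a minimal sketch in plain Python. It is a toy model, not Spark code: partitions are plain lists of (key, value) pairs, and `partition_of`, `partition_by`, `map_values`, `map_pairs`, and `is_consistent` are hypothetical helper names invented for this example. The modulo partitioner is likewise an assumption standing in for whatever custom partitioner is in use.

```python
# Toy model in plain Python (not Spark): a dataset is a list of partitions,
# and each partition is a list of (key, value) pairs.

def partition_of(key, num_partitions):
    """Hypothetical custom partitioner: place integer keys by key modulo count."""
    return key % num_partitions

def partition_by(pairs, num_partitions):
    """Distribute pairs into partitions according to partition_of."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        parts[partition_of(key, num_partitions)].append((key, value))
    return parts

def map_values(parts, f):
    """Transform only the values; every key stays in its partition."""
    return [[(k, f(v)) for k, v in part] for part in parts]

def map_pairs(parts, f):
    """Transform whole records in place; f may return a different key."""
    return [[f(k, v) for k, v in part] for part in parts]

def is_consistent(parts):
    """True if every record sits in the partition its key maps to."""
    n = len(parts)
    return all(partition_of(k, n) == i
               for i, part in enumerate(parts)
               for k, _ in part)

pairs = [(k, k * 10) for k in range(6)]
parts = partition_by(pairs, 3)

scaled = map_values(parts, lambda v: v + 1)           # keys untouched
shifted = map_pairs(parts, lambda k, v: (k + 1, v))   # keys changed

print(is_consistent(scaled))   # True
print(is_consistent(shifted))  # False
```

Because `map_values` never sees the key, the layout produced by the partitioner cannot be invalidated. A general map gives no such guarantee, which is why a framework cannot assume the partitioning still holds after one.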

Re: map operation clears custom partitioner

2016-02-22 Thread Sean Owen
The problem is that your new mapped values may be in the wrong partition, according to your partitioner. Look for methods that have a preservesPartitioning flag, which is a way to indicate that you know the partitioning remains correct. (Like, you partition by keys and didn't change the keys in
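
The role of a preservesPartitioning-style flag can be sketched with a toy class in plain Python. This is not Spark's RDD class: `ToyRDD`, `by_parity`, and the `preserves_partitioning` keyword on the toy `map` are assumptions made for illustration only. The point it tries to show is that the flag is a promise from the caller; the toy model, like the behavior described above, does not check that the keys were really left alone.

```python
class ToyRDD:
    """Toy stand-in for a partitioned dataset (not Spark's RDD class)."""

    def __init__(self, partitions, partitioner=None):
        self.partitions = partitions      # list of lists of (key, value)
        self.partitioner = partitioner    # function key -> index, or None

    def map(self, f, preserves_partitioning=False):
        """Apply f to every record; keep the partitioner only if the caller
        promises that f leaves the keys unchanged."""
        new_parts = [[f(rec) for rec in part] for part in self.partitions]
        kept = self.partitioner if preserves_partitioning else None
        return ToyRDD(new_parts, kept)

    def map_values(self, f):
        """Values-only map: keys cannot change, so the promise is always safe."""
        return self.map(lambda kv: (kv[0], f(kv[1])),
                        preserves_partitioning=True)

def by_parity(key):
    """Hypothetical custom partitioner: even keys to 0, odd keys to 1."""
    return key % 2

rdd = ToyRDD([[(0, "a"), (2, "b")], [(1, "c")]], partitioner=by_parity)

def upper_value(kv):
    return (kv[0], kv[1].upper())

plain = rdd.map(upper_value)                                  # no promise made
flagged = rdd.map(upper_value, preserves_partitioning=True)   # caller's promise
valued = rdd.map_values(str.upper)

print(plain.partitioner is None)          # True
print(flagged.partitioner is by_parity)   # True
print(valued.partitioner is by_parity)    # True
```

In this model the plain map drops the partitioner even though the keys happened to stay the same, because nothing told it they would. Passing the flag while actually changing keys would leave records in the wrong partition with stale metadata, so the flag should only be set when the keys are known to be untouched.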

map operation clears custom partitioner

2016-02-22 Thread Brian London
It appears that when a custom partitioner is applied in a groupBy operation, it is not propagated through subsequent non-shuffle operations. Is this intentional? Is there any way to carry custom partitioning through maps? I've uploaded a gist that exhibits the behavior.