If you want more partitions, then you have to specify the number explicitly:
Rdd.groupByKey(10).mapValues...
I think if you don't specify anything, the number of partitions will be the
number of cores that you have for processing.
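
For reference, here is a minimal sketch of the call above, so you can check the partition counts yourself rather than rely on my guess about the default. The object name, the helper groupInto, the local[2] master, and the sample data are all made up for illustration. It assumes Spark is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; none of these names come from the thread.
object GroupByKeyPartitions {

  // Groups by key into an explicit number of partitions.
  // The Int argument of groupByKey is the partition count of the result.
  def groupInto(rdd: RDD[(String, (Long, Int))],
                numPartitions: Int): RDD[(String, Iterable[(Long, Int)])] =
    rdd.groupByKey(numPartitions)

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a quick test run.
    // On Spark versions before 1.3, also import org.apache.spark.SparkContext._
    val conf = new SparkConf().setMaster("local[2]").setAppName("groupByKey-partitions")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data shaped like (key, (timestamp, value)).
      val rdd = sc.parallelize(Seq(("a", (1L, 10)), ("b", (2L, 20)), ("a", (3L, 30))))

      val grouped = groupInto(rdd, 10)
      println(grouped.partitions.length) // 10

      // mapValues keeps the partitioner, so the partition count is unchanged.
      println(grouped.mapValues(_.size).partitions.length) // 10

      // Without an argument, print the count to see what your setup defaults to.
      println(rdd.groupByKey().partitions.length)
    } finally {
      sc.stop()
    }
  }
}
```

The last println is the quickest way to see what you actually get when no number is passed.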
Thanks
Best Regards
On Sat, Mar 14, 2015 at 12:28 AM, Adrian Mocanu amoc...@verticalscope.com wrote:
Hi
I have an RDD: RDD[(String, scala.Iterable[(Long, Int)])] which I want to write
out to files, one file for each key string.
I tried to trigger a repartition of the RDD by doing a group-by on it. The
grouping gives RDD[(String, scala.Iterable[Iterable[(Long, Int)]])], so I
flattened that: