Hi, I have an RDD of type RDD[(String, scala.Iterable[(Long, Int)])] that I want to write out to files, one file per key string. I tried to trigger a repartition of the RDD by doing a group by on it. The grouping gives RDD[(String, scala.Iterable[Iterable[(Long, Int)]])], so I flattened the values: Rdd.groupByKey().mapValues(x => x.flatten)
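For reference, here is a minimal sketch of the attempt. The sample data, the local[2] master, the app name, and the output path are made up for illustration and are not my real job. On older Spark versions the pair-RDD methods may also need `import org.apache.spark.SparkContext._`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object GroupAndSave {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master with 2 threads, used only for this sketch.
    val conf = new SparkConf().setAppName("group-and-save").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Made-up sample data standing in for the real RDD.
    val rdd: RDD[(String, Iterable[(Long, Int)])] = sc.parallelize(Seq(
      ("a", Iterable((1L, 10), (2L, 20))),
      ("b", Iterable((3L, 30))),
      ("a", Iterable((4L, 40)))
    ))

    // groupByKey yields Iterable[Iterable[(Long, Int)]] per key;
    // flatten collapses that back to Iterable[(Long, Int)].
    val grouped: RDD[(String, Iterable[(Long, Int)])] =
      rdd.groupByKey().mapValues(x => x.flatten)

    // Number of partitions in the grouped RDD; saveAsTextFile writes
    // one part file per partition.
    println(grouped.partitions.length)

    // Hypothetical output path; the directory must not already exist.
    grouped.saveAsTextFile("/tmp/grouped-output")

    sc.stop()
  }
}
```

Printing grouped.partitions.length before saving shows how many part files to expect, which makes it easy to compare against the number of distinct keys.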
However, when I write the result with saveAsTextFile I get only 2 files. I was under the impression that groupByKey repartitions the data by key and that saveAsTextFile makes one file per partition. What am I doing wrong here?

Thanks,
Adrian