Hi,
I have an RDD of type RDD[(String, scala.Iterable[(Long, Int)])] that I want to
write out to files, one file per key string.
I tried to trigger a repartition of the RDD by calling groupByKey on it. The
grouping gives RDD[(String, scala.Iterable[Iterable[(Long, Int)]])], so I
flattened the values:
  Rdd.groupByKey().mapValues(x=>x.flatten)

However, when I write the result out with saveAsTextFile, I get only 2 files.

I was under the impression that groupByKey repartitions the data by key and 
that saveAsTextFile writes one file per partition.
What am I doing wrong here?
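
To make that impression concrete, here is a small plain-Scala sketch (no Spark needed) of how hash partitioning by key would behave. It assumes the partition index is the non-negative value of key.hashCode modulo the number of partitions, which is my understanding of Spark's default HashPartitioner. The object name, the keys, and the partition count of 2 are made up for illustration; they are not taken from my real job.

```scala
// Plain Scala, no Spark required.
// Assumption: hash partitioning places a key in partition
// nonNegative(key.hashCode % numPartitions).
object PartitionSketch {
  // Hypothetical helper that mimics the assumed partition assignment.
  def partitionFor(key: String, numPartitions: Int): Int = {
    val raw = key.hashCode % numPartitions
    if (raw < 0) raw + numPartitions else raw
  }

  def main(args: Array[String]): Unit = {
    val keys = Seq("a", "b", "c", "d", "e") // made-up keys
    val numPartitions = 2                    // chosen to match the 2 files I see

    // Group the keys by the partition they would be assigned to.
    val byPartition = keys.groupBy(k => partitionFor(k, numPartitions))

    byPartition.toSeq.sortBy(_._1).foreach { case (p, ks) =>
      println(s"partition $p -> ${ks.mkString(", ")}")
    }
  }
}
```

If this matches what Spark does, then several keys can share one partition, and the number of output files would follow the number of partitions rather than the number of distinct keys.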


Thanks
Adrian
