Ah, looks like RDD.coalesce(1) solves one part of the problem.

On Wednesday, April 30, 2014 11:15 AM, Peter <thenephili...@yahoo.com> wrote:

Hi

Playing around with Spark & S3, I'm opening multiple objects (CSV files) with:

    val hfile = sc.textFile("s3n://bucket/2014-04-28/")

so hfile is an RDD representing the 10 objects that were "underneath" 2014-04-28. After I've sorted and otherwise transformed the content, I'm trying to write it back to a single object:

    sortedMap.values.map(_.mkString(",")).saveAsTextFile("s3n://bucket/concatted.csv")

Unfortunately, this results in a "folder" named concatted.csv with 10 objects underneath, part-00000 .. part-00010, corresponding to the 10 original objects loaded. How can I achieve the desired behaviour of writing a single object named concatted.csv?

I've tried 0.9.1 and 1.0.0-RC3.

Thanks!
Peter
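
For anyone following along, the coalesce(1) idea mentioned at the top of this thread can be sketched roughly as below. This is only a minimal illustration, not necessarily what the poster ran: the object name SinglePartExample, the helper saveAsSinglePart, the local[2] master, the sample rows and the temporary output directory are all assumptions made up for the example, standing in for the S3 paths and the sortedMap from the original message.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SinglePartExample {

  // Hypothetical helper (not part of Spark): collapse the RDD to a single
  // partition so that saveAsTextFile writes exactly one part file.
  def saveAsSinglePart(lines: RDD[String], outputDir: String): Unit =
    lines.coalesce(1).saveAsTextFile(outputDir)

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a self-contained run.
    val conf = new SparkConf().setAppName("single-part-example").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Stand-in data spread over 3 partitions, in place of sortedMap.values.
      val rows = sc.parallelize(Seq(Seq("a", "1"), Seq("b", "2"), Seq("c", "3")), 3)

      // saveAsTextFile refuses to overwrite, so target a path that does not exist yet.
      val outputDir = Files.createTempDirectory("concatted").resolve("out").toString

      saveAsSinglePart(rows.map(_.mkString(",")), outputDir)
      println(s"Wrote single part file under $outputDir")
    } finally {
      sc.stop()
    }
  }
}
```

Note that this only addresses the number of part files: the output path is still a directory containing part-00000 (plus a _SUCCESS marker), not a single object named concatted.csv, so renaming or copying the part file would be a separate step. Also, coalesce(1) funnels all the data through one task, which may be slow for large outputs.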