When using the spark-csv package, or when outputting to text files, you end up with files named
test.csv/part-00 rather than a more user-friendly "test.csv", even if there's only one part file.

We can merge the part files using Hadoop's copyMerge, with something like this code from
http://deploymentzone.com/2015/01/30/spark-and-merged-csv-files/ (imports added):

import java.net.URI

import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.apache.spark.SparkContext

def merge(sc: SparkContext, srcPath: String, dstPath: String): Unit = {
  val srcFileSystem = FileSystem.get(new URI(srcPath), sc.hadoopConfiguration)
  val dstFileSystem = FileSystem.get(new URI(dstPath), sc.hadoopConfiguration)
  // Remove any stale output, then concatenate the part files into one file,
  // deleting the source directory afterwards (deleteSource = true).
  dstFileSystem.delete(new Path(dstPath), true)
  FileUtil.copyMerge(srcFileSystem, new Path(srcPath), dstFileSystem,
    new Path(dstPath), true, sc.hadoopConfiguration, null)
}

But does anyone know a way to do this without dropping down to the Hadoop filesystem API?

Thanks,

Ewan
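For context, a call sequence might look like the sketch below. It assumes the copyMerge-based merge helper from the blog post above is in scope, along with a live SparkContext `sc` and DataFrame `df`; the paths "out.csv.dir" and "out.csv" are purely illustrative:

// Hypothetical usage sketch, not a tested recipe.
// Write the CSV out (produces out.csv.dir/part-* files via spark-csv),
// then collapse the part files into a single out.csv.
df.write.format("com.databricks.spark.csv").save("out.csv.dir")
merge(sc, "out.csv.dir", "out.csv")

Note that even df.coalesce(1) before the write still leaves the single part file inside a directory, which is why a post-write merge step like this is needed.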