Try using the coalesce function to reduce the output to the desired number of partition files. To merge output files that have already been written, use Hive and run an insert overwrite table with the options below.
set hive.merge.smallfiles.avgsize=2560000; set hive.merge.size.per.task=2560000; set
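
For illustration, the Hive approach described above might look roughly like the sketch below. The table name my_table is hypothetical, and the table is assumed to be unpartitioned. The two hive.merge.mapfiles / hive.merge.mapredfiles lines are my own addition (they are standard Hive properties that switch the merge step on for map-only and map-reduce jobs); they are not a reconstruction of whatever followed the trailing "set" in the post.

```sql
-- Settings quoted from the post; both values are in bytes.
-- avgsize: trigger a merge when the average output file is smaller than this.
SET hive.merge.smallfiles.avgsize=2560000;
-- size.per.task: target size of the merged files.
SET hive.merge.size.per.task=2560000;

-- Assumption (not from the post): make sure the merge step is enabled
-- for both map-only and map-reduce jobs.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- Hypothetical, unpartitioned table: rewrite it onto itself so that
-- Hive's merge step can combine the small part files.
INSERT OVERWRITE TABLE my_table
SELECT * FROM my_table;
```

Whether the merge actually runs depends on the execution engine and file format in use, so it is worth checking the resulting file count on a small table first.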