Hi,

DataFrame does not have the boolean option for coalesce; I believe it is only available on RDD.

sourceFrame.coalesce(1, true) // gives a compilation error

On Wed, Jan 6, 2016 at 1:38 AM, Alexander Pivovarov <apivova...@gmail.com> wrote:

> try coalesce(1, true).
>
> On Tue, Jan 5, 2016 at 11:58 AM, unk1102 <umesh.ka...@gmail.com> wrote:
>
>> hi I am trying to save many partitions of Dataframe into one CSV file and it
>> take forever for large data sets of around 5-6 GB.
>>
>> sourceFrame.coalesce(1).write().format("com.databricks.spark.csv").option("gzip").save("/path/hadoop")
>>
>> For small data above code works well but for large data it hangs forever
>> does not move on because of only one partitions has to shuffle data of GBs
>> please help me
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/coalesce-1-saveAsTextfile-takes-forever-tp25886.html
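
For reference, the closest DataFrame-side counterpart to the suggested coalesce(1, true) is probably repartition(1): on an RDD, repartition(n) is defined as coalesce(n, shuffle = true), and DataFrame.repartition also always shuffles. Below is a minimal sketch of that idea, assuming the Spark 1.x Java API (where DataFrame is a class) and the spark-csv package on the classpath. The class name, method name and outputPath parameter are made up for illustration, and the "codec" option is my reading of the spark-csv options rather than something taken from this thread. I have not measured whether this helps with the 5-6 GB case.

```java
import org.apache.spark.sql.DataFrame;

// Hypothetical helper, not part of Spark or spark-csv.
public final class SingleFileCsvWriter {

    private SingleFileCsvWriter() {}

    // Writes sourceFrame as a single gzip-compressed CSV part file.
    public static void writeSingleCsv(DataFrame sourceFrame, String outputPath) {
        sourceFrame
            // repartition(1) inserts a shuffle, so the stages that compute
            // sourceFrame keep their parallelism; only the final write is one task.
            .repartition(1)
            .write()
            .format("com.databricks.spark.csv")
            // Assumption: spark-csv accepts a "codec" option with short names such as "gzip".
            .option("codec", "gzip")
            .save(outputPath);
    }
}
```

The difference from coalesce(1) is that coalesce without a shuffle can collapse the upstream computation into a single task as well, which may be what makes the large job appear to hang.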