Re: How to improve performance of saveAsTextFile()

2017-03-11 Thread Yan Facai
How about increasing the number of RDD partitions / rebalancing the data?

On Sat, Mar 11, 2017 at 2:33 PM, Parsian, Mahmoud wrote:
> How to improve performance of JavaRDD.saveAsTextFile("hdfs://…").
> This is taking over 30 minutes on a cluster of 10 nodes.
> Running Spark on YARN.
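To make the repartitioning suggestion concrete, here is a rough sketch of how a partition count might be chosen before the save. The 10 nodes and 120 million entries come from this thread; the 8 cores per node and the class and method names are assumptions for illustration only. The factor of 3 tasks per core follows the Spark tuning guide's general recommendation of 2-3 tasks per CPU core. The actual Spark call is shown only as a comment so the sketch runs without Spark on the classpath.

```java
public class PartitionSizing {

    // Hypothetical helper: total cores in the cluster times a tasks-per-core factor.
    static int suggestPartitions(int nodes, int coresPerNode, int tasksPerCore) {
        if (nodes <= 0 || coresPerNode <= 0 || tasksPerCore <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        return Math.multiplyExact(Math.multiplyExact(nodes, coresPerNode), tasksPerCore);
    }

    // Rough number of records each output task would write (ceiling division).
    static long recordsPerPartition(long records, int partitions) {
        return (records + partitions - 1) / partitions;
    }

    public static void main(String[] args) {
        // 10 nodes is from the thread; 8 cores per node is an assumption.
        int partitions = suggestPartitions(10, 8, 3);
        long perPartition = recordsPerPartition(120_000_000L, partitions);
        System.out.println(partitions + " partitions, about " + perPartition + " records each");

        // With Spark available, the rebalancing step itself would look like:
        //   rdd.repartition(partitions).saveAsTextFile(outputPath);
        // where rdd is the JavaRDD and outputPath is the HDFS target (both placeholders).
    }
}
```

Note that repartition() triggers a full shuffle, so it only pays off when the existing partitions are too few or badly skewed. Checking rdd.getNumPartitions() first shows whether that is the case.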

How to improve performance of saveAsTextFile()

2017-03-10 Thread Parsian, Mahmoud
How to improve performance of JavaRDD.saveAsTextFile("hdfs://…").
This is taking over 30 minutes on a cluster of 10 nodes.
Running Spark on YARN.
JavaRDD has 120 million entries.

Thank you,
Best regards,
Mahmoud