Save a Spark RDD to disk

Elf Of Lothlorein Tue, 08 Nov 2016 14:08:48 -0800

Hi,
I am trying to save an RDD to disk using saveAsNewAPIHadoopFile. It takes
almost 20 minutes to write about 900 GB of data. Is there any parameter
that I can tune to make the save faster?
I am running about 45 executors with 5 cores each on 5 Spark worker nodes,
using Spark on YARN.
Thanks for your help.
C
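
For scale, the figures in the question can be turned into average write
rates. This is only a rough sketch: it assumes the 900 GB is the amount
actually written, that 1 GB = 1024 MB, and that the load is spread evenly
across nodes and cores. The function name write_throughput is hypothetical
and is not part of Spark.

```python
def write_throughput(total_gb, minutes, nodes, executors, cores_per_executor):
    """Return average write rates in MB/s for the cluster, per node, and per core."""
    seconds = minutes * 60
    cluster_mb_s = total_gb * 1024 / seconds  # assumes 1 GB = 1024 MB
    total_cores = executors * cores_per_executor
    return {
        "cluster_mb_s": cluster_mb_s,
        "per_node_mb_s": cluster_mb_s / nodes,      # assumes an even spread over nodes
        "per_core_mb_s": cluster_mb_s / total_cores,  # assumes every core writes
    }


# Figures taken from the question: ~900 GB in ~20 minutes,
# 45 executors x 5 cores on 5 worker nodes.
rates = write_throughput(total_gb=900, minutes=20, nodes=5,
                         executors=45, cores_per_executor=5)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions each worker node sustains roughly 154 MB/s, which
is already above what a single 1 Gbit/s network link (about 125 MB/s) can
carry. The hardware is not described in the question, so whether the job is
limited by disk, network, or a Spark setting cannot be determined from
these numbers alone.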
