Yes. That did not help.

Best Regards,
Ram
From: Ted Yu <yuzhih...@gmail.com>
Date: Wednesday, December 2, 2015 at 3:25 PM
To: Ram VISWANADHA <ram.viswana...@dailymotion.com>
Cc: user <user@spark.apache.org>
Subject: Re: Improve saveAsTextFile performance

Have you tried calling coalesce() before saveAsTextFile()?

Cheers

On Wed, Dec 2, 2015 at 3:15 PM, Ram VISWANADHA 
<ram.viswana...@dailymotion.com> wrote:
JavaRDD.saveAsTextFile is taking a long time to complete. There are 10 tasks: 
the first 9 finish in a reasonable time, but the last task takes much longer. 
The last task holds about 90% of the total number of records. Is there any way 
to parallelize the execution by increasing the number of tasks or by 
distributing the records evenly across tasks?

Thanks in advance.

Best Regards,
Ram
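
A note on why coalesce() may not help with the skew described above, and what
a shuffle-based redistribution would change. Without a shuffle, coalesce() can
only merge whole existing partitions, so the partition holding 90% of the
records stays at least that large. JavaRDD.repartition(n) does shuffle, and
spreads records roughly evenly over n partitions. The plain-Java sketch below
is only a model of that difference, not Spark code: the class SkewSketch, its
method names, the 90,000-record total, and the target of 40 partitions are all
assumptions made for illustration, and the adjacent-partition grouping is a
simplification of how Spark actually picks partitions to merge.

```java
import java.util.Arrays;

// Hypothetical model, not Spark code: each partition is reduced to its record count.
final class SkewSketch {

    // The situation from the thread: 10 partitions, the last one holding
    // 90% of the records (assumed total: 90,000).
    static int[] skewedExample() {
        int[] sizes = new int[10];
        Arrays.fill(sizes, 1_000);
        sizes[9] = 81_000;
        return sizes;
    }

    // Models coalesce(target) without a shuffle: whole partitions are merged,
    // individual records are never split out of the partition they are in.
    // Asking for more partitions than exist changes nothing.
    static int[] mergeWithoutShuffle(int[] sizes, int target) {
        if (target >= sizes.length) {
            return sizes.clone();
        }
        int[] merged = new int[target];
        for (int i = 0; i < sizes.length; i++) {
            merged[i * target / sizes.length] += sizes[i];
        }
        return merged;
    }

    // Models a shuffle-based redistribution (what rdd.repartition(target)
    // aims for): every record may move, so counts come out nearly equal.
    static int[] redistribute(int[] sizes, int target) {
        int total = Arrays.stream(sizes).sum();
        int[] balanced = new int[target];
        for (int i = 0; i < target; i++) {
            balanced[i] = total / target + (i < total % target ? 1 : 0);
        }
        return balanced;
    }

    // Size of the largest partition, i.e. the work of the slowest task.
    static int largest(int[] sizes) {
        return Arrays.stream(sizes).max().orElse(0);
    }

    public static void main(String[] args) {
        int[] sizes = skewedExample();
        System.out.println(largest(mergeWithoutShuffle(sizes, 5)));  // 82000
        System.out.println(largest(redistribute(sizes, 40)));        // 2250
    }
}
```

In a real job the corresponding change would be to call repartition(n) on the
RDD before saveAsTextFile, at the cost of a full shuffle. Whether that cost is
worth it depends on the data size and cluster, which this sketch does not model.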
