Not quite sure, but it could be GC pauses if you are holding too many
objects in memory. You can check the tuning guide
<http://spark.apache.org/docs/1.2.0/tuning.html> if you haven't
already been through it.
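As a quick check (assuming you launch with spark-submit; the exact flags are a sketch, adjust for your setup), you can surface GC activity in the executor logs with standard JVM GC-logging options passed through spark.executor.extraJavaOptions:

```shell
# Print GC details and timestamps in each executor's stdout log,
# so long pauses after a stage "completes" show up as full GC events.
spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --class your.main.Class \  # placeholder class name
  your-app.jar               # placeholder jar
```

The "GC Time" column on each stage's page in the Spark web UI gives the same information per task, which is often the fastest way to confirm or rule out GC as the cause.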

Thanks
Best Regards

On Sat, Jan 31, 2015 at 7:22 AM, Corey Nolet <cjno...@gmail.com> wrote:

> We have a series of spark jobs which run in succession over various cached
> datasets, do small groups and transforms, and then call
> saveAsSequenceFile() on them.
>
> Each call to saveAsSequenceFile appears to have done its work - the
> task says it completed in "xxx.xxxxx seconds" - but then it pauses, and the
> pauses are quite significant, sometimes up to 2 minutes. We are trying to
> figure out what's going on during this pause: whether the executors are
> really still writing to the sequence files, or whether a race condition on
> an executor is causing timeouts.
>
> Any ideas? Anyone else seen this happening?
>
>
> We also tried running all the saveAsSequenceFile calls in separate futures
> to see if the waiting would shrink to only 1-2 minutes total, but it looks
> like the waiting still takes the sum of the amount of time it would have
> taken originally (several minutes). Our job runs, in its entirety, in 35
> minutes, and we're estimating that we're spending at least 16 minutes in
> this paused state. What I haven't been able to do is figure out how to
> trace through all the executors. Is there a way to do this? The event logs
> in YARN don't seem to help much with this.
>
