I had a similar problem to your #2 when I used a lot of caching and then
did shuffling. It looks like when I cached too much there was not enough
space left for other Spark tasks, and the job just hung.

You can try to cache less and see if it improves. The executor logs also
help a lot (watch for log lines about spill), and you can monitor the job
JVMs through Spark's monitoring
(http://spark.apache.org/docs/latest/monitoring.html) together with
Graphite and Grafana.
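
As a rough sketch of what "cache less" can look like in practice (the RDD
name and choice of storage level are illustrative, not from your job):
persist with a level that can spill to disk, and unpersist as soon as the
data is no longer needed, so the shuffle gets the memory back:

    import org.apache.spark.storage.StorageLevel

    // Serialize cached partitions and let them spill to disk instead of
    // pinning everything in memory (illustrative choice of level).
    val cached = bigRdd.persist(StorageLevel.MEMORY_AND_DISK_SER)

    // ... shuffle-heavy stages that reuse cached ...

    // Release the cache once it is no longer needed.
    cached.unpersist()

For the Graphite route, the sink is configured in conf/metrics.properties,
something like this (host is a placeholder):

    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds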

On Tue, Feb 16, 2016 at 2:14 PM, Iulian Dragoș
<iulian.dra...@typesafe.com> wrote:
> Regarding your 2nd problem, my best guess is that you’re seeing GC pauses.
> It’s not unusual, given you’re using 40GB heaps. See for instance this blog
> post:
>
> From conducting numerous tests, we have concluded that unless you are
> utilizing some off-heap technology (e.g. GridGain OffHeap), no Garbage
> Collector provided with JDK will render any kind of stable GC performance
> with heap sizes larger than 16GB. For example, on 50GB heaps we can often
> encounter up to 5 minute GC pauses, with average pauses of 2 to 4 seconds.
>
> Not sure if YARN can do this, but I would try to run with a smaller
> executor heap and more executors per node.
>
> iulian
>
>
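
On Iulian's last suggestion: on YARN the executor sizing can be set through
SparkConf (or the equivalent spark-submit flags). A minimal sketch with
purely illustrative numbers; the idea is several smaller heaps per node
instead of one 40GB heap:

    import org.apache.spark.{SparkConf, SparkContext}

    // Illustrative sizing: 16 executors with 10GB heaps and 4 cores each,
    // instead of a few 40GB executors. Tune to your cluster.
    val conf = new SparkConf()
      .setAppName("my-app")                   // placeholder app name
      .set("spark.executor.instances", "16")  // YARN-specific setting
      .set("spark.executor.memory", "10g")
      .set("spark.executor.cores", "4")
    val sc = new SparkContext(conf)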
