Short answer: yes.

Take a look at: http://spark.apache.org/docs/latest/running-on-yarn.html

Look for "memoryOverhead".
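
For example, something along these lines (this is a sketch assuming a recent
1.x release on YARN; the exact property name and its default may differ in
your version, so check the page above):

  spark-submit --master yarn-cluster \
    --executor-memory 8G \
    --conf spark.yarn.executor.memoryOverhead=2048 \
    ...

The overhead value is in megabytes and gets added on top of the JVM heap when
YARN sizes the container, which is where native allocations (like zlib's)
need to fit.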

On Mon, Jan 12, 2015 at 2:06 PM, Michael Albert
<m_albert...@yahoo.com.invalid> wrote:
> Greetings!
>
> My executors apparently are being terminated because they are
> "running beyond physical memory limits" according to the
> "yarn-hadoop-nodemanager"
> logs on the worker nodes (/mnt/var/log/hadoop on AWS EMR).
> I'm setting the "driver-memory" to 8G.
> However, looking at "stdout" in userlogs, I can see GC going on, but the
> lines look like "6G -> 5G(7.2G), 0.45secs", so the GC seems to think that
> the process is using about 6G of space, not 8G of space.
> However, "ps aux" shows an RSS hovering just below 8G.
>
> The process does a "mapPartitionsWithIndex", and the process uses compression
> which (I believe) calls into the native zlib library
> (the overall purpose is to convert each partition into a "matlab" file).
>
> Could it be that the YARN container is counting both the memory used by the
> JVM proper and the memory used by zlib, but that the GC only "sees" the
> "internal" memory?  So the GC keeps the memory usage "reasonable",
> e.g., 6G in an 8G container, but then zlib grabs some memory, and the
> YARN container then terminates the task?
>
> If so, is there anything I can do to tell YARN to watch for a larger
> memory limit than the one I tell the JVM to use for its memory?
>
> Thanks!
>
> Sincerely,
>  Mike
>
>



-- 
Marcelo
