Great answer, thank you Max for a very detailed explanation! It illuminates how the off-heap parameter affects memory allocation.
I read this post: https://blogs.oracle.com/jrockit/entry/why_is_my_jvm_process_larger_t and the thing that jumped out at me is the memory allocated for JNI libraries. I do use a native library in my application, which is likely the culprit. I need to account for its memory footprint when doing my memory calculations.

Thanks,
Timur

On Mon, Apr 25, 2016 at 10:28 AM, Maximilian Michels <m...@apache.org> wrote:
> Hi Timur,
>
> Shedding some light on the memory calculation:
>
> You have a total memory size of 2500 MB for each TaskManager. The
> default for 'taskmanager.memory.fraction' is 0.7. This is the fraction
> of the memory used by the memory manager. When you have turned on
> off-heap memory, this memory is allocated off-heap. As you pointed
> out, the default Yarn cutoff ratio is 0.25.
>
> Memory cutoff for Yarn: 2500 MB * 0.25 = 625 MB
> Java heap size with off-heap disabled: 2500 MB - 625 MB = 1875 MB
> Java heap size with off-heap enabled: (2500 MB - 625 MB) * 0.3 = 562.5 MB (~570 MB in your case)
> Off-heap memory size: (2500 MB - 625 MB) * 0.7 = 1312.5 MB
>
> The heap memory limits in your log seem to be calculated correctly.
> Note that we don't set a strict limit for the off-heap memory because
> the Flink memory manager controls the amount of memory allocated. It
> will preallocate memory when you have 'taskmanager.memory.preallocate'
> set to true. Otherwise it will allocate dynamically. Still, you should
> have about 500 MB of memory left with everything allocated. There is
> some more direct (off-heap) memory allocated for the network stack,
> adjustable with 'taskmanager.network.numberOfBuffers', which is set to
> 2048 by default and corresponds to 2048 * 32 KB = 64 MB of memory. I
> believe this can grow up to twice that size. Still, there should be
> enough memory left.
>
> Are you running a streaming or batch job?
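The arithmetic Max walks through above can be sketched in a few lines of Python. This is a minimal illustration using the default values quoted in this thread; `flink_memory_breakdown` is a hypothetical helper name, not Flink's actual API, and Flink's real rounding differs slightly (the log shows -Xmx570m where this yields 562.5).

```python
def flink_memory_breakdown(total_mb, cutoff_ratio=0.25, memory_fraction=0.7,
                           off_heap=False):
    """Return (heap_mb, off_heap_mb) for a TaskManager container.

    cutoff_ratio    -> yarn.heap-cutoff-ratio (default 0.25)
    memory_fraction -> taskmanager.memory.fraction (default 0.7)
    """
    cutoff_mb = total_mb * cutoff_ratio   # reserved for JVM overhead on Yarn
    usable_mb = total_mb - cutoff_mb
    if off_heap:
        # managed memory moves off-heap; the heap keeps the remainder
        off_heap_mb = usable_mb * memory_fraction
        heap_mb = usable_mb - off_heap_mb
    else:
        off_heap_mb = 0.0
        heap_mb = usable_mb
    return heap_mb, off_heap_mb

print(flink_memory_breakdown(2500))                 # (1875.0, 0.0)
print(flink_memory_breakdown(2500, off_heap=True))  # (562.5, 1312.5)
```

With off-heap disabled the whole post-cutoff budget goes to the heap; with it enabled, the heap shrinks to the (1 - fraction) share, matching the numbers in Max's reply.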
> Off-heap memory and memory
> preallocation are mostly beneficial for batch jobs which use the
> memory manager a lot for sorting, hashing and caching.
>
> For streaming I'd suggest using Flink's defaults:
>
> taskmanager.memory.off-heap: false
> taskmanager.memory.preallocate: false
>
> Raising the cutoff ratio should prevent the killing of the
> TaskManagers. As Robert mentioned, in practice the JVM tends to
> allocate more than the maximum specified heap size. You can put the
> following in your flink-conf.yaml:
>
> # slightly raise the cutoff ratio (might need to be even higher)
> yarn.heap-cutoff-ratio: 0.3
>
> Thanks,
> Max
>
> On Mon, Apr 25, 2016 at 5:52 PM, Timur Fayruzov <timur.fairu...@gmail.com> wrote:
> > Hello Maximilian,
> >
> > I'm using 1.0.0 compiled with Scala 2.11 and Hadoop 2.7. I'm running
> > this on EMR. I didn't see any exceptions in other logs. What are the
> > logs you are interested in?
> >
> > Thanks,
> > Timur
> >
> > On Mon, Apr 25, 2016 at 3:44 AM, Maximilian Michels <m...@apache.org> wrote:
> >> Hi Timur,
> >>
> >> Which version of Flink are you using? Could you share the entire logs?
> >>
> >> Thanks,
> >> Max
> >>
> >> On Mon, Apr 25, 2016 at 12:05 PM, Robert Metzger <rmetz...@apache.org> wrote:
> >> > Hi Timur,
> >> >
> >> > The reason why we only allocate 570 MB for the heap is because you
> >> > are allocating most of the memory off-heap (direct byte buffers).
> >> >
> >> > In theory, the memory footprint of the JVM is limited to 570 (heap) +
> >> > 1900 (direct mem) = 2470 MB (which is below 2500). But in practice
> >> > the JVM is allocating more memory, causing these killings by YARN.
> >> >
> >> > I have to check the code of Flink again, because I would expect the
> >> > safety boundary to be much larger than 30 MB.
> >> >
> >> > Regards,
> >> > Robert
> >> >
> >> > On Fri, Apr 22, 2016 at 9:47 PM, Timur Fayruzov <timur.fairu...@gmail.com> wrote:
> >> >> Hello,
> >> >>
> >> >> Next issue in a string of things I'm solving is that my application
> >> >> fails with the message 'Connection unexpectedly closed by remote
> >> >> task manager'.
> >> >>
> >> >> Yarn log shows the following:
> >> >>
> >> >> Container [pid=4102,containerID=container_1461341357870_0004_01_000015] is
> >> >> running beyond physical memory limits. Current usage: 2.5 GB of 2.5 GB
> >> >> physical memory used; 9.0 GB of 12.3 GB virtual memory used. Killing container.
> >> >> Dump of the process-tree for container_1461341357870_0004_01_000015 :
> >> >> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> >> >> |- 4102 4100 4102 4102 (bash) 1 7 115806208 715 /bin/bash -c
> >> >> /usr/lib/jvm/java-1.8.0/bin/java -Xms570m -Xmx570m
> >> >> -XX:MaxDirectMemorySize=1900m
> >> >> -Dlog.file=/var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.log
> >> >> -Dlogback.configurationFile=file:logback.xml
> >> >> -Dlog4j.configuration=file:log4j.properties
> >> >> org.apache.flink.yarn.YarnTaskManagerRunner --configDir .
> >> >> 1> /var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.out
> >> >> 2> /var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.err
> >> >> |- 4306 4102 4102 4102 (java) 172258 40265 9495257088 646460
> >> >> /usr/lib/jvm/java-1.8.0/bin/java -Xms570m -Xmx570m
> >> >> -XX:MaxDirectMemorySize=1900m
> >> >> -Dlog.file=/var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.log
> >> >> -Dlogback.configurationFile=file:logback.xml
> >> >> -Dlog4j.configuration=file:log4j.properties
> >> >> org.apache.flink.yarn.YarnTaskManagerRunner --configDir .
> >> >>
> >> >> One thing that drew my attention is `-Xmx570m`. I expected it to be
> >> >> TaskManagerMemory * 0.75 (due to yarn.heap-cutoff-ratio). I run the
> >> >> application as follows:
> >> >>
> >> >> HADOOP_CONF_DIR=/etc/hadoop/conf flink run -m yarn-cluster -yn 18 -yjm 4096 -ytm 2500 eval-assembly-1.0.jar
> >> >>
> >> >> In the Flink logs I do see 'Task Manager memory: 2500'. When I look
> >> >> at the yarn container logs on the cluster node, I see that it starts
> >> >> with 570 MB, which puzzles me. When I look at the actually allocated
> >> >> memory for a Yarn container using 'top', I see 2.2 GB used. Am I
> >> >> interpreting these parameters correctly?
> >> >>
> >> >> I also have set (it failed in the same way without this as well):
> >> >> taskmanager.memory.off-heap: true
> >> >>
> >> >> Also, I don't understand why this happens at all. I assumed that
> >> >> Flink won't overcommit allocated resources and will spill to disk
> >> >> when running out of heap memory. I'd appreciate it if someone could
> >> >> shed light on this too.
> >> >>
> >> >> Thanks,
> >> >> Timur
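Putting the JVM flags from the container log against the 2500 MB Yarn limit makes Robert's "safety boundary" point concrete. This is a back-of-the-envelope check in Python; the variable names are mine, and the numbers come from the log and from Max's reply.

```python
# JVM flags from the killed container's command line
heap_mb = 570        # -Xmx570m
direct_mb = 1900     # -XX:MaxDirectMemorySize=1900m
container_mb = 2500  # -ytm 2500

# Default network-stack allocation Max mentions; per his reply these
# buffers are also direct (off-heap) memory, so they count toward the
# direct budget rather than adding to it.
network_mb = 2048 * 32 / 1024   # 2048 buffers * 32 KB = 64.0 MB

budget_mb = heap_mb + direct_mb
headroom_mb = container_mb - budget_mb

# Only 30 MB is left for everything these flags do not cap: metaspace,
# thread stacks, the JIT code cache, and JNI libraries. That is why the
# process can creep past the physical memory limit and get killed.
print(budget_mb, headroom_mb)   # 2470 30
```

This lines up with the thread's conclusion: an uncapped native library can easily consume more than the 30 MB of headroom, so either the cutoff ratio has to rise or the native footprint has to be budgeted explicitly.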