Great answer, thank you Max for a very detailed explanation! It illuminates how the off-heap parameter affects memory allocation.
I read this post: https://blogs.oracle.com/jrockit/entry/why_is_my_jvm_process_larger_t and the thing that jumped out at me is the memory allocated for JNI libraries. I do use a native library in my application, which is likely the culprit. I need to account for its memory footprint when doing my memory calculations.

Thanks,
Timur

On Mon, Apr 25, 2016 at 10:28 AM, Maximilian Michels <m...@apache.org> wrote:
> Hi Timur,
>
> Shedding some light on the memory calculation:
>
> You have a total memory size of 2500 MB for each TaskManager. The
> default for 'taskmanager.memory.fraction' is 0.7. This is the fraction
> of the memory used by the memory manager. When you have turned on
> off-heap memory, this memory is allocated off-heap. As you pointed
> out, the default Yarn cutoff ratio is 0.25.
>
> Memory cutoff for Yarn: 2500 MB * 0.25 = 625 MB
> Java heap size with off-heap disabled: 2500 MB - 625 MB = 1875 MB
> Java heap size with off-heap enabled: (2500 MB - 625 MB) * 0.3 = 562.5 MB (~570 MB in your case)
> Off-heap memory size: (2500 MB - 625 MB) * 0.7 = 1312.5 MB
>
> The heap memory limits in your log seem to be calculated correctly.
> Note that we don't set a strict limit for the off-heap memory because
> the Flink memory manager controls the amount of memory allocated. It
> will preallocate memory when you have 'taskmanager.memory.preallocate'
> set to true. Otherwise it will allocate dynamically. Still, you should
> have about 500 MB of memory left with everything allocated. There is
> some more direct (off-heap) memory allocated for the network stack,
> adjustable with 'taskmanager.network.numberOfBuffers', which is set to
> 2048 by default and corresponds to 2048 * 32 KB = 64 MB of memory. I
> believe this can grow up to twice that size. Still, there should be
> enough memory left.
>
> Are you running a streaming or batch job?
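The arithmetic Max walks through above can be sketched in a few lines of Python. This is a minimal illustration using the default values quoted in this thread; `flink_memory_breakdown` is a hypothetical helper name, not Flink's actual API, and Flink's real rounding differs slightly (the log shows -Xmx570m where this yields 562.5).

```python
def flink_memory_breakdown(total_mb, cutoff_ratio=0.25, memory_fraction=0.7,
                           off_heap=False):
    """Return (heap_mb, off_heap_mb) for a TaskManager container.

    cutoff_ratio    -> yarn.heap-cutoff-ratio (default 0.25)
    memory_fraction -> taskmanager.memory.fraction (default 0.7)
    """
    cutoff_mb = total_mb * cutoff_ratio   # reserved for JVM overhead on Yarn
    usable_mb = total_mb - cutoff_mb
    if off_heap:
        # managed memory moves off-heap; the heap keeps the remainder
        off_heap_mb = usable_mb * memory_fraction
        heap_mb = usable_mb - off_heap_mb
    else:
        off_heap_mb = 0.0
        heap_mb = usable_mb
    return heap_mb, off_heap_mb

print(flink_memory_breakdown(2500))                 # (1875.0, 0.0)
print(flink_memory_breakdown(2500, off_heap=True))  # (562.5, 1312.5)
```

With off-heap disabled the whole post-cutoff budget goes to the heap; with it enabled, the heap shrinks to the (1 - fraction) share, matching the numbers in Max's reply.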
> Off-heap memory and memory
> preallocation are mostly beneficial for batch jobs which use the
> memory manager a lot for sorting, hashing and caching.
>
> For streaming I'd suggest using Flink's defaults:
>
> taskmanager.memory.off-heap: false
> taskmanager.memory.preallocate: false
>
> Raising the cutoff ratio should prevent the killing of the
> TaskManagers. As Robert mentioned, in practice the JVM tends to
> allocate more than the maximum specified heap size. You can put the
> following in your flink-conf.yaml:
>
> # slightly raise the cutoff ratio (might need to be even higher)
> yarn.heap-cutoff-ratio: 0.3
>
> Thanks,
> Max
>
> On Mon, Apr 25, 2016 at 5:52 PM, Timur Fayruzov <timur.fairu...@gmail.com> wrote:
> > Hello Maximilian,
> >
> > I'm using 1.0.0 compiled with Scala 2.11 and Hadoop 2.7. I'm running
> > this on EMR. I didn't see any exceptions in other logs. What are the
> > logs you are interested in?
> >
> > Thanks,
> > Timur
> >
> > On Mon, Apr 25, 2016 at 3:44 AM, Maximilian Michels <m...@apache.org> wrote:
> >> Hi Timur,
> >>
> >> Which version of Flink are you using? Could you share the entire logs?
> >>
> >> Thanks,
> >> Max
> >>
> >> On Mon, Apr 25, 2016 at 12:05 PM, Robert Metzger <rmetz...@apache.org> wrote:
> >> > Hi Timur,
> >> >
> >> > The reason why we only allocate 570 MB for the heap is because you
> >> > are allocating most of the memory off-heap (direct byte buffers).
> >> >
> >> > In theory, the memory footprint of the JVM is limited to 570 (heap) +
> >> > 1900 (direct mem) = 2470 MB (which is below 2500). But in practice
> >> > the JVM is allocating more memory, causing these killings by YARN.
> >> >
> >> > I have to check the code of Flink again, because I would expect the
> >> > safety boundary to be much larger than 30 MB.
> >> >
> >> > Regards,
> >> > Robert
> >> >
> >> > On Fri, Apr 22, 2016 at 9:47 PM, Timur Fayruzov <timur.fairu...@gmail.com> wrote:
> >> >> Hello,
> >> >>
> >> >> Next issue in a string of things I'm solving is that my application
> >> >> fails with the message 'Connection unexpectedly closed by remote
> >> >> task manager'.
> >> >>
> >> >> Yarn log shows the following:
> >> >>
> >> >> Container [pid=4102,containerID=container_1461341357870_0004_01_000015] is
> >> >> running beyond physical memory limits. Current usage: 2.5 GB of 2.5 GB
> >> >> physical memory used; 9.0 GB of 12.3 GB virtual memory used. Killing container.
> >> >> Dump of the process-tree for container_1461341357870_0004_01_000015 :
> >> >> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> >> >> |- 4102 4100 4102 4102 (bash) 1 7 115806208 715 /bin/bash -c
> >> >> /usr/lib/jvm/java-1.8.0/bin/java -Xms570m -Xmx570m
> >> >> -XX:MaxDirectMemorySize=1900m
> >> >> -Dlog.file=/var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.log
> >> >> -Dlogback.configurationFile=file:logback.xml
> >> >> -Dlog4j.configuration=file:log4j.properties
> >> >> org.apache.flink.yarn.YarnTaskManagerRunner --configDir .
> >> >> 1> /var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.out
> >> >> 2> /var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.err
> >> >> |- 4306 4102 4102 4102 (java) 172258 40265 9495257088 646460
> >> >> /usr/lib/jvm/java-1.8.0/bin/java -Xms570m -Xmx570m
> >> >> -XX:MaxDirectMemorySize=1900m
> >> >> -Dlog.file=/var/log/hadoop-yarn/containers/application_1461341357870_0004/container_1461341357870_0004_01_000015/taskmanager.log
> >> >> -Dlogback.configurationFile=file:logback.xml
> >> >> -Dlog4j.configuration=file:log4j.properties
> >> >> org.apache.flink.yarn.YarnTaskManagerRunner --configDir .
> >> >>
> >> >> One thing that drew my attention is `-Xmx570m`. I expected it to be
> >> >> TaskManagerMemory * 0.75 (due to yarn.heap-cutoff-ratio). I run the
> >> >> application as follows:
> >> >>
> >> >> HADOOP_CONF_DIR=/etc/hadoop/conf flink run -m yarn-cluster -yn 18 -yjm 4096 -ytm 2500 eval-assembly-1.0.jar
> >> >>
> >> >> In the Flink logs I do see 'Task Manager memory: 2500'. When I look
> >> >> at the yarn container logs on the cluster node, I see that it starts
> >> >> with 570 MB, which puzzles me. When I look at the actually allocated
> >> >> memory for a Yarn container using 'top', I see 2.2 GB used. Am I
> >> >> interpreting these parameters correctly?
> >> >>
> >> >> I also have set (it failed in the same way without this as well):
> >> >> taskmanager.memory.off-heap: true
> >> >>
> >> >> Also, I don't understand why this happens at all. I assumed that
> >> >> Flink won't overcommit allocated resources and will spill to disk
> >> >> when running out of heap memory. I'd appreciate it if someone could
> >> >> shed light on this too.
> >> >>
> >> >> Thanks,
> >> >> Timur
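Putting the JVM flags from the container log against the 2500 MB Yarn limit makes Robert's "safety boundary" point concrete. This is a back-of-the-envelope check in Python; the variable names are mine, and the numbers come from the log and from Max's reply.

```python
# JVM flags from the killed container's command line
heap_mb = 570        # -Xmx570m
direct_mb = 1900     # -XX:MaxDirectMemorySize=1900m
container_mb = 2500  # -ytm 2500

# Default network-stack allocation Max mentions; per his reply these
# buffers are also direct (off-heap) memory, so they count toward the
# direct budget rather than adding to it.
network_mb = 2048 * 32 / 1024   # 2048 buffers * 32 KB = 64.0 MB

budget_mb = heap_mb + direct_mb
headroom_mb = container_mb - budget_mb

# Only 30 MB is left for everything these flags do not cap: metaspace,
# thread stacks, the JIT code cache, and JNI libraries. That is why the
# process can creep past the physical memory limit and get killed.
print(budget_mb, headroom_mb)   # 2470 30
```

This lines up with the thread's conclusion: an uncapped native library can easily consume more than the 30 MB of headroom, so either the cutoff ratio has to rise or the native footprint has to be budgeted explicitly.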