[ https://issues.apache.org/jira/browse/FLINK-19125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yun Tang updated FLINK-19125:
-----------------------------
    Release Note: Adopt jemalloc as the default memory allocator in the official docker image's docker-entrypoint.sh to avoid a known memory fragmentation problem. Users can revert to the previous glibc allocator by passing the 'disable-jemalloc' parameter.

> Avoid memory fragmentation when running flink docker image
> ----------------------------------------------------------
>
>                 Key: FLINK-19125
>                 URL: https://issues.apache.org/jira/browse/FLINK-19125
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes, Runtime / State Backends
>    Affects Versions: 1.12.0, 1.11.1
>            Reporter: Yun Tang
>            Assignee: Yun Tang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.12.0, 1.11.3
>
>
> This ticket tracks the problem of memory fragmentation when launching the
> default Flink docker image.
> In FLINK-18712, a user reported that when jobs using the RocksDB state
> backend are submitted to a k8s session cluster again and again, each after
> the previous one has finished, the memory usage of the task manager grows
> continuously until it is OOM killed.
> I reproduced this problem with the official Flink docker image, regardless
> of how RocksDB is used (with managed memory enabled or not).
> I dug into the problem and found that it is due to memory fragmentation
> caused by {{glibc}}, which does not return memory to the kernel gracefully
> (please refer to the
> [glibc bugzilla|https://sourceware.org/bugzilla/show_bug.cgi?id=15321] and the
> [glibc manual|https://www.gnu.org/software/libc/manual/html_mono/libc.html#Freeing-after-Malloc]).
> I found that limiting MALLOC_ARENA_MAX to 2 mitigates this problem (please
> refer to
> [choose-for-malloc_arena_max|https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior#what-value-to-choose-for-malloc_arena_max]
> for more details).
> And if we rebuild the docker image so that memory is allocated with
> jemalloc, the problem goes away.
> {code:java}
> RUN apt-get update && apt-get -y install libjemalloc-dev
> ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so
> {code}
> Jemalloc is designed to
> [emphasize fragmentation avoidance|https://github.com/jemalloc/jemalloc/wiki/Background#intended-use],
> so we might consider refactoring our Dockerfile to use jemalloc and avoid
> memory fragmentation.
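
The release note describes an entrypoint that preloads jemalloc by default and falls back to glibc when 'disable-jemalloc' is given. The following is only a minimal sketch of how such a toggle could look, not the actual docker-entrypoint.sh: the function name choose_allocator and the JEMALLOC_LIB variable are hypothetical, and the library path is assumed from the snippet in this ticket.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an allocator toggle for a container entrypoint.
# JEMALLOC_LIB is an assumed path (taken from the snippet in this ticket);
# choose_allocator is an illustrative name, not the real Flink script.

JEMALLOC_LIB="${JEMALLOC_LIB:-/usr/lib/x86_64-linux-gnu/libjemalloc.so}"

# Prints the selected allocator and, for jemalloc, exports LD_PRELOAD.
choose_allocator() {
    if [ "${1:-}" = "disable-jemalloc" ]; then
        # Leave LD_PRELOAD untouched so the process keeps using glibc malloc.
        echo "glibc"
    elif [ -f "$JEMALLOC_LIB" ]; then
        # Preload jemalloc, keeping any libraries that were already preloaded.
        export LD_PRELOAD="${LD_PRELOAD:+$LD_PRELOAD:}$JEMALLOC_LIB"
        echo "jemalloc"
    else
        # The library is missing: fall back to glibc rather than fail to start.
        echo "glibc"
    fi
}
```

An entrypoint would call choose_allocator "$1" directly (not inside a command substitution, so that the export reaches the current shell) before exec-ing the Flink process. When staying on glibc, the MALLOC_ARENA_MAX=2 mitigation mentioned in the ticket could be exported separately.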