Hi Hemant,

did you check out the dedicated pages for memory configuration and troubleshooting:

https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_trouble/#outofmemoryerror-direct-buffer-memory

https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_trouble/#container-memory-exceeded

It is likely that the high number of output streams is causing your issues.
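
In case it is the direct memory limit you are hitting, the main knobs those pages describe can be sketched roughly like this (the keys are the standard Flink 1.10+ memory options; the sizes are only placeholders you would tune for your setup, and on YARN they normally go into flink-conf.yaml or are passed with -D rather than being set in code):

    import org.apache.flink.configuration.Configuration;

    public class MemoryConfigSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // Direct memory available to user code and connectors; this is part of
            // what -XX:MaxDirectMemorySize is derived from.
            conf.setString("taskmanager.memory.task.off-heap.size", "256m");

            // Flink-internal direct memory (default 128m), also part of the direct limit.
            conf.setString("taskmanager.memory.framework.off-heap.size", "128m");

            // Extra native/JVM headroom, which is the usual lever for the
            // "container memory exceeded" case.
            conf.setString("taskmanager.memory.jvm-overhead.fraction", "0.2");

            System.out.println(conf);
        }
    }

If I remember the pages correctly, the task off-heap size is what the "Direct buffer memory" section suggests raising first, while jvm-overhead targets the container-killed case.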

Regards,
Timo




On 14.07.21 08:46, bat man wrote:
Hi,
I have a job which reads different streams from 5 Kafka topics. It filters the data, which is then streamed to different operators for processing. This step involves data shuffling.

The data is then enriched in 4 join (KeyedCoProcessFunction) operators. After joining, the data is written to different Kafka topics. In total there are 16 different output streams, written to 4 topics.
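
For reference, one of these joins looks roughly like the simplified sketch below (made-up topic names, plain String payloads and a single side output standing in for the real types and the 16 output streams):

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    import java.util.Properties;

    public class EnrichmentJoinSketch {

        // Side output for events that have no reference data yet.
        private static final OutputTag<String> UNMATCHED = new OutputTag<String>("unmatched") {};

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Stand-ins for two of the filtered Kafka streams ("key:payload").
            DataStream<String> events = env.fromElements("dev1:e1", "dev2:e2");
            DataStream<String> reference = env.fromElements("dev1:r1");

            SingleOutputStreamOperator<String> enriched = events
                    .keyBy(v -> v.split(":")[0])
                    .connect(reference.keyBy(v -> v.split(":")[0]))
                    .process(new KeyedCoProcessFunction<String, String, String, String>() {
                        private transient ValueState<String> refState;

                        @Override
                        public void open(Configuration parameters) {
                            refState = getRuntimeContext().getState(
                                    new ValueStateDescriptor<>("ref", String.class));
                        }

                        @Override
                        public void processElement1(String event, Context ctx, Collector<String> out)
                                throws Exception {
                            String ref = refState.value();
                            if (ref != null) {
                                out.collect(event + "|" + ref);   // enriched main output
                            } else {
                                ctx.output(UNMATCHED, event);     // one of the side outputs
                            }
                        }

                        @Override
                        public void processElement2(String ref, Context ctx, Collector<String> out)
                                throws Exception {
                            refState.update(ref);
                        }
                    });

            Properties kafkaProps = new Properties();
            kafkaProps.setProperty("bootstrap.servers", "broker:9092");

            // Each of the 16 output streams ends in a Kafka sink like these two.
            enriched.addSink(new FlinkKafkaProducer<>("enriched-topic", new SimpleStringSchema(), kafkaProps));
            enriched.getSideOutput(UNMATCHED)
                    .addSink(new FlinkKafkaProducer<>("unmatched-topic", new SimpleStringSchema(), kafkaProps));

            env.execute("enrichment-join-sketch");
        }
    }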

I have been facing some issues with YARN killing containers. I took a heap dump and ran it through JXray [1]. Heap usage is not high; what stands out is the off-heap usage, which is very high. My guess is this is what is killing the containers as the data inflow increases.

[attached screenshot: Screenshot 2021-07-14 at 11.52.41 AM.png]


From the stack above, is this usage high because of the many output streams being written to Kafka topics? The stack shows RecordWriter holding on to the DirectByteBuffer. I have assigned 1 GB of network memory, and -XX:MaxDirectMemorySize also shows ~1 GB for the task managers.
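
(If I understand the Flink 1.10+ memory model correctly, -XX:MaxDirectMemorySize for a task manager is derived as framework off-heap (128 MB by default) + task off-heap (0 by default) + network memory, so with 1 GB of network memory the ~1 GB limit I see is expected. Almost all of it would be reserved for the network buffers used by the RecordWriters, leaving very little direct-memory headroom for anything else, e.g. the temporary NIO buffers the Kafka clients allocate during socket I/O.)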

From here [2] I found that setting -Djdk.nio.maxCachedBufferSize=262144 limits the temporary direct buffer cache. Will it help in this case? The JVM used is OpenJDK 64-Bit Server VM - Red Hat, Inc. - 1.8/25.282-b08.
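
(I assume the flag would be passed to the task manager JVMs via env.java.opts.taskmanager in flink-conf.yaml, e.g. env.java.opts.taskmanager: -Djdk.nio.maxCachedBufferSize=262144, but please correct me if there is a better way.)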

[1] - https://jxray.com
[2] - https://dzone.com/articles/troubleshooting-problems-with-native-off-heap-memo

Thanks,
Hemant
