wgcn created FLINK-10884:
----------------------------
Summary: Flink on yarn TM container will be killed by nodemanager
because of the exceeded physical memory.
Key: FLINK-10884
URL: https://issues.apache.org/jira/browse/FLINK-10884
Project: Flink
Issue Type: Bug
Components: Cluster Management, Core
Affects Versions: 1.6.2
Environment: version : 1.6.2
module : flink on yarn
centos jdk1.8
hadoop 2.7
Reporter: wgcn
The TM container is killed by the NodeManager because it exceeds its
physical memory limit. I found that the launch context used to start the TM
container sets "container memory = heap memory + offHeapSizeMB" in the class
org.apache.flink.runtime.clusterframework.ContaineredTaskManagerParameters,
lines 160 to 166. In my fix I set a safety margin on the total memory the
container uses. For example, if the container limit is 3g, the sum
"heap memory + offHeapSizeMB" is set to 2.4g, which prevents the container
from being killed. Is there a ready-made solution for this, or can I commit
my solution?
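
To make the arithmetic in the example concrete, here is a minimal sketch of the safety-margin calculation. It is not the code in ContaineredTaskManagerParameters. The class name, the method usableMemoryMB, and the safetyRatio parameter are hypothetical, and the 0.2 ratio is only inferred from the 3g to 2.4g example above.

```java
public final class ContainerMemorySketch {

    /**
     * Returns the memory (in MB) left for "heap memory + offHeapSizeMB"
     * after reserving a safety margin inside the container limit.
     * Hypothetical helper for illustration only.
     */
    public static long usableMemoryMB(long containerLimitMB, double safetyRatio) {
        if (containerLimitMB <= 0) {
            throw new IllegalArgumentException("container limit must be positive");
        }
        if (safetyRatio < 0.0 || safetyRatio >= 1.0) {
            throw new IllegalArgumentException("safety ratio must be in [0, 1)");
        }
        // Round the margin up so the JVM never gets more than intended.
        long marginMB = (long) Math.ceil(containerLimitMB * safetyRatio);
        return containerLimitMB - marginMB;
    }

    public static void main(String[] args) {
        long containerLimitMB = 3072;   // 3g container limit
        double safetyRatio = 0.2;       // assumed: 3g -> about 2.4g
        long usable = usableMemoryMB(containerLimitMB, safetyRatio);
        System.out.println("heap + off-heap budget: " + usable + " MB");
    }
}
```

With a 3072 MB limit and a 0.2 ratio, the margin rounds up to 615 MB, leaving 2457 MB (about 2.4g) for heap and off-heap memory. The rest stays free for JVM overhead that the NodeManager also counts toward physical memory.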
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)