[ 
https://issues.apache.org/jira/browse/FLINK-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704259#comment-16704259
 ] 

ASF GitHub Bot commented on FLINK-10884:
----------------------------------------

zhijiangW commented on a change in pull request #7185: [FLINK-10884] 
[yarn/mesos]  adjust  container memory param  to set a safe margin from offheap 
memory
URL: https://github.com/apache/flink/pull/7185#discussion_r237743628
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ContaineredTaskManagerParameters.java
 ##########
 @@ -158,8 +158,10 @@ public static ContaineredTaskManagerParameters create(
 
                // (2) split the remaining Java memory between heap and off-heap
                final long heapSizeMB = TaskManagerServices.calculateHeapSizeMB(containerMemoryMB - cutoffMB, config);
-               // use the cut-off memory for off-heap (that was its intention)
-               final long offHeapSizeMB = containerMemoryMB - heapSizeMB;
+               // (3) try to compute the offHeapMemory from a safe margin
+               final long restMemoryMB = containerMemoryMB - heapSizeMB;
+               final long offHeapCutoffMemory = calculateOffHeapCutoffMB(config, restMemoryMB);
 
 Review comment:
   We already have `containerized.heap-cutoff-ratio` for reserving some memory 
for other uses. `heapSizeMB` is calculated from `containerMemoryMB - cutoffMB`, 
so `heapSizeMB + offHeapSizeMB` should be `containerMemoryMB - cutoffMB`.
   
   You extend this cutoff with a further 
`containerized.offheap-cutoff-ratio`. I think there are two options:
   
   1. Rename the existing `containerized.heap-cutoff-ratio` to 
`containerized.cutoff-ratio`, so that the reserved physical memory covers both 
heap and off-heap.
   
   2. Split it into two separate parameters, as you propose. I am not sure 
whether this brings any extra benefit over the first option, and it may 
complicate things, because memory can be further divided into heap, direct, and 
native (used by the RocksDB state backend). Direct and native memory can both 
be regarded as off-heap in general. If we go this way, do we also need a 
`containerized.offheap-cutoff-min` to match the existing 
`containerized.heap-cutoff-min`?
   
   BTW, I think you can increase the current `containerized.heap-cutoff-ratio` 
and `containerized.heap-cutoff-min` to avoid the container being killed for 
exceeding its memory limit. :)
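
   To make option 1 concrete, here is a minimal sketch of how a single cutoff 
could be derived from a ratio and a minimum. The class name, method names, and 
the numbers are hypothetical and chosen only for illustration (they mirror the 
3g / 2.4g example from the issue); this is not the code in 
`ContaineredTaskManagerParameters`.

```java
// Hypothetical sketch: all names and values here are assumptions for illustration.
final class CutoffSketch {

    /**
     * Memory (in MB) reserved out of the container: the larger of
     * (ratio * container size) and the configured minimum.
     */
    static long cutoffMB(long containerMemoryMB, double cutoffRatio, long cutoffMinMB) {
        if (cutoffRatio <= 0.0 || cutoffRatio >= 1.0) {
            throw new IllegalArgumentException("cutoff ratio must be in (0, 1)");
        }
        long byRatio = (long) (containerMemoryMB * cutoffRatio);
        long cutoff = Math.max(cutoffMinMB, byRatio);
        if (cutoff >= containerMemoryMB) {
            throw new IllegalArgumentException("cutoff leaves no memory for the JVM");
        }
        return cutoff;
    }

    /** Memory (in MB) left for heap plus off-heap once the cutoff is reserved. */
    static long usableMB(long containerMemoryMB, double cutoffRatio, long cutoffMinMB) {
        return containerMemoryMB - cutoffMB(containerMemoryMB, cutoffRatio, cutoffMinMB);
    }

    public static void main(String[] args) {
        long containerMemoryMB = 3072; // a 3g container, as in the issue description
        double cutoffRatio = 0.2;      // assumed ratio
        long cutoffMinMB = 600;        // assumed minimum
        System.out.println("cutoff MB: " + cutoffMB(containerMemoryMB, cutoffRatio, cutoffMinMB));
        System.out.println("heap + off-heap MB: " + usableMB(containerMemoryMB, cutoffRatio, cutoffMinMB));
    }
}
```

   With these assumed values, roughly 2.4g of the 3g container is left for 
heap plus off-heap, and for small containers the minimum dominates the ratio.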

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink on yarn  TM container will be killed by nodemanager because of  the 
> exceeded  physical memory.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-10884
>                 URL: https://issues.apache.org/jira/browse/FLINK-10884
>             Project: Flink
>          Issue Type: Bug
>          Components: Cluster Management, Core
>    Affects Versions: 1.5.5, 1.6.2, 1.7.0
>         Environment: version  : 1.6.2 
> module : flink on yarn
> centos  jdk1.8
> hadoop 2.7
>            Reporter: wgcn
>            Assignee: wgcn
>            Priority: Major
>              Labels: pull-request-available, yarn
>
> The TM container will be killed by the NodeManager for exceeding its physical 
> memory. I found that the launch context that launches the TM container sets 
> "container memory = heap memory + offHeapSizeMB" in the class 
> org.apache.flink.runtime.clusterframework.ContaineredTaskManagerParameters, 
> lines 160 to 166. I set a safety margin for the whole memory the container 
> uses. For example, if the container limit is 3g of memory, the sum 
> "heap memory + offHeapSizeMB" is set to 2.4g to prevent the container from 
> being killed. Is there a ready-made solution, or can I commit my solution?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
