[ 
https://issues.apache.org/jira/browse/YARN-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140515#comment-15140515
 ] 

Jan Lukavsky commented on YARN-4681:
------------------------------------

[~cnauroth], I tested this patch against our jobs and it helps somewhat, but it 
doesn't solve the whole problem. We also see spikes of direct memory 
allocations (so far I haven't tracked down exactly where they come from). This 
led me to the thought that, rather than using the exact instantaneous memory 
consumption of a container, it might help to average it over some time period 
(configurable, defaulting to zero, which would preserve the current behavior).
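To make the averaging idea concrete, here is a minimal sketch of what I have in mind; the class and method names are purely illustrative and not part of the attached patch. A window of zero degenerates to the latest sample, i.e. the current behavior:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch: average a container's measured memory over a
 * configurable time window instead of acting on each instantaneous
 * sample. A window of zero keeps the current behavior (latest sample).
 */
public class WindowedMemoryAverager {
    private final long windowMillis;                         // 0 => no averaging
    private final Deque<long[]> samples = new ArrayDeque<>(); // {timestampMillis, bytes}

    public WindowedMemoryAverager(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Record a sample and return the average over the window. */
    public long record(long nowMillis, long bytes) {
        samples.addLast(new long[] {nowMillis, bytes});
        // Drop samples that have fallen out of the window.
        while (samples.peekFirst()[0] < nowMillis - windowMillis) {
            samples.removeFirst();
        }
        long sum = 0;
        for (long[] s : samples) {
            sum += s[1];
        }
        return sum / samples.size();
    }
}
```

This would smooth out short allocation spikes so a container is only killed when its *sustained* usage exceeds the limit.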

So, first I will modify the patch as you suggest (if the Locked field is 
missing, the ProcfsBasedProcessTree will behave exactly as before). I will then 
try to add the time averaging and let you know whether it helps.
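For reference, the per-region fallback I intend would look roughly like the sketch below (illustrative names, not the actual ProcfsBasedProcessTree code): use {{Locked}} when the kernel reports it, otherwise keep the existing formula. All sizes are in kB, as reported by {{/proc/<pid>/smaps}}:

```java
import java.util.Map;

/**
 * Illustrative sketch of the proposed fallback, not the actual
 * ProcfsBasedProcessTree code. If a smaps region reports a "Locked"
 * field, count only locked pages (the kernel cannot reclaim mlocked
 * memory); otherwise fall back to the existing formula
 * min(Pss, Shared_Dirty) + Private_Dirty + Private_Clean.
 */
public class SmapsRegionUsage {
    /** @param region parsed "Field -> size in kB" entries for one smaps region */
    public static long usageKb(Map<String, Long> region) {
        Long locked = region.get("Locked");
        if (locked != null) {
            return locked;
        }
        long pss = region.getOrDefault("Pss", 0L);
        long sharedDirty = region.getOrDefault("Shared_Dirty", 0L);
        long privateDirty = region.getOrDefault("Private_Dirty", 0L);
        long privateClean = region.getOrDefault("Private_Clean", 0L);
        return Math.min(pss, sharedDirty) + privateDirty + privateClean;
    }
}
```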

Regarding the more aggressive strategies, I ran some experiments and I don't 
think they would help.



> ProcfsBasedProcessTree should not calculate private clean pages
> ---------------------------------------------------------------
>
>                 Key: YARN-4681
>                 URL: https://issues.apache.org/jira/browse/YARN-4681
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.6.0, 2.7.0
>            Reporter: Jan Lukavsky
>         Attachments: YARN-4681.patch
>
>
> ProcfsBasedProcessTree in the Node Manager calculates the memory used by a 
> process tree by parsing {{/proc/<pid>/smaps}}, where it computes {{min(Pss, 
> Shared_Dirty) + Private_Dirty + Private_Clean}}. Because private clean pages 
> that are not {{mlocked}} can be reclaimed by the kernel, this should be 
> changed to count only {{Locked}} pages instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
