[ 
https://issues.apache.org/jira/browse/YARN-7693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309129#comment-16309129
 ] 

Jiandan Yang  commented on YARN-7693:
-------------------------------------

[~miklos.szeg...@cloudera.com] Thanks for your attention. This jira does not 
conflict with YARN-7064. I filed this jira because ContainersMonitorImpl 
currently has some problems:
1. Online services may crash due to high system resource utilization.
ContainersMonitorImpl only checks the pmem and vmem of each container and does 
not check overall system utilization. This may impact online services when 
offline tasks and online services run on YARN at the same time. For example, 
no container exceeds its own memory limit, but the system's total memory 
utilization may still reach 100% because of oversubscription. The RM's 
decision to kill containers may not be timely enough, so the online services 
are affected.
2. Directly killing Opportunistic containers is too violent. Dynamically 
adjusting Opportunistic container resources may be a better choice.
So I propose to:
1) Separate containers into two different groups, Opportunistic_Group and 
Guaranteed_Group, under *hadoop-yarn* 
2) Monitor system resource utilization and dynamically adjust the resources of 
Opportunistic_Group
3) Kill a container only when adjusting resources has failed a given number of 
times
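
A rough sketch of the decision logic behind steps 2) and 3), only to make the 
idea concrete. It is not taken from the attached patches: the class name, the 
watermark parameter, and the Action values are all hypothetical, and the actual 
cgroup adjustment is left out.

```java
/**
 * Hypothetical sketch of the "adjust first, kill last" idea.
 * None of these names come from the YARN-7693 patches.
 */
final class OpportunisticGroupPolicy {

    enum Action { NONE, SHRINK_GROUP, KILL_CONTAINER }

    private final double highWatermark;      // assumed node-wide threshold in (0, 1]
    private final int maxFailedAdjustments;  // the "given number of times" in step 3
    private int failedAdjustments = 0;

    OpportunisticGroupPolicy(double highWatermark, int maxFailedAdjustments) {
        if (highWatermark <= 0.0 || highWatermark > 1.0) {
            throw new IllegalArgumentException("highWatermark must be in (0, 1]");
        }
        if (maxFailedAdjustments < 1) {
            throw new IllegalArgumentException("maxFailedAdjustments must be >= 1");
        }
        this.highWatermark = highWatermark;
        this.maxFailedAdjustments = maxFailedAdjustments;
    }

    /**
     * Decide what to do for one monitoring sample.
     *
     * @param systemUtilization    node-wide utilization, 0.0 to 1.0
     * @param lastAdjustmentFailed true if the previous shrink did not take effect
     */
    Action decide(double systemUtilization, boolean lastAdjustmentFailed) {
        if (systemUtilization < highWatermark) {
            // Node is healthy again, so forget earlier failures.
            failedAdjustments = 0;
            return Action.NONE;
        }
        if (lastAdjustmentFailed) {
            failedAdjustments++;
        }
        if (failedAdjustments >= maxFailedAdjustments) {
            // Shrinking the group did not help often enough: kill as a last resort.
            failedAdjustments = 0;
            return Action.KILL_CONTAINER;
        }
        return Action.SHRINK_GROUP;
    }

    public static void main(String[] args) {
        OpportunisticGroupPolicy policy = new OpportunisticGroupPolicy(0.90, 2);
        double[] utilization = {0.50, 0.95, 0.97, 0.98};
        boolean[] lastFailed = {false, false, true, true};
        for (int i = 0; i < utilization.length; i++) {
            System.out.println(utilization[i] + " -> "
                + policy.decide(utilization[i], lastFailed[i]));
        }
    }
}
```

The point of keeping the counter is that a single busy sample only shrinks 
Opportunistic_Group; a container is killed only after repeated failed 
adjustments while the node stays above the threshold.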

> ContainersMonitor support configurable
> --------------------------------------
>
>                 Key: YARN-7693
>                 URL: https://issues.apache.org/jira/browse/YARN-7693
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang 
>            Assignee: Jiandan Yang 
>            Priority: Minor
>         Attachments: YARN-7693.001.patch, YARN-7693.002.patch
>
>
> Currently ContainersMonitor has only one default implementation, 
> ContainersMonitorImpl.
> After the introduction of Opportunistic Containers, ContainersMonitor needs 
> to monitor system metrics and even dynamically adjust Opportunistic and 
> Guaranteed resources in the cgroup, so another ContainersMonitor may need to 
> be implemented. 
> ContainerManagerImpl currently instantiates ContainersMonitorImpl directly 
> with new, so ContainersMonitor needs to be configurable.
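
One possible shape for "configurable", sketched with plain JDK classes only. 
The Monitor interface, both implementations, and the property key below are 
hypothetical stand-ins, not Hadoop classes and not the names used in the 
patches; the idea is just to replace a hard-coded new with a class name read 
from configuration.

```java
import java.util.Properties;

/** Hypothetical stand-in for the ContainersMonitor interface. */
interface Monitor {
    String describe();
}

/** Stand-in for the existing default implementation. */
final class DefaultMonitor implements Monitor {
    @Override
    public String describe() {
        return "per-container pmem/vmem checks only";
    }
}

/** Stand-in for an alternative implementation that also watches the node. */
final class UtilizationAwareMonitor implements Monitor {
    @Override
    public String describe() {
        return "per-container checks plus node-wide utilization";
    }
}

final class MonitorFactory {
    // Assumed key name, made up for this example.
    static final String MONITOR_CLASS_KEY = "example.containers-monitor.class";

    private MonitorFactory() {}

    /** Instantiate the configured Monitor, falling back to DefaultMonitor. */
    static Monitor create(Properties conf) throws ReflectiveOperationException {
        String className =
            conf.getProperty(MONITOR_CLASS_KEY, DefaultMonitor.class.getName());
        // asSubclass rejects classes that do not implement Monitor.
        Class<? extends Monitor> clazz =
            Class.forName(className).asSubclass(Monitor.class);
        return clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Properties conf = new Properties();
        System.out.println(create(conf).describe());
        conf.setProperty(MONITOR_CLASS_KEY, UtilizationAwareMonitor.class.getName());
        System.out.println(create(conf).describe());
    }
}
```

With no key set the factory behaves as today and returns the default; setting 
the key swaps in another implementation without touching the caller.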



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

