[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-1856:
    Attachment: YARN-1856.004.patch

{quote}
Should add all the configs to yarn-default.xml, saying they are still early configs?
{quote}
I don't think we've figured out how to specify the various resource isolation pieces from a config perspective. I'd like to keep them private for now, and I'll file a follow-up JIRA to document the configs once we've figured it out. The remaining points all relate to this, so I'll address them as part of that JIRA.

{quote}
ResourceHandlerModule
- Formatting of new code is a little off: the declaration of getCgroupsMemoryResourceHandler(). There are other occurrences like this in that class before in this patch, you may want to fix those.
{quote}
Fixed.

{quote}
BUG! getCgroupsMemoryResourceHandler() incorrectly locks DiskResourceHandler instead of MemoryResourceHandler.
CGroupsMemoryResourceHandlerImpl
{quote}
Not a bug! But a *bad* typo nonetheless. Fixed.

{quote}
What is this doing? {{ CGroupsHandler.CGroupController MEMORY = CGroupsHandler.CGroupController.MEMORY; }} Is it forcing a class-load or something? Not sure if this is needed. If this is needed, you may want to add a comment here.
{quote}
No, it is just a shorthand instead of specifying the entire qualified variable every time.

{quote}
NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC -> NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERCENTAGE. Similarly the default constant.
{quote}
Fixed.

{quote}
CGROUP_PARAM_MEMORY_HARD_LIMIT_BYTES / CGROUP_PARAM_MEMORY_SOFT_LIMIT_BYTES / CGROUP_PARAM_MEMORY_SWAPPINESS can all be static and final.
{quote}
Interface variables are public static final by default. Any reason you want to add static final?
> cgroups based memory monitoring for containers
> --
>
> Key: YARN-1856
> URL: https://issues.apache.org/jira/browse/YARN-1856
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Varun Vasudev
> Attachments: YARN-1856.001.patch, YARN-1856.002.patch, YARN-1856.003.patch, YARN-1856.004.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
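On the last exchange in the message above: fields declared in a Java interface are implicitly public, static and final, so spelling the modifiers out does not change anything. The following is a minimal sketch that checks this through reflection. The interface name MemoryParams and the string values are hypothetical stand-ins chosen for illustration, not the contents of the patch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for the handler interface discussed in the thread.
// The field names mirror the constants named in the review; the values are
// illustrative assumptions only.
interface MemoryParams {
  String CGROUP_PARAM_MEMORY_HARD_LIMIT_BYTES = "limit_in_bytes";
  String CGROUP_PARAM_MEMORY_SOFT_LIMIT_BYTES = "soft_limit_in_bytes";
  String CGROUP_PARAM_MEMORY_SWAPPINESS = "swappiness";
}

class InterfaceConstantsDemo {
  // Returns true if every field declared on the given type is
  // public, static and final.
  static boolean allPublicStaticFinal(Class<?> type) {
    for (Field f : type.getDeclaredFields()) {
      int m = f.getModifiers();
      if (!(Modifier.isPublic(m) && Modifier.isStatic(m) && Modifier.isFinal(m))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // No modifiers were written on the fields above, yet the check passes.
    System.out.println(allPublicStaticFinal(MemoryParams.class));
  }
}
```

The same check would fail for an ordinary class whose fields are declared without modifiers, which is why the reviewer's suggestion only matters if the constants ever move out of an interface.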
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-1856:
    Attachment: YARN-1856.003.patch

bq. The constants should have ‘MEMORY’ in their names. For example, CGROUP_PARAM_HARD_LIMIT is better named as CGROUP_PARAM_MEMORY_HARD_LIMIT in order to avoid future collisions. This is similar to how BLKIO is used in the previous line (classid should be fixed at some point too)

Fixed.

bq. Is the soft limit conf setting meant to represent a percentage or is it a fraction between 0 and 1? From the default value of 0.9f and the application of the soft limit it appears to be a fraction, but the name of the setting and its validation check seem to indicate that it is meant to be a percentage. This needs to be fixed.

Good catch. It's meant to be a percentage. Fixed.
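To illustrate the percentage-versus-fraction question raised above, here is a minimal sketch of applying a soft limit that is expressed as a percentage. The class name, the method name and the accepted range of (0, 100] are assumptions made for this sketch; the patch may validate and compute the value differently.

```java
class SoftLimitSketch {
  // Hypothetical helper: derives a soft limit in bytes from a hard limit and
  // a soft-limit setting expressed as a percentage. The accepted range
  // (0, 100] is an assumption of this sketch.
  static long softLimitBytes(long hardLimitBytes, double softLimitPercentage) {
    if (softLimitPercentage <= 0.0 || softLimitPercentage > 100.0) {
      throw new IllegalArgumentException(
          "soft limit percentage must be in (0, 100], got " + softLimitPercentage);
    }
    // The division by 100 happens in exactly one place, so the setting is
    // never silently treated as a fraction elsewhere.
    return (long) (hardLimitBytes * softLimitPercentage / 100.0);
  }

  public static void main(String[] args) {
    long hardLimit = 1000L;
    System.out.println(softLimitBytes(hardLimit, 90.0));
  }
}
```

With this convention a default of 0.9 would mean 0.9 percent of the hard limit rather than 90 percent, which is the kind of mismatch the reviewer pointed out between the default value and the setting's name.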
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-1856:
    Attachment: YARN-1856.002.patch

bq. Could you move these constants to CGroupsHandler? There are already some cgroups parameter constants defined there.

Fixed.

bq. IMO, if the behavior here is unpredictable, we should simply error out here in case both are enabled.

Fixed.

bq. We should make the fraction configurable, I think. What are the implications of the soft limit?

Fixed. It's a lower limit that is useful when there's memory contention.

bq. Since we are skipping changes to yarn-default.xml (based on changes I see in TestYarnConfigurationFields), these should be marked @Private, similar to how network/disk configs settings are annotated?

Fixed.

bq. Thinking aloud here: should we add support in some form for memory.oom_control and notifications/stats?

We should take that up as a separate JIRA.
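The "error out in case both are enabled" point above can be sketched as a fail-fast configuration check. The message does not say which two mechanisms "both" refers to; this sketch assumes they are the existing pmem monitor and the new cgroups memory handler. The configuration keys and the class name are hypothetical, not real YARN settings.

```java
import java.util.HashMap;
import java.util.Map;

class MemoryEnforcementCheck {
  // Hypothetical keys, for illustration only; they are not YARN settings.
  static final String PMEM_CHECK = "example.pmem-check.enabled";
  static final String CGROUPS_MEMORY = "example.cgroups-memory.enabled";

  // Fails fast when two memory enforcement mechanisms are enabled together,
  // instead of letting them interact in an unpredictable way.
  static void validate(Map<String, String> conf) {
    boolean pmem = Boolean.parseBoolean(conf.getOrDefault(PMEM_CHECK, "false"));
    boolean cgroups = Boolean.parseBoolean(conf.getOrDefault(CGROUPS_MEMORY, "false"));
    if (pmem && cgroups) {
      throw new IllegalStateException(
          "Both " + PMEM_CHECK + " and " + CGROUPS_MEMORY
              + " are enabled; disable one of them.");
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(CGROUPS_MEMORY, "true");
    validate(conf); // only one mechanism enabled, so this passes
    System.out.println("configuration accepted");
  }
}
```

Rejecting the combination at startup keeps the failure visible to the admin, rather than surfacing later as containers being killed by whichever mechanism reacts first.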
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-1856:
    Attachment: YARN-1856.001.patch

Uploaded a patch that adds support for cgroups based memory monitoring. I found that the default setting for swappiness results in a significant change in behaviour compared to the existing pmem monitor. I've added a configuration to let admins set the swappiness value, with the default being 0.
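A minimal sketch of how an admin-supplied swappiness value with a default of 0 might be resolved is shown below. The class and method names are hypothetical, and the accepted range of 0 to 100 is an assumption of this sketch rather than something stated in the message.

```java
class SwappinessConfig {
  // Default taken from the message above.
  static final int DEFAULT_SWAPPINESS = 0;

  // Resolves the swappiness to apply: falls back to the default when nothing
  // is configured and rejects values outside the assumed 0..100 range.
  static int resolve(String configured) {
    int value = (configured == null)
        ? DEFAULT_SWAPPINESS
        : Integer.parseInt(configured.trim());
    if (value < 0 || value > 100) {
      throw new IllegalArgumentException(
          "swappiness must be between 0 and 100, got " + value);
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(resolve(null)); // nothing configured, default applies
    System.out.println(resolve("60")); // admin override
  }
}
```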
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-1856:
    Target Version/s: 2.7.3 (was: 2.6.3, 2.7.3)
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-1856:
    Target Version/s: (was: 2.7.3)
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1856:
    Target Version/s: 2.6.3, 2.7.3 (was: 2.7.2, 2.6.3)

Moving out all non-critical / non-blocker issues that didn't make it out of 2.7.2 into 2.7.3.
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee updated YARN-1856:
    Target Version/s: 2.7.2, 2.6.3 (was: 2.7.2, 2.6.2)

Targeting 2.6.3 now that 2.6.2 has shipped.
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee updated YARN-1856:
    Target Version/s: 2.7.2, 2.6.2 (was: 2.6.1, 2.7.2)
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee updated YARN-1856:

Unless the patch is ready to go and the JIRA is a critical fix, we'll defer it to 2.6.2. Let me know if you have comments. Thanks!
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1856:
    Target Version/s: 2.6.1, 2.7.2 (was: 2.6.0)

Moving all tickets targeted for the already closed release 2.6.0 into 2.6.1/2.7.2.

> cgroups based memory monitoring for containers
> --
>
> Key: YARN-1856
> URL: https://issues.apache.org/jira/browse/YARN-1856
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1856:
    Target Version/s: 2.6.0 (was: 2.5.0)

> cgroups based memory monitoring for containers
> --
>
> Key: YARN-1856
> URL: https://issues.apache.org/jira/browse/YARN-1856
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla

--
This message was sent by Atlassian JIRA (v6.2#6252)