[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
[ https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063088#comment-17063088 ] Bilwa S T commented on YARN-10120: -- [~prabhujoseph] Thanks for the review. I have handled the first six comments. As for the seventh, I think the class name is already there at the end of BASEDIR. > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled > -- > > Key: YARN-10120 > URL: https://issues.apache.org/jira/browse/YARN-10120 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-10120.001.patch, YARN-10120.002.patch > > > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled. > yarn.router.webapp.https.address =router ip:8091 > {noformat} > 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error > handling URI: /cluster/apps > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at > com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) > at > 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at > com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > 
at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > at >
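The 500 above surfaces only when the Router web app runs over HTTPS. As a rough sketch of the triggering setup (hostname and port are illustrative; `yarn.router.webapp.https.address` is the property quoted in the report, and `yarn.http.policy` is the usual YARN switch for HTTPS):

```xml
<!-- Illustrative yarn-site.xml fragment only: enable HTTPS for YARN web apps,
     including the federation Router. Host below is a placeholder. -->
<property>
  <name>yarn.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
<property>
  <name>yarn.router.webapp.https.address</name>
  <value>router-host:8091</value>
</property>
```

With this in place, visiting the Router's Nodes/Applications/About pages reproduces the InvocationTargetException in the report.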
[jira] [Updated] (YARN-10201) Make AMRMProxyPolicy aware of SC load
[ https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Young Chen updated YARN-10201: -- Attachment: YARN-10201.v1.patch > Make AMRMProxyPolicy aware of SC load > - > > Key: YARN-10201 > URL: https://issues.apache.org/jira/browse/YARN-10201 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Young Chen >Assignee: Young Chen >Priority: Major > Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch > > > LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when > splitting resource requests. We propose changes to the policy so that it > receives feedback from SCs and can load balance requests across the federated > cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10003) YarnConfigurationStore#checkVersion throws exception that belongs to RMStateStore
[ https://issues.apache.org/jira/browse/YARN-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17063016#comment-17063016 ] Hadoop QA commented on YARN-10003: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 7m 34s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} branch-3.2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 12s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} branch-3.2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} branch-3.2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}392m 52s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}458m 50s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestParentQueue | | | hadoop.yarn.server.resourcemanager.scheduler.policy.TestFairOrderingPolicy | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocation | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerWithMultiResourceTypes | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority | | | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerDynamicBehavior | | |
[jira] [Updated] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled
[ https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10120: - Attachment: YARN-10120.002.patch > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled > -- > > Key: YARN-10120 > URL: https://issues.apache.org/jira/browse/YARN-10120 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Reporter: Sushanta Sen >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-10120.001.patch, YARN-10120.002.patch > > > In Federation Router Nodes/Applications/About pages throws 500 exception when > https is enabled. > yarn.router.webapp.https.address =router ip:8091 > {noformat} > 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error > handling URI: /cluster/apps > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at > com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) > at > com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) > at > com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) > at > com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) > at > com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) > at > 
com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829) > at > com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82) > at > com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133) > at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130) > at > com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203) > at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > at > 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > at > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119) > at >
[jira] [Updated] (YARN-10003) YarnConfigurationStore#checkVersion throws exception that belongs to RMStateStore
[ https://issues.apache.org/jira/browse/YARN-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke updated YARN-10003: - Attachment: YARN-10003.branch-3.2.POC002.patch > YarnConfigurationStore#checkVersion throws exception that belongs to > RMStateStore > - > > Key: YARN-10003 > URL: https://issues.apache.org/jira/browse/YARN-10003 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Benjamin Teke >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10003.001.patch, YARN-10003.002.patch, > YARN-10003.003.patch, YARN-10003.004.patch, YARN-10003.005.patch, > YARN-10003.branch-3.2.001.patch, YARN-10003.branch-3.2.POC001.patch, > YARN-10003.branch-3.2.POC002.patch > > > RMStateVersionIncompatibleException is thrown from method "checkVersion". > Moreover, there's a TODO here saying this method is copied from RMStateStore. > We should revise this method a bit. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
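The issue above is that YarnConfigurationStore#checkVersion throws RMStateVersionIncompatibleException, an RMStateStore type. A minimal, self-contained sketch of the direction a fix could take, with a store-specific exception instead; the class and method names here are illustrative stand-ins, not the actual patch:

```java
// Illustrative sketch: a version check that throws its own exception type
// instead of borrowing RMStateStore's RMStateVersionIncompatibleException.
public class VersionCheckSketch {

    // Hypothetical stand-in for a YarnConfigurationStore-specific exception.
    static class ConfStoreVersionIncompatibleException extends Exception {
        ConfStoreVersionIncompatibleException(String msg) { super(msg); }
    }

    static final int CURRENT_VERSION = 1;

    // Rejects stored versions newer than what this code understands.
    static void checkVersion(int storedVersion)
            throws ConfStoreVersionIncompatibleException {
        if (storedVersion > CURRENT_VERSION) {
            throw new ConfStoreVersionIncompatibleException(
                "Expecting version " + CURRENT_VERSION
                + ", but loading version " + storedVersion);
        }
    }

    public static void main(String[] args) {
        try {
            checkVersion(1); // compatible: no exception
            checkVersion(2); // incompatible: throws
        } catch (ConfStoreVersionIncompatibleException e) {
            // prints: incompatible: Expecting version 1, but loading version 2
            System.out.println("incompatible: " + e.getMessage());
        }
    }
}
```

Callers then catch a configuration-store exception rather than a state-store one, which is the separation the ticket asks for.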
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062750#comment-17062750 ] Manikandan R commented on YARN-10198: - [~pbacsko] Thanks for extending the patch. 1. Yes, unlike other placement rules in FS, "SecondaryGroupExistingPlacementRule" expects the queue to exist. Apart from "SecondaryGroupExistingPlacementRule", all other rules (even PrimaryGroupExistingPlacementRule) depend on the "FSPlacementRule#createQueue" flag in conjunction with FSPlacementRule#configuredQueue. The modified patch is in line with the above FS flow. Line 161 can make use of the getPrimaryGroup() method. > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch, YARN-10198-002.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat}
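The distinction debated above — %primary_group may map to a not-yet-existing queue under a managed parent, while %secondary_group only places into a queue that already exists — can be sketched as follows. This is a simplified stand-in, not the actual UserGroupMappingPlacementRule code:

```java
import java.util.List;
import java.util.Set;

// Simplified sketch of the two group-based mapping semantics discussed above.
public class GroupPlacementSketch {

    // %primary_group: always the user's first group; under a managed parent
    // the queue can be auto-created, so no existence check is needed.
    static String placePrimaryGroup(List<String> groups) {
        return groups.get(0);
    }

    // %secondary_group: first group after the primary whose queue already
    // exists; queues are never auto-created for this rule, hence the check.
    static String placeSecondaryGroup(List<String> groups,
                                      Set<String> existingQueues) {
        for (int i = 1; i < groups.size(); i++) {
            if (existingQueues.contains(groups.get(i))) {
                return groups.get(i);
            }
        }
        return null; // no placement
    }

    public static void main(String[] args) {
        List<String> groups = List.of("devs", "ops");
        System.out.println(placePrimaryGroup(groups));                  // devs
        System.out.println(placeSecondaryGroup(groups, Set.of("ops"))); // ops
    }
}
```

The unnecessary existence check that YARN-9868 added effectively applied the secondary-group semantics to %primary_group, which is what the patch reverts.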
[jira] [Updated] (YARN-10179) Queue mapping based on group id passed through application tag
[ https://issues.apache.org/jira/browse/YARN-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-10179: Attachment: YARN-10179.002.patch > Queue mapping based on group id passed through application tag > -- > > Key: YARN-10179 > URL: https://issues.apache.org/jira/browse/YARN-10179 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10179.001.patch, YARN-10179.002.patch > > > There are situations when the real submitting user differs from the user that > arrives at YARN. For example, in the case of a Hive application with Hive > impersonation turned off, the Hive queries will run as the Hive user and the > mapping is done based on that user's group. > Unfortunately, in this case YARN doesn't have any information about the real > user, and the customer may want to map these applications > to the real submitting user's queue (based on the group id) instead of the > Hive queue. > For these cases, if the group id (or name) is passed in the application > tag, we can read it and use it during queue mapping, provided that user has > rights to run on the real user's queue. >
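As a rough illustration of the proposal above: the submitter would attach the real user's group in an application tag, and the scheduler would read it back during queue mapping. The "userGroup=" tag convention below is invented for illustration; the actual tag format would be defined by the YARN-10179 patch:

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of reading a group name back out of application tags.
public class AppTagGroupSketch {

    // Assumed tag convention (not an actual YARN format).
    static final String GROUP_TAG_PREFIX = "userGroup=";

    // Returns the group carried in the tags, if any.
    static Optional<String> groupFromTags(Set<String> applicationTags) {
        return applicationTags.stream()
                .filter(tag -> tag.startsWith(GROUP_TAG_PREFIX))
                .map(tag -> tag.substring(GROUP_TAG_PREFIX.length()))
                .findFirst();
    }

    public static void main(String[] args) {
        Set<String> tags = Set.of("userGroup=analysts", "priority=high");
        System.out.println(groupFromTags(tags).orElse("<none>")); // analysts
    }
}
```

A real implementation would additionally verify, as the description notes, that the tagged group's user actually has rights to the target queue before honoring the tag.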
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062565#comment-17062565 ] Gergely Pollak commented on YARN-9879: -- [~sunilg] The latest errors are flaky tests; we keep retriggering the Jenkins job to get a green run, but if you check the last two runs, completely different tests fail for the same patch set. Also, the last patch only contains cosmetic and log message changes, so it shouldn't cause any issue; we are still trying to get a Jenkins +1. > Allow multiple leaf queues with the same name in CS > --- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, > YARN-9879.014.patch, YARN-9879.POC001.patch, YARN-9879.POC002.patch, > YARN-9879.POC003.patch, YARN-9879.POC004.patch, YARN-9879.POC005.patch, > YARN-9879.POC006.patch, YARN-9879.POC007.patch, YARN-9879.POC008.patch, > YARN-9879.POC009.patch, YARN-9879.POC010.patch, YARN-9879.POC011.patch, > YARN-9879.POC012.patch, YARN-9879.POC013.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > A design doc and first proposal are being made; I'll attach them as soon as they're > done.
[jira] [Comment Edited] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062531#comment-17062531 ] Peter Bacsko edited comment on YARN-10198 at 3/19/20, 12:39 PM: I uploaded patch v2. Couple of things: 1. The most important: I was reasoning about the validation in case of {{%secondary_group}} and based on the existing code, this cannot have a managed parent. The queue has to exist, see {{getSecondaryGroup()}}. This also seems to be in line with Fair Scheduler, where this placement rule is called "SecondaryGroupExistingPlacementRule". Please confirm this [~maniraj...@gmail.com], [~prabhujoseph]. 2. I had to do some refactor in {{UserGroupMappingPlacementRule}} because readability is becoming more of a concern with the added features. It will be addressed by YARN-10199 hopefully. 3. Added extra unit tests. Existing tests are not broken by this change (at least not the ones in {{TestUserGroupMappingPlacementRule}}). was (Author: pbacsko): I uploaded patch v2. Couple of things: 1. I was reasoning about the validation in case of {{%secondary_group}} and based on the existing code, this cannot have a managed parent. The queue has to exist, see {{getSecondaryGroup()}}. This also seems to be in line with Fair Scheduler, where this placement rule is called "SecondaryGroupExistingPlacementRule". 2. I had to do some refactor in {{UserGroupMappingPlacementRule}} because readability is becoming more of a concern with the added features. It will be addressed by YARN-10199 hopefully. 3. Added extra unit tests. Existing tests are not broken by this change (at least not the ones in {{TestUserGroupMappingPlacementRule}}). 
> [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch, YARN-10198-002.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062531#comment-17062531 ] Peter Bacsko commented on YARN-10198: - I uploaded patch v2. Couple of things: 1. I was reasoning about the validation in case of {{%secondary_group}} and based on the existing code, this cannot have a managed parent. The queue has to exist, see {{getSecondaryGroup()}}. This also seems to be in line with Fair Scheduler, where this placement rule is called "SecondaryGroupExistingPlacementRule". 2. I had to do some refactor in {{UserGroupMappingPlacementRule}} because readability is becoming more of a concern with the added features. It will be addressed by YARN-10199 hopefully. 3. Added extra unit tests. Existing tests are not broken by this change (at least not the ones in {{TestUserGroupMappingPlacementRule}}). > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch, YARN-10198-002.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10198: Attachment: YARN-10198-002.patch > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch, YARN-10198-002.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10199) Simplify UserGroupMappingPlacementRule#getPlacementForUser
[ https://issues.apache.org/jira/browse/YARN-10199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-10199: Attachment: YARN-10199.003.patch > Simplify UserGroupMappingPlacementRule#getPlacementForUser > -- > > Key: YARN-10199 > URL: https://issues.apache.org/jira/browse/YARN-10199 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Minor > Attachments: YARN-10199.001.patch, YARN-10199.002.patch, > YARN-10199.003.patch > > > The UserGroupMappingPlacementRule#getPlacementForUser method, which is mainly > responsible for queue naming, contains deeply nested branches. In order to > make the mapping logic extendable, the branches could be flattened and > simplified.
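The flattening described above is the classic guard-clause refactoring. A generic before/after sketch of the idea — the method names and conditions here are illustrative, not the actual getPlacementForUser code:

```java
// Generic sketch of flattening nested branches with guard clauses.
public class GuardClauseSketch {

    // Nested style: each condition adds a level of indentation.
    static String resolveNested(String mapped, boolean queueExists,
                                boolean parentManaged) {
        if (mapped != null) {
            if (queueExists) {
                return mapped;
            } else {
                if (parentManaged) {
                    return mapped; // queue will be auto-created
                } else {
                    return null;
                }
            }
        }
        return null;
    }

    // Flattened style: early returns keep every branch one level deep.
    static String resolveFlat(String mapped, boolean queueExists,
                              boolean parentManaged) {
        if (mapped == null) return null;
        if (queueExists) return mapped;
        if (parentManaged) return mapped; // queue will be auto-created
        return null;
    }

    public static void main(String[] args) {
        System.out.println(resolveFlat("q1", false, true)); // q1
    }
}
```

Both variants return identical results; the flat one is easier to extend with new mapping cases, which is the point of the ticket.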
[jira] [Updated] (YARN-10179) Queue mapping based on group id passed through application tag
[ https://issues.apache.org/jira/browse/YARN-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-10179: Attachment: YARN-10179.001.patch > Queue mapping based on group id passed through application tag > -- > > Key: YARN-10179 > URL: https://issues.apache.org/jira/browse/YARN-10179 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10179.001.patch > > > There are situations when the real submitting user differs from the user that > arrives at YARN. For example, in the case of a Hive application with Hive > impersonation turned off, the Hive queries will run as the Hive user and the > mapping is done based on that user's group. > Unfortunately, in this case YARN doesn't have any information about the real > user, and the customer may want to map these applications > to the real submitting user's queue (based on the group id) instead of the > Hive queue. > For these cases, if the group id (or name) is passed in the application > tag, we can read it and use it during queue mapping, provided that user has > rights to run on the real user's queue. >
[jira] [Updated] (YARN-10179) Queue mapping based on group id passed through application tag
[ https://issues.apache.org/jira/browse/YARN-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-10179: Attachment: (was: YARN-10179.001.patch) > Queue mapping based on group id passed through application tag > -- > > Key: YARN-10179 > URL: https://issues.apache.org/jira/browse/YARN-10179 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Major > Fix For: 3.3.0 > > > There are situations when the real submitting user differs from the user that > arrives at YARN. For example, in the case of a Hive application with Hive > impersonation turned off, the Hive queries will run as the Hive user and the > mapping is done based on that user's group. > Unfortunately, in this case YARN doesn't have any information about the real > user, and the customer may want to map these applications > to the real submitting user's queue (based on the group id) instead of the > Hive queue. > For these cases, if the group id (or name) is passed in the application > tag, we can read it and use it during queue mapping, provided that user has > rights to run on the real user's queue. >
[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062438#comment-17062438 ] Prabhu Joseph commented on YARN-10160: -- Thanks [~sunilg] for reviewing. Have modified testcase to test the new change in [^YARN-10160-004.patch] . > Add auto queue creation related configs to > RMWebService#CapacitySchedulerQueueInfo > -- > > Key: YARN-10160 > URL: https://issues.apache.org/jira/browse/YARN-10160 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, > YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, > YARN-10160-004.patch > > > Add auto queue creation related configs to > RMWebService#CapacitySchedulerQueueInfo. > {code} > yarn.scheduler.capacity..auto-create-child-queue.enabled > yarn.scheduler.capacity..leaf-queue-template. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-10160: - Attachment: YARN-10160-004.patch > Add auto queue creation related configs to > RMWebService#CapacitySchedulerQueueInfo > -- > > Key: YARN-10160 > URL: https://issues.apache.org/jira/browse/YARN-10160 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, > YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, > YARN-10160-004.patch > > > Add auto queue creation related configs to > RMWebService#CapacitySchedulerQueueInfo. > {code} > yarn.scheduler.capacity..auto-create-child-queue.enabled > yarn.scheduler.capacity..leaf-queue-template. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10034) Allocation tags are not removed when node decommission
[ https://issues.apache.org/jira/browse/YARN-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062421#comment-17062421 ] kyungwan nam commented on YARN-10034: - [~prabhujoseph], [~adam.antal] Thank you for the review and commit! > Allocation tags are not removed when node decommission > -- > > Key: YARN-10034 > URL: https://issues.apache.org/jira/browse/YARN-10034 > Project: Hadoop YARN > Issue Type: Bug >Reporter: kyungwan nam >Assignee: kyungwan nam >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10034.001.patch, YARN-10034.002.patch, > YARN-10034.003.patch > > > When a node is decommissioned, allocation tags that are attached to the node > are not removed. > I could see that allocation tags are revived when recommissioning the node. > RM removes allocation tags only if NM confirms the container releases by > YARN-8511. but, decommissioned NM does not connect to RM anymore. > Once a node is decommissioned, allocation tags that attached to the node > should be removed immediately. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062414#comment-17062414 ] Peter Bacsko commented on YARN-10198: - [~sunilg] the patch needs to be extended. In its current form, it's not enough. I'm going to attach a new version today. > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
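The YARN-10198 description contrasts two code paths for `[managedParent].%primary_group`: the YARN-9868 version, which bails out when the primary-group queue does not exist yet, and the original version, which always maps and lets the managed parent auto-create the leaf queue. The difference can be sketched with plain stand-ins (these are illustrative helpers, not the actual `UserGroupMappingPlacementRule` code):

```java
public class PrimaryGroupPlacement {
    // Behaviour introduced by YARN-9868: refuse placement (return null) when
    // the primary-group queue has not been created yet. Under a managed
    // parent this check is unnecessary and breaks auto-creation.
    public static String placeWithCheck(String primaryGroup, boolean queueExists) {
        return queueExists ? primaryGroup : null;
    }

    // Original behaviour the issue reverts to: always map to the primary
    // group; a managed parent queue will auto-create the leaf on demand.
    public static String placeWithoutCheck(String primaryGroup) {
        return primaryGroup;
    }
}
```

With a managed parent, `placeWithCheck` returns null for any group whose queue has not yet been auto-created, which is exactly the regression described above.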
[jira] [Commented] (YARN-9088) Non-exclusive labels break QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062390#comment-17062390 ] Anuj commented on YARN-9088: In our setup, we are facing a similar issue, in which the global view of pending and available resources gets messed up. > Non-exclusive labels break QueueMetrics > --- > > Key: YARN-9088 > URL: https://issues.apache.org/jira/browse/YARN-9088 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, resourcemanager >Affects Versions: 2.8.5 >Reporter: Brandon Scheller >Priority: Major > Labels: metrics, nodelabel > > QueueMetrics are broken (random/negative values) when non-exclusive labels > are being used and unlabeled containers run on labeled nodes. > This is caused by the change in the patch here: > https://issues.apache.org/jira/browse/YARN-6467 > It assumes that a container's label will be the same as the node's label that > it is running on. > If you look within the patch, sometimes metrics are updated using the > request.getNodeLabelExpression(). And sometimes they are updated using > node.getPartition(). > This means that in the case where the node is labeled while the container > request isn't, these metrics only get updated when referring to the default > queue. This stops metrics from balancing out and results in incorrect and > negative values in QueueMetrics. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
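The drift the YARN-9088 description explains — incrementing a counter keyed by `request.getNodeLabelExpression()` but decrementing one keyed by `node.getPartition()` — can be reproduced with a toy counter. This is a simplified stand-in, not the real `QueueMetrics` class:

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionMetrics {
    private final Map<String, Integer> pendingByPartition = new HashMap<>();

    private void add(String partition, int delta) {
        pendingByPartition.merge(partition, delta, Integer::sum);
    }

    // Buggy pattern from the description: allocation is charged to the
    // request's label expression, release is credited to the node's
    // partition. For an unlabeled request on a labeled node the keys differ,
    // so the counters never balance out.
    public void allocateBuggy(String requestLabel, String nodePartition) {
        add(requestLabel, 1);
    }
    public void releaseBuggy(String requestLabel, String nodePartition) {
        add(nodePartition, -1);
    }

    // Consistent pattern: charge and credit the same key.
    public void allocateFixed(String nodePartition) { add(nodePartition, 1); }
    public void releaseFixed(String nodePartition)  { add(nodePartition, -1); }

    public int get(String partition) {
        return pendingByPartition.getOrDefault(partition, 0);
    }
}
```

With the buggy pair, one allocate/release cycle of an unlabeled request ("" partition) on a "gpu" node leaves a stale +1 on the default partition and a negative value on "gpu", matching the symptoms reported above.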
[jira] [Commented] (YARN-10034) Allocation tags are not removed when node decommission
[ https://issues.apache.org/jira/browse/YARN-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062349#comment-17062349 ] Hudson commented on YARN-10034: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18066 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18066/]) YARN-10034. Remove Allocation Tags from released container from (pjoseph: rev f2d3ac2a3f27a849e00f529c5c2df6ef0bd82911) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java > Allocation tags are not removed when node decommission > -- > > Key: YARN-10034 > URL: https://issues.apache.org/jira/browse/YARN-10034 > Project: Hadoop YARN > Issue Type: Bug >Reporter: kyungwan nam >Assignee: kyungwan nam >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10034.001.patch, YARN-10034.002.patch, > YARN-10034.003.patch > > > When a node is decommissioned, allocation tags that are attached to the node > are not removed. > I could see that allocation tags are revived when recommissioning the node. > RM removes allocation tags only if NM confirms the container releases by > YARN-8511. but, decommissioned NM does not connect to RM anymore. > Once a node is decommissioned, allocation tags that attached to the node > should be removed immediately. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10034) Allocation tags are not removed when node decommission
[ https://issues.apache.org/jira/browse/YARN-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062343#comment-17062343 ] Prabhu Joseph commented on YARN-10034: -- Thanks [~kyungwan nam] for the patch and [~adam.antal] for the review. Have committed the [^YARN-10034.003.patch] to trunk. > Allocation tags are not removed when node decommission > -- > > Key: YARN-10034 > URL: https://issues.apache.org/jira/browse/YARN-10034 > Project: Hadoop YARN > Issue Type: Bug >Reporter: kyungwan nam >Assignee: kyungwan nam >Priority: Major > Attachments: YARN-10034.001.patch, YARN-10034.002.patch, > YARN-10034.003.patch > > > When a node is decommissioned, allocation tags that are attached to the node > are not removed. > I could see that allocation tags are revived when recommissioning the node. > RM removes allocation tags only if NM confirms the container releases by > YARN-8511. but, decommissioned NM does not connect to RM anymore. > Once a node is decommissioned, allocation tags that attached to the node > should be removed immediately. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
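The fix committed for YARN-10034 removes a decommissioned node's allocation tags at node-removal time instead of waiting for a container-release confirmation the NM will never send. The shape of that cleanup can be sketched with a minimal stand-in (this is not the actual `AllocationTagsManager`; all names here are illustrative):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AllocationTagCleanup {
    // Simplified view of the tags store: node id -> tags attached to it.
    private final Map<String, Set<String>> tagsByNode = new HashMap<>();

    public void addTag(String nodeId, String tag) {
        tagsByNode.computeIfAbsent(nodeId, k -> new HashSet<>()).add(tag);
    }

    // Called from the scheduler's NODE_REMOVED handling (decommission):
    // drop all tags for the node immediately, since the decommissioned NM
    // will never reconnect to confirm its container releases.
    public void onNodeRemoved(String nodeId) {
        tagsByNode.remove(nodeId);
    }

    public Set<String> getTags(String nodeId) {
        return tagsByNode.getOrDefault(nodeId, Set.of());
    }
}
```

Without the `onNodeRemoved` step, the stale tags would "revive" when the node is recommissioned, which is exactly the behaviour the issue reports.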
[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062322#comment-17062322 ] Sunil G commented on YARN-10160: [~prabhujoseph] pls add test cases for the new change. > Add auto queue creation related configs to > RMWebService#CapacitySchedulerQueueInfo > -- > > Key: YARN-10160 > URL: https://issues.apache.org/jira/browse/YARN-10160 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, > YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch > > > Add auto queue creation related configs to > RMWebService#CapacitySchedulerQueueInfo. > {code} > yarn.scheduler.capacity..auto-create-child-queue.enabled > yarn.scheduler.capacity..leaf-queue-template. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062320#comment-17062320 ] Sunil G commented on YARN-10198: [~prabhujoseph] [~maniraj...@gmail.com] [~pbacsko] Do we have consensus on the attached patch and the approach? I think the patch looks fine. Thoughts? > [managedParent].%primary_group mapping rule doesn't work after YARN-9868 > > > Key: YARN-10198 > URL: https://issues.apache.org/jira/browse/YARN-10198 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10198-001.patch > > > YARN-9868 introduced an unnecessary check if we have the following placement > rule: > [managedParentQueue].%primary_group > Here, {{%primary_group}} is expected to be created if it doesn't exist. > However, there is this validation code which is not necessary: > {noformat} > } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) { > if (this.queueManager > .getQueue(groups.getGroups(user).get(0)) != null) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } else { > return null; > } > {noformat} > We should revert this part to the original version: > {noformat} > } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) { > return getPlacementContext(mapping, > groups.getGroups(user).get(0)); > } > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062318#comment-17062318 ] Sunil G commented on YARN-10194: +1 for this change. I'll commit shortly > YARN RMWebServices /scheduler-conf/validate leaks ZK Connections > > > Key: YARN-10194 > URL: https://issues.apache.org/jira/browse/YARN-10194 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.3.0 >Reporter: Akhil PB >Assignee: Prabhu Joseph >Priority: Critical > Attachments: YARN-10194-001.patch > > > YARN RMWebServices /scheduler-conf/validate leaks ZK Connections. The validation > API creates a new CapacityScheduler and misses closing it after validation. > Every CapacityScheduler#init opens MutableCSConfigurationProvider, which opens > ZKConfigurationStore and creates a ZK Connection. > *ZK LOGS* > {code} > 2020-03-12 16:45:51,881 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 > times] Error accepting new connection: Too many connections from > /172.27.99.64 - max is 60 > 2020-03-12 16:45:52,449 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new > connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:52,710 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new > connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:52,876 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting > new connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:53,068 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting > new connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:53,391 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting > new connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:54,008 WARN > 
org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new > connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:54,287 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new > connection: Too many connections from /172.27.99.64 - max is 60 > 2020-03-12 16:45:54,483 WARN > org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting > new connection: Too many connections from /172.27.99.64 - max is 60 > {code} > And there is another bug in ZKConfigurationStore, which does not handle > close() of ZKCuratorManager. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
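The leak pattern in YARN-10194 — a validation call constructing a scheduler whose config store opens a ZK connection that is never closed — and its fix can be sketched with a toy store that counts open connections. This is an illustrative stand-in, not the actual CapacityScheduler or ZKConfigurationStore code:

```java
public class ValidationLeakFix {
    // Minimal stand-in for a config store that holds one ZK connection.
    public static class FakeStore implements AutoCloseable {
        public static int openConnections = 0;
        public FakeStore() { openConnections++; }   // connect on init
        public void validate(String config) { /* pretend to validate */ }
        @Override public void close() { openConnections--; }  // release conn
    }

    // Fixed shape of the validation path: try-with-resources guarantees the
    // store (and its underlying ZK connection) is closed even if validation
    // throws. The bug was the equivalent of calling new FakeStore().validate()
    // with no close(), leaking one connection per /scheduler-conf/validate hit.
    public static void validateConfig(String config) {
        try (FakeStore store = new FakeStore()) {
            store.validate(config);
        }
    }
}
```

Each repeated validate call then nets zero open connections, instead of growing until ZooKeeper's per-host limit ("max is 60" in the log above) is hit.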
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17062315#comment-17062315 ] Sunil G commented on YARN-9879: --- [~shuzirra] could you please check the latest errors. > Allow multiple leaf queues with the same name in CS > --- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, > YARN-9879.014.patch, YARN-9879.POC001.patch, YARN-9879.POC002.patch, > YARN-9879.POC003.patch, YARN-9879.POC004.patch, YARN-9879.POC005.patch, > YARN-9879.POC006.patch, YARN-9879.POC007.patch, YARN-9879.POC008.patch, > YARN-9879.POC009.patch, YARN-9879.POC010.patch, YARN-9879.POC011.patch, > YARN-9879.POC012.patch, YARN-9879.POC013.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > A design doc and first proposal are being made; I'll attach them as soon as they're > done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org