[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2020-03-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063088#comment-17063088
 ] 

Bilwa S T commented on YARN-10120:
--

[~prabhujoseph] Thanks for the review. I have handled the first six comments. 
For the 7th one, I think the class name is already there at the end of BASEDIR.

> In Federation Router Nodes/Applications/About pages throws 500 exception when 
> https is enabled
> --
>
> Key: YARN-10120
> URL: https://issues.apache.org/jira/browse/YARN-10120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-10120.001.patch, YARN-10120.002.patch
>
>
> In Federation, the Router Nodes/Applications/About pages throw a 500 exception 
> when https is enabled.
> yarn.router.webapp.https.address = <router ip>:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> 

[jira] [Updated] (YARN-10201) Make AMRMProxyPolicy aware of SC load

2020-03-19 Thread Young Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Young Chen updated YARN-10201:
--
Attachment: YARN-10201.v1.patch

> Make AMRMProxyPolicy aware of SC load
> -
>
> Key: YARN-10201
> URL: https://issues.apache.org/jira/browse/YARN-10201
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: amrmproxy
>Reporter: Young Chen
>Assignee: Young Chen
>Priority: Major
> Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch
>
>
> LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when 
> splitting resource requests. We propose changes to the policy so that it 
> receives feedback from SCs and can load balance requests across the federated 
> cluster.
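
For a sense of what the proposal could amount to, here is a minimal sketch that 
scales per-subcluster routing weights by a hypothetical load score and 
renormalizes them (the names and the feedback format are assumptions, not the 
attached patch):

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: scale each subcluster's routing weight down as its
// reported load goes up, then renormalize so the weights sum to 1.
public final class LoadAwareWeights {
  public static Map<String, Float> rebalance(Map<String, Float> weights,
      Map<String, Float> loadScores) { // loadScores in [0, 1], 1 = saturated
    Map<String, Float> adjusted = new HashMap<>();
    float total = 0f;
    for (Map.Entry<String, Float> e : weights.entrySet()) {
      float load = loadScores.getOrDefault(e.getKey(), 0f);
      float w = e.getValue() * (1f - load); // penalize loaded subclusters
      adjusted.put(e.getKey(), w);
      total += w;
    }
    for (Map.Entry<String, Float> e : adjusted.entrySet()) {
      e.setValue(total > 0f ? e.getValue() / total : 0f);
    }
    return adjusted;
  }
}
{code}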



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10003) YarnConfigurationStore#checkVersion throws exception that belongs to RMStateStore

2020-03-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063016#comment-17063016
 ] 

Hadoop QA commented on YARN-10003:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  7m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
12s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}392m 52s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}458m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector 
|
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestParentQueue |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.policy.TestFairOrderingPolicy |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocation
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerWithMultiResourceTypes
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
|   | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerDynamicBehavior
 |
|   | 

[jira] [Updated] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2020-03-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10120:
-
Attachment: YARN-10120.002.patch

> In Federation Router Nodes/Applications/About pages throws 500 exception when 
> https is enabled
> --
>
> Key: YARN-10120
> URL: https://issues.apache.org/jira/browse/YARN-10120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-10120.001.patch, YARN-10120.002.patch
>
>
> In Federation, the Router Nodes/Applications/About pages throw a 500 exception 
> when https is enabled.
> yarn.router.webapp.https.address = <router ip>:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>   at 
> 

[jira] [Updated] (YARN-10003) YarnConfigurationStore#checkVersion throws exception that belongs to RMStateStore

2020-03-19 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-10003:
-
Attachment: YARN-10003.branch-3.2.POC002.patch

> YarnConfigurationStore#checkVersion throws exception that belongs to 
> RMStateStore
> -
>
> Key: YARN-10003
> URL: https://issues.apache.org/jira/browse/YARN-10003
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10003.001.patch, YARN-10003.002.patch, 
> YARN-10003.003.patch, YARN-10003.004.patch, YARN-10003.005.patch, 
> YARN-10003.branch-3.2.001.patch, YARN-10003.branch-3.2.POC001.patch, 
> YARN-10003.branch-3.2.POC002.patch
>
>
> RMStateVersionIncompatibleException is thrown from method "checkVersion".
> Moreover, there's a TODO here saying this method is copied from RMStateStore. 
> We should revise this method a bit.
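
As a hedged sketch of the revision suggested here, checkVersion could throw a 
store-specific exception instead of the one borrowed from RMStateStore (the 
class and accessor names below are assumptions, not the attached patch):

{code}
import org.apache.hadoop.yarn.server.records.Version;

// Sketch: a configuration-store-scoped exception plus a checkVersion that
// no longer reuses RMStateVersionIncompatibleException.
public abstract class VersionedConfStoreSketch {
  protected abstract Version getConfStoreVersion() throws Exception;
  protected abstract Version getCurrentVersion();
  protected abstract void storeVersion() throws Exception;

  public static class ConfStoreVersionIncompatibleException extends Exception {
    public ConfStoreVersionIncompatibleException(String message) {
      super(message);
    }
  }

  public void checkVersion() throws Exception {
    Version loaded = getConfStoreVersion();
    if (loaded != null && loaded.equals(getCurrentVersion())) {
      return;
    }
    if (loaded == null || loaded.isCompatibleTo(getCurrentVersion())) {
      storeVersion(); // persist the current, compatible version
    } else {
      throw new ConfStoreVersionIncompatibleException(
          "Expecting conf store version " + getCurrentVersion()
              + ", but loading version " + loaded);
    }
  }
}
{code}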



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868

2020-03-19 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062750#comment-17062750
 ] 

Manikandan R commented on YARN-10198:
-

[~pbacsko] Thanks for extending the patch.

1. Yes, unlike other placement rules in FS, 
"SecondaryGroupExistingPlacementRule" expects the queue to exist. Except for 
"SecondaryGroupExistingPlacementRule", all other rules (even 
PrimaryGroupExistingPlacementRule) also depend on the 
"FSPlacementRule#createQueue" flag in conjunction with 
FSPlacementRule#configuredQueue.

The modified patch is in line with the FS flow above.

Line 161 can make use of the getPrimaryGroup() method.
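
For illustration, the FS-side distinction described above boils down to roughly 
this (a simplified sketch with made-up names, not the actual FSPlacementRule 
code):

{code}
// Sketch: most FS rules may create the target queue when the
// FSPlacementRule#createQueue flag allows it, while a rule like
// SecondaryGroupExistingPlacementRule requires the queue to already exist.
final class PlacementSketch {
  static String assign(String queue, boolean queueExists, boolean createFlag,
      boolean ruleRequiresExistingQueue) {
    if (queueExists) {
      return queue;                   // queue already present: use it
    }
    if (ruleRequiresExistingQueue) {
      return null;                    // secondary-group style rule: no match
    }
    return createFlag ? queue : null; // honor the createQueue flag
  }
}
{code}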


> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> 
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10198-001.patch, YARN-10198-002.patch
>
>
> YARN-9868 introduced an unnecessary check if we have the following placement 
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary:
> {noformat}
>   } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
> if (this.queueManager
> .getQueue(groups.getGroups(user).get(0)) != null) {
>   return getPlacementContext(mapping,
>   groups.getGroups(user).get(0));
> } else {
>   return null;
> }
> {noformat}
> We should revert this part to the original version:
> {noformat}
>   } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
> return getPlacementContext(mapping, 
> groups.getGroups(user).get(0));
> }
> {noformat}
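
For reference, the rule under discussion is configured through the CS queue 
mappings; a mapping entry of this shape is what triggers the code path above 
("managedParentQueue" is a placeholder):

{code}
yarn.scheduler.capacity.queue-mappings = u:%user:managedParentQueue.%primary_group
{code}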



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10179) Queue mapping based on group id passed through application tag

2020-03-19 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10179:

Attachment: YARN-10179.002.patch

> Queue mapping based on group id passed through application tag
> --
>
> Key: YARN-10179
> URL: https://issues.apache.org/jira/browse/YARN-10179
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Szilard Nemeth
>Assignee: Andras Gyori
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10179.001.patch, YARN-10179.002.patch
>
>
> There are situations when the real submitting user differs from the user that 
> arrives at YARN. For example, in the case of a Hive application when Hive 
> impersonation is turned off, the hive queries will run as the hive user and the 
> mapping is done based on that user's group. 
> Unfortunately, in this case YARN doesn't have any information about the real 
> user, and there are cases when the customer may want to map these applications 
> to the real submitting user's queue (based on the group id) instead of the 
> Hive queue.
> For these cases, if they pass the group id (or name) in the application 
> tag, we may read it and use it during the queue mapping, provided that user has 
> rights to run on the real user's queue.
>  
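
A hedged sketch of the client side of this idea, using the existing application 
tags API (the {{userGroup=}} tag format is an assumption for illustration; only 
{{setApplicationTags}} is existing YARN API):

{code}
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

// Illustration: attach the real submitter's group as an application tag so
// that RM-side queue mapping could take it into account.
public class GroupTagSubmitSketch {
  public static ApplicationSubmissionContext withGroupTag(String group) {
    ApplicationSubmissionContext ctx =
        Records.newRecord(ApplicationSubmissionContext.class);
    // "userGroup=" is an assumed convention, not a defined YARN tag format.
    ctx.setApplicationTags(Collections.singleton("userGroup=" + group));
    return ctx;
  }
}
{code}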



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-03-19 Thread Gergely Pollak (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062565#comment-17062565
 ] 

Gergely Pollak commented on YARN-9879:
--

[~sunilg] The latest errors are flaky tests; we keep retriggering the jenkins 
job to get a green run, but if you check the last 2 runs, completely different 
tests fail for the same patch set. Also, the last patch only contains cosmetic 
and log message changes, so it shouldn't cause any issue, but we are trying to 
get a jenkins +1.

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
>  Labels: fs2cs
> Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, 
> YARN-9879.014.patch, YARN-9879.POC001.patch, YARN-9879.POC002.patch, 
> YARN-9879.POC003.patch, YARN-9879.POC004.patch, YARN-9879.POC005.patch, 
> YARN-9879.POC006.patch, YARN-9879.POC007.patch, YARN-9879.POC008.patch, 
> YARN-9879.POC009.patch, YARN-9879.POC010.patch, YARN-9879.POC011.patch, 
> YARN-9879.POC012.patch, YARN-9879.POC013.patch
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being made; I'll attach them as soon as 
> they're done.
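
For illustration, the kind of hierarchy this change would allow (queue names 
are placeholders):

{noformat}
root.engineering.tools
root.marketing.tools    <-- same leaf name "tools" under a different parent;
                            rejected today, allowed after this change
{noformat}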



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868

2020-03-19 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062531#comment-17062531
 ] 

Peter Bacsko edited comment on YARN-10198 at 3/19/20, 12:39 PM:


I uploaded patch v2.

Couple of things:
1. The most important: I was reasoning about the validation in the case of 
{{%secondary_group}}, and based on the existing code, this cannot have a managed 
parent. The queue has to exist; see {{getSecondaryGroup()}}. This also seems to 
be in line with the Fair Scheduler, where this placement rule is called 
"SecondaryGroupExistingPlacementRule". Please confirm this, 
[~maniraj...@gmail.com], [~prabhujoseph].

2. I had to do some refactoring in {{UserGroupMappingPlacementRule}} because 
readability is becoming more of a concern with the added features. Hopefully it 
will be addressed by YARN-10199.

3. Added extra unit tests. Existing tests are not broken by this change (at 
least not the ones in {{TestUserGroupMappingPlacementRule}}).


was (Author: pbacsko):
I uploaded patch v2.

Couple of things:
1. I was reasoning about the validation in the case of {{%secondary_group}}, 
and based on the existing code, this cannot have a managed parent. The queue 
has to exist; see {{getSecondaryGroup()}}. This also seems to be in line with 
the Fair Scheduler, where this placement rule is called 
"SecondaryGroupExistingPlacementRule".

2. I had to do some refactoring in {{UserGroupMappingPlacementRule}} because 
readability is becoming more of a concern with the added features. Hopefully it 
will be addressed by YARN-10199.

3. Added extra unit tests. Existing tests are not broken by this change (at 
least not the ones in {{TestUserGroupMappingPlacementRule}}).
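
For illustration, the existence requirement described in point 1 amounts to 
roughly this lookup (a simplified sketch, not the actual {{getSecondaryGroup()}} 
implementation):

{code}
import java.util.List;

// Sketch: walk the user's groups past the primary one and return the first
// group that already has a queue; no queue is ever created by this rule.
final class SecondaryGroupSketch {
  interface QueueLookup { boolean queueExists(String name); }

  static String getSecondaryGroup(List<String> groups, QueueLookup queues) {
    for (int i = 1; i < groups.size(); i++) { // index 0 is the primary group
      if (queues.queueExists(groups.get(i))) {
        return groups.get(i);
      }
    }
    return null;                              // no match: placement fails
  }
}
{code}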

> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> 
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10198-001.patch, YARN-10198-002.patch
>
>
> YARN-9868 introduced an unnecessary check if we have the following placement 
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary:
> {noformat}
>   } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
> if (this.queueManager
> .getQueue(groups.getGroups(user).get(0)) != null) {
>   return getPlacementContext(mapping,
>   groups.getGroups(user).get(0));
> } else {
>   return null;
> }
> {noformat}
> We should revert this part to the original version:
> {noformat}
>   } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
> return getPlacementContext(mapping, 
> groups.getGroups(user).get(0));
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868

2020-03-19 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062531#comment-17062531
 ] 

Peter Bacsko commented on YARN-10198:
-

I uploaded patch v2.

Couple of things:
1. I was reasoning about the validation in the case of {{%secondary_group}}, 
and based on the existing code, this cannot have a managed parent. The queue 
has to exist; see {{getSecondaryGroup()}}. This also seems to be in line with 
the Fair Scheduler, where this placement rule is called 
"SecondaryGroupExistingPlacementRule".

2. I had to do some refactoring in {{UserGroupMappingPlacementRule}} because 
readability is becoming more of a concern with the added features. Hopefully it 
will be addressed by YARN-10199.

3. Added extra unit tests. Existing tests are not broken by this change (at 
least not the ones in {{TestUserGroupMappingPlacementRule}}).

> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> 
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10198-001.patch, YARN-10198-002.patch
>
>
> YARN-9868 introduced an unnecessary check if we have the following placement 
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary:
> {noformat}
>   } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
> if (this.queueManager
> .getQueue(groups.getGroups(user).get(0)) != null) {
>   return getPlacementContext(mapping,
>   groups.getGroups(user).get(0));
> } else {
>   return null;
> }
> {noformat}
> We should revert this part to the original version:
> {noformat}
>   } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
> return getPlacementContext(mapping, 
> groups.getGroups(user).get(0));
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868

2020-03-19 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10198:

Attachment: YARN-10198-002.patch

> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> 
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10198-001.patch, YARN-10198-002.patch
>
>
> YARN-9868 introduced an unnecessary check if we have the following placement 
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary:
> {noformat}
>   } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
> if (this.queueManager
> .getQueue(groups.getGroups(user).get(0)) != null) {
>   return getPlacementContext(mapping,
>   groups.getGroups(user).get(0));
> } else {
>   return null;
> }
> {noformat}
> We should revert this part to the original version:
> {noformat}
>   } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
> return getPlacementContext(mapping, 
> groups.getGroups(user).get(0));
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10199) Simplify UserGroupMappingPlacementRule#getPlacementForUser

2020-03-19 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10199:

Attachment: YARN-10199.003.patch

> Simplify UserGroupMappingPlacementRule#getPlacementForUser
> --
>
> Key: YARN-10199
> URL: https://issues.apache.org/jira/browse/YARN-10199
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Minor
> Attachments: YARN-10199.001.patch, YARN-10199.002.patch, 
> YARN-10199.003.patch
>
>
> The UserGroupMappingPlacementRule#getPlacementForUser method, which is mainly 
> responsible for queue naming, contains deeply nested branches. In order to 
> provide an extendable mapping logic, the branches could be flattened and 
> simplified.
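
A minimal before/after sketch of the kind of flattening meant here (purely 
illustrative, not the actual method):

{code}
final class FlattenSketch {
  // Before: nested branches, each level adding indentation.
  static String placeNested(boolean mapped, boolean managed) {
    if (mapped) {
      if (managed) {
        return "parent.leaf";
      } else {
        return "leaf";
      }
    } else {
      return "default";
    }
  }

  // After: a guard clause plus an early return; same behavior, one level deep.
  static String placeFlat(boolean mapped, boolean managed) {
    if (!mapped) {
      return "default";
    }
    return managed ? "parent.leaf" : "leaf";
  }
}
{code}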



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10179) Queue mapping based on group id passed through application tag

2020-03-19 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10179:

Attachment: YARN-10179.001.patch

> Queue mapping based on group id passed through application tag
> --
>
> Key: YARN-10179
> URL: https://issues.apache.org/jira/browse/YARN-10179
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Szilard Nemeth
>Assignee: Andras Gyori
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10179.001.patch
>
>
> There are situations when the real submitting user differs from the user that 
> arrives at YARN. For example, in the case of a Hive application when Hive 
> impersonation is turned off, the hive queries will run as the hive user and the 
> mapping is done based on that user's group. 
> Unfortunately, in this case YARN doesn't have any information about the real 
> user, and there are cases when the customer may want to map these applications 
> to the real submitting user's queue (based on the group id) instead of the 
> Hive queue.
> For these cases, if they pass the group id (or name) in the application 
> tag, we may read it and use it during the queue mapping, provided that user has 
> rights to run on the real user's queue.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10179) Queue mapping based on group id passed through application tag

2020-03-19 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10179:

Attachment: (was: YARN-10179.001.patch)

> Queue mapping based on group id passed through application tag
> --
>
> Key: YARN-10179
> URL: https://issues.apache.org/jira/browse/YARN-10179
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Reporter: Szilard Nemeth
>Assignee: Andras Gyori
>Priority: Major
> Fix For: 3.3.0
>
>
> There are situations when the real submitting user differs from the user that 
> arrives at YARN. For example, in the case of a Hive application when Hive 
> impersonation is turned off, the hive queries will run as the hive user and the 
> mapping is done based on that user's group. 
> Unfortunately, in this case YARN doesn't have any information about the real 
> user, and there are cases when the customer may want to map these applications 
> to the real submitting user's queue (based on the group id) instead of the 
> Hive queue.
> For these cases, if they pass the group id (or name) in the application 
> tag, we may read it and use it during the queue mapping, provided that user has 
> rights to run on the real user's queue.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-03-19 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062438#comment-17062438
 ] 

Prabhu Joseph commented on YARN-10160:
--

Thanks [~sunilg] for reviewing. I have modified the testcase to test the new 
change in [^YARN-10160-004.patch].

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity.<queue-path>.auto-create-child-queue.enabled
> yarn.scheduler.capacity.<queue-path>.leaf-queue-template.<property>
> {code}
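
For illustration, the queue info payload could then surface these settings 
roughly as follows (the field names are assumptions, not the final patch):

{code}
"queue": {
  "queueName": "parent1",
  "autoCreateChildQueueEnabled": true,
  "leafQueueTemplate": {
    "property": [
      { "name": "leaf-queue-template.capacity", "value": "50" }
    ]
  }
}
{code}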



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-03-19 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10160:
-
Attachment: YARN-10160-004.patch

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, 
> YARN-10160-004.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity.<queue-path>.auto-create-child-queue.enabled
> yarn.scheduler.capacity.<queue-path>.leaf-queue-template.<property>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10034) Allocation tags are not removed when node decommission

2020-03-19 Thread kyungwan nam (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062421#comment-17062421
 ] 

kyungwan nam commented on YARN-10034:
-

[~prabhujoseph], [~adam.antal]
Thank you for the review and commit!

> Allocation tags are not removed when node decommission
> --
>
> Key: YARN-10034
> URL: https://issues.apache.org/jira/browse/YARN-10034
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10034.001.patch, YARN-10034.002.patch, 
> YARN-10034.003.patch
>
>
> When a node is decommissioned, allocation tags that are attached to the node 
> are not removed.
> I could see that allocation tags are revived when recommissioning the node.
> The RM removes allocation tags only once the NM confirms the container 
> releases (YARN-8511), but a decommissioned NM does not connect to the RM anymore.
> Once a node is decommissioned, allocation tags that are attached to the node 
> should be removed immediately.
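
A hedged sketch of the fix's shape: when the scheduler removes a node, release 
the tags for the containers still tracked on it ({{releaseTagsFor}} is a 
hypothetical stand-in for the allocation tags manager API):

{code}
import java.util.List;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode;

// Sketch only: on node removal, drop the allocation tags of every container
// still on the node, instead of waiting for an NM confirmation that a
// decommissioned node will never send.
final class DecommissionTagCleanupSketch {
  interface TagReleaser {               // stand-in for the tags manager
    void releaseTagsFor(NodeId nodeId, ContainerId containerId);
  }

  static void onNodeRemoved(SchedulerNode node, TagReleaser releaser) {
    List<RMContainer> running = node.getCopiedListOfRunningContainers();
    for (RMContainer container : running) {
      releaser.releaseTagsFor(node.getNodeID(), container.getContainerId());
    }
  }
}
{code}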



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868

2020-03-19 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062414#comment-17062414
 ] 

Peter Bacsko commented on YARN-10198:
-

[~sunilg] the patch needs to be extended. In its current form, it's not enough. 
I'm going to attach a new version today.

> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> 
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10198-001.patch
>
>
> YARN-9868 introduced an unnecessary check if we have the following placement 
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary:
> {noformat}
>   } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
> if (this.queueManager
> .getQueue(groups.getGroups(user).get(0)) != null) {
>   return getPlacementContext(mapping,
>   groups.getGroups(user).get(0));
> } else {
>   return null;
> }
> {noformat}
> We should revert this part to the original version:
> {noformat}
>   } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
> return getPlacementContext(mapping, 
> groups.getGroups(user).get(0));
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9088) Non-exclusive labels break QueueMetrics

2020-03-19 Thread Anuj (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062390#comment-17062390
 ] 

Anuj commented on YARN-9088:


In our setup we are facing a similar issue, in which the global view of pending 
and available resources gets messed up.

> Non-exclusive labels break QueueMetrics
> ---
>
> Key: YARN-9088
> URL: https://issues.apache.org/jira/browse/YARN-9088
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, resourcemanager
>Affects Versions: 2.8.5
>Reporter: Brandon Scheller
>Priority: Major
>  Labels: metrics, nodelabel
>
> QueueMetrics are broken (random/negative values) when non-exclusive labels 
> are being used and unlabeled containers run on labeled nodes.
> This is caused by the change in the patch here:
> https://issues.apache.org/jira/browse/YARN-6467
> It assumes that a container's label will be the same as the node's label that 
> it is running on.
> If you look within the patch, sometimes metrics are updated using 
> request.getNodeLabelExpression(), and sometimes they are updated using 
> node.getPartition().
> This means that in the case where the node is labeled while the container 
> request isn't, these metrics only get updated when referring to the default 
> queue. This stops metrics from balancing out and results in incorrect and 
> negative values in QueueMetrics. 
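
For illustration, the asymmetry can be reduced to a tiny model (simplified; the 
two strings stand in for request.getNodeLabelExpression() and 
node.getPartition()):

{code}
import java.util.HashMap;
import java.util.Map;

// Simplified model of the bug: pending metrics are incremented under the
// request's label ("" for unlabeled requests) but decremented under the
// node's partition, so the buckets never balance out.
public final class PartitionMetricsSketch {
  private final Map<String, Integer> pendingByPartition = new HashMap<>();

  void onResourceRequest(String requestLabel) {
    pendingByPartition.merge(requestLabel, 1, Integer::sum);
  }

  void onAllocation(String nodePartition) {
    pendingByPartition.merge(nodePartition, -1, Integer::sum);
  }

  public static void main(String[] args) {
    PartitionMetricsSketch m = new PartitionMetricsSketch();
    m.onResourceRequest("");      // unlabeled container request
    m.onAllocation("labeled");    // lands on a labeled node (non-exclusive)
    System.out.println(m.pendingByPartition); // e.g. {=1, labeled=-1}: skewed
  }
}
{code}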



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10034) Allocation tags are not removed when node decommission

2020-03-19 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062349#comment-17062349
 ] 

Hudson commented on YARN-10034:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18066 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18066/])
YARN-10034. Remove Allocation Tags from released container from (pjoseph: rev 
f2d3ac2a3f27a849e00f529c5c2df6ef0bd82911)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java


> Allocation tags are not removed when node decommission
> --
>
> Key: YARN-10034
> URL: https://issues.apache.org/jira/browse/YARN-10034
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10034.001.patch, YARN-10034.002.patch, 
> YARN-10034.003.patch
>
>
> When a node is decommissioned, allocation tags that are attached to the node 
> are not removed.
> I could see that allocation tags are revived when recommissioning the node.
> The RM removes allocation tags only once the NM confirms the container 
> releases (YARN-8511), but a decommissioned NM does not connect to the RM anymore.
> Once a node is decommissioned, allocation tags that are attached to the node 
> should be removed immediately.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10034) Allocation tags are not removed when node decommission

2020-03-19 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062343#comment-17062343
 ] 

Prabhu Joseph commented on YARN-10034:
--

Thanks [~kyungwan nam] for the patch and [~adam.antal] for the review.

I have committed [^YARN-10034.003.patch] to trunk.

> Allocation tags are not removed when node decommission
> --
>
> Key: YARN-10034
> URL: https://issues.apache.org/jira/browse/YARN-10034
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Major
> Attachments: YARN-10034.001.patch, YARN-10034.002.patch, 
> YARN-10034.003.patch
>
>
> When a node is decommissioned, allocation tags that are attached to the node 
> are not removed.
> I could see that allocation tags are revived when recommissioning the node.
> The RM removes allocation tags only once the NM confirms the container 
> releases (YARN-8511), but a decommissioned NM does not connect to the RM anymore.
> Once a node is decommissioned, allocation tags that are attached to the node 
> should be removed immediately.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo

2020-03-19 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062322#comment-17062322
 ] 

Sunil G commented on YARN-10160:


[~prabhujoseph] pls add test cases for the new change. 

> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo
> --
>
> Key: YARN-10160
> URL: https://issues.apache.org/jira/browse/YARN-10160
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, 
> YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch
>
>
> Add auto queue creation related configs to 
> RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity.<queue-path>.auto-create-child-queue.enabled
> yarn.scheduler.capacity.<queue-path>.leaf-queue-template.<property>
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868

2020-03-19 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062320#comment-17062320
 ] 

Sunil G commented on YARN-10198:


[~prabhujoseph] [~maniraj...@gmail.com] [~pbacsko] Do we have a consensus on 
the attached patch and the approach?

I think the patch looks fine. Thoughts?

> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> 
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-10198-001.patch
>
>
> YARN-9868 introduced an unnecessary check if we have the following placement 
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist. 
> However, there is this validation code which is not necessary:
> {noformat}
>   } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
> if (this.queueManager
> .getQueue(groups.getGroups(user).get(0)) != null) {
>   return getPlacementContext(mapping,
>   groups.getGroups(user).get(0));
> } else {
>   return null;
> }
> {noformat}
> We should revert this part to the original version:
> {noformat}
>   } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
> return getPlacementContext(mapping, 
> groups.getGroups(user).get(0));
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections

2020-03-19 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062318#comment-17062318
 ] 

Sunil G commented on YARN-10194:


+1 for this change.

I'll commit shortly.

> YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
> 
>
> Key: YARN-10194
> URL: https://issues.apache.org/jira/browse/YARN-10194
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.3.0
>Reporter: Akhil PB
>Assignee: Prabhu Joseph
>Priority: Critical
> Attachments: YARN-10194-001.patch
>
>
> YARN RMWebServices /scheduler-conf/validate leaks ZK connections. The 
> validation API creates a new CapacityScheduler and fails to close it after the 
> validation. Every CapacityScheduler#init opens a MutableCSConfigurationProvider, 
> which opens a ZKConfigurationStore and creates a ZK connection. 
> *ZK LOGS*
> {code}
> -03-12 16:45:51,881 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 
> times] Error accepting new connection: Too many connections from 
> /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,449 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new 
> connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,710 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new 
> connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,876 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting 
> new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:53,068 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting 
> new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:53,391 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting 
> new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,008 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new 
> connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,287 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new 
> connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,483 WARN 
> org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting 
> new connection: Too many connections from /172.27.99.64 - max is 60
> {code}
> And there is another bug in ZKConfigurationStore, which does not handle 
> close() of ZKCuratorManager.
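
The shape of the fix would be to always tear the throwaway scheduler down after 
validation, something like this (a sketch: the service lifecycle calls are 
standard, but the surrounding wiring, RMContext and so on, is omitted):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;

// Sketch: validate a proposed configuration with a throwaway scheduler and
// always stop it, so the ZKConfigurationStore connection opened by init()
// is released even when validation fails.
final class ValidateConfSketch {
  static void validate(Configuration proposedConf) {
    CapacityScheduler cs = new CapacityScheduler();
    try {
      cs.init(proposedConf); // opens the configuration store (ZK connection)
      // ... run validation against cs here ...
    } finally {
      cs.stop();             // service stop closes what init opened
    }
  }
}
{code}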



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS

2020-03-19 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062315#comment-17062315
 ] 

Sunil G commented on YARN-9879:
---

[~shuzirra] could you please check the latest errors?

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
>  Labels: fs2cs
> Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, 
> YARN-9879.014.patch, YARN-9879.POC001.patch, YARN-9879.POC002.patch, 
> YARN-9879.POC003.patch, YARN-9879.POC004.patch, YARN-9879.POC005.patch, 
> YARN-9879.POC006.patch, YARN-9879.POC007.patch, YARN-9879.POC008.patch, 
> YARN-9879.POC009.patch, YARN-9879.POC010.patch, YARN-9879.POC011.patch, 
> YARN-9879.POC012.patch, YARN-9879.POC013.patch
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being made; I'll attach them as soon as 
> they're done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org