[jira] [Commented] (YARN-7620) Allow node partition filters on Queues page of new YARN UI
[ https://issues.apache.org/jira/browse/YARN-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325972#comment-16325972 ]

ASF GitHub Bot commented on YARN-7620:
--------------------------------------

Github user skmvasu commented on the issue:

    https://github.com/apache/hadoop/pull/310

    This is closed already

> Allow node partition filters on Queues page of new YARN UI
> ----------------------------------------------------------
>
>          Key: YARN-7620
>          URL: https://issues.apache.org/jira/browse/YARN-7620
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: yarn-ui-v2
>     Reporter: Vasudevan Skm
>     Assignee: Vasudevan Skm
>     Priority: Major
>      Fix For: 3.1.0
>
>  Attachments: YARN-7620.001.patch, YARN-7620.002.patch, YARN-7620.003.patch, YARN-7620.004.patch
>
> Allow users to filter their queues based on node labels

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7620) Allow node partition filters on Queues page of new YARN UI
[ https://issues.apache.org/jira/browse/YARN-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325973#comment-16325973 ]

ASF GitHub Bot commented on YARN-7620:
--------------------------------------

Github user skmvasu closed the pull request at:

    https://github.com/apache/hadoop/pull/310
[jira] [Commented] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325961#comment-16325961 ]

genericqa commented on YARN-6619:
---------------------------------

-1 overall

| Vote | Subsystem | Runtime | Comment |
| 0 | reexec | 0m 19s | Docker mode activated. |

Prechecks:
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |

YARN-6592 Compile Tests:
| 0 | mvndep | 0m 52s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 18s | YARN-6592 passed |
| +1 | compile | 7m 20s | YARN-6592 passed |
| +1 | checkstyle | 1m 3s | YARN-6592 passed |
| +1 | mvnsite | 1m 23s | YARN-6592 passed |
| +1 | shadedclient | 12m 14s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 52s | YARN-6592 passed |
| +1 | javadoc | 0m 59s | YARN-6592 passed |

Patch Compile Tests:
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 6m 19s | the patch passed |
| +1 | javac | 6m 19s | the patch passed |
| -0 | checkstyle | 0m 58s | hadoop-yarn-project/hadoop-yarn: The patch generated 28 new + 205 unchanged - 8 fixed = 233 total (was 213) |
| +1 | mvnsite | 1m 20s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 21s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 0s | the patch passed |
| +1 | javadoc | 0m 58s | the patch passed |

Other Tests:
| +1 | unit | 67m 0s | hadoop-yarn-server-resourcemanager in the patch passed. |
| -1 | unit | 24m 33s | hadoop-yarn-client in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
|    |      | 155m 57s | |

Failed junit tests:
  hadoop.yarn.client.api.impl.TestOpportunisticContainerAllocationE2E
  hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart

Subsystem report/notes:
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6619 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12906060/YARN-6619-YARN-6592.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 23c8f2ebc664 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-7479) TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk
[ https://issues.apache.org/jira/browse/YARN-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325953#comment-16325953 ]

Akira Ajisaka commented on YARN-7479:
-------------------------------------

Thanks [~rkanter] for the review. Changed the interval to 500ms.

> TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk
> ------------------------------------------------------------------------
>
>          Key: YARN-7479
>          URL: https://issues.apache.org/jira/browse/YARN-7479
>      Project: Hadoop YARN
>   Issue Type: Bug
>   Components: test
>     Reporter: Botong Huang
>     Assignee: Akira Ajisaka
>     Priority: Major
>  Attachments: YARN-7479.001.patch, YARN-7479.002.patch, YARN-7479.003.patch, YARN-7479.004.patch
>
> Was waiting for container_1_0001_01_00 to get to state COMPLETE but was in state RUNNING after the timeout
> java.lang.AssertionError: Was waiting for container_1_0001_01_00 to get to state COMPLETE but was in state RUNNING after the timeout
>     at org.junit.Assert.fail(Assert.java:88)
>     at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:431)
>     at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360)
>     at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:171)
> Pasting some exception messages from the test run here:
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN]
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
>     at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Given NMToken for application : appattempt_1_0001_01 seems to have been generated illegally.
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Given NMToken for application : appattempt_1_0001_01 is not valid for current node manager. expected : localhost:46649 found : InvalidHost:1234
>     at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
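The fix above changes the polling interval in a wait-for-state loop. As a rough illustration (not the actual test code; the class and method names here are made up), such a wait helper typically looks like this:

```java
import java.util.function.Supplier;

// Illustrative sketch of the kind of poll-until-true helper used by tests such as
// waitForContainerToFinishOnNM. Names are hypothetical, not Hadoop's actual API.
public class WaitFor {
    /** Polls {@code check} every {@code intervalMs} until it returns true or {@code timeoutMs} elapses. */
    public static boolean waitFor(Supplier<Boolean> check, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.get()) {
                return true;
            }
            Thread.sleep(intervalMs); // e.g. a 500 ms interval, as in the patch
        }
        return check.get(); // one last check at the deadline
    }
}
```

A shorter interval makes the check more responsive relative to the overall timeout, which is one common way to de-flake tests that wait for a container state transition.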
[jira] [Updated] (YARN-7479) TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk
[ https://issues.apache.org/jira/browse/YARN-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated YARN-7479:
--------------------------------
    Attachment: YARN-7479.004.patch
[jira] [Created] (YARN-7750) Render time in the users timezone
Vasudevan Skm created YARN-7750:
-----------------------------------

             Summary: Render time in the users timezone
                 Key: YARN-7750
                 URL: https://issues.apache.org/jira/browse/YARN-7750
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn-ui-v2
         Environment: Render time in the users timezone / a predefined TZ
            Reporter: Vasudevan Skm
            Assignee: Vasudevan Skm
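YARN timestamps (e.g. application start and finish times) come back from the REST APIs as epoch milliseconds, so rendering them in the user's zone is purely a display-side conversion. A minimal sketch in Java (the UI itself is JavaScript; the class and method names here are illustrative only):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Illustrative: format an epoch-millis timestamp in a caller-chosen time zone
// (the user's zone or a predefined TZ) instead of the server's default zone.
public class RenderTime {
    static String render(long epochMillis, String zoneId) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        return fmt.format(Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId)));
    }
}
```

The same idea applies in the browser: keep the wire format as epoch millis and pick the zone only at render time.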
[jira] [Commented] (YARN-7749) [UI2] GPU information tab in left hand side disappears when we click other tabs below
[ https://issues.apache.org/jira/browse/YARN-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325934#comment-16325934 ]

ASF GitHub Bot commented on YARN-7749:
--------------------------------------

Github user skmvasu commented on the issue:

    https://github.com/apache/hadoop/pull/327

    @sunilgovind

> [UI2] GPU information tab in left hand side disappears when we click other tabs below
> -------------------------------------------------------------------------------------
>
>          Key: YARN-7749
>          URL: https://issues.apache.org/jira/browse/YARN-7749
>      Project: Hadoop YARN
>   Issue Type: Bug
>     Reporter: Sumana Sathish
>     Assignee: Vasudevan Skm
>     Priority: Major
>
> The 'GPU Information' tab on the left side of the Node Manager page disappears when we click the 'List of Applications' or 'List of Containers' tab. Once we click the 'Node Information' tab, it reappears.
[jira] [Created] (YARN-7749) [UI2] GPU information tab in left hand side disappears when we click other tabs below
Rohith Sharma K S created YARN-7749:
------------------------------------

             Summary: [UI2] GPU information tab in left hand side disappears when we click other tabs below
                 Key: YARN-7749
                 URL: https://issues.apache.org/jira/browse/YARN-7749
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Sumana Sathish
            Assignee: Vasudevan Skm

The 'GPU Information' tab on the left side of the Node Manager page disappears when we click the 'List of Applications' or 'List of Containers' tab. Once we click the 'Node Information' tab, it reappears.
[jira] [Comment Edited] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks
[ https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325928#comment-16325928 ]

Steven Rand edited comment on YARN-7655 at 1/15/18 6:49 AM:
------------------------------------------------------------

Thanks [~yufeigu] for taking a look.

The cluster sizes and nodes should be pretty reasonable: for the three clusters I have in mind, the nodes are AWS EC2 instances with around 120 GB of RAM and around 20 vcores. The clusters range in size from double digits to low triple digits.

That said, there is some configuration in place at these clusters which could explain high rates of AM preemption. Specifically:

* The default max AM share is set to -1. Unfortunately the max AM share feature, while totally reasonable as far as I can tell, was causing a good deal of confusion when apps would fail to start for no apparent reason upon hitting the limit, and we disabled it in the hope that having one less variable would make the scheduler's behavior easier to understand.
* The default fair share preemption threshold is set to 1.0. This was also an attempt to reduce confusion, as failure to preempt while below fair share (but above fair share * the threshold) was commonly misinterpreted as a bug.
* The preemption timeouts for fair share and min share are also non-default: they're set to one second each.

Possibly the configuration overrides, along with access patterns that include apps frequently starting up or increasing their demand via Spark's dynamic allocation feature, are the issue here, in which case we don't need to pursue this JIRA further. Data on whether or not other YARN deployments experience this issue would be useful, though not easy to come by, as I had to add custom logging to identify NODE_LOCAL requests as the cause of most AM preemptions at these clusters.
> avoid AM preemption caused by RRs for specific nodes or racks
> -------------------------------------------------------------
>
>              Key: YARN-7655
>              URL: https://issues.apache.org/jira/browse/YARN-7655
>          Project: Hadoop YARN
>       Issue Type: Improvement
>       Components: fairscheduler
> Affects Versions: 3.0.0
>         Reporter: Steven Rand
>         Assignee: Steven Rand
>         Priority: Major
>      Attachments: YARN-7655-001.patch
>
> We frequently see AM preemptions when {{starvedApp.getStarvedResourceRequests()}} in {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs that request containers on a specific node. Since this causes us to only consider one node to preempt containers on, the really good work that was done in YARN-5830 doesn't save us from AM preemption. Even though there might be multiple nodes on which we could preempt enough non-AM containers to satisfy the app's starvation, we often wind up preempting one or more AM containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM containers for an RR that specifies a node or rack, then we should instead expand the search space to consider all nodes. That way we take advantage of YARN-5830, and only preempt AMs if there's no alternative. I've attached a patch with an initial
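For reference, the non-default FairScheduler settings described in the comment above would live in the allocation file (fair-scheduler.xml). A sketch of that configuration, assuming the standard allocation-file element names, with the values taken from the comment:

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Disable the max AM share check entirely (-1 disables it; the default is 0.5) -->
  <queueMaxAMShareDefault>-1.0</queueMaxAMShareDefault>
  <!-- Only consider an app starved when below its full fair share (default 0.5) -->
  <defaultFairSharePreemptionThreshold>1.0</defaultFairSharePreemptionThreshold>
  <!-- Aggressive one-second preemption timeouts (values are in seconds) -->
  <defaultFairSharePreemptionTimeout>1</defaultFairSharePreemptionTimeout>
  <defaultMinSharePreemptionTimeout>1</defaultMinSharePreemptionTimeout>
</allocations>
```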
[jira] [Commented] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks
[ https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325928#comment-16325928 ]

Steven Rand commented on YARN-7655:
-----------------------------------

Thanks [~yufeigu] for taking a look.

The cluster sizes and nodes should be pretty reasonable: for the three clusters I have in mind, the nodes are AWS EC2 instances with around 120 GB of RAM and around 20 vcores. The clusters range in size from double digits to low triple digits.

That said, there is some configuration in place at these clusters which could explain high rates of AM preemption. Specifically:

* The default max AM share is set to -1. Unfortunately the max AM share feature, while totally reasonable as far as I can tell, was causing a good deal of confusion when apps would fail to start for no apparent reason upon hitting the limit, and we disabled it in the hope that having one less variable would make the scheduler's behavior easier to understand.
* The default fair share preemption threshold is set to 1.0. This was also an attempt to reduce confusion, as failure to preempt while below fair share (but above fair share * the threshold) was commonly misinterpreted as a bug.
* The preemption timeouts for fair share and min share are also non-default: they're set to one second each.

Possibly the configuration overrides, along with access patterns that include apps frequently starting up or increasing their demand via Spark's dynamic allocation feature, are the issue here, in which case we don't need to pursue this JIRA further. Data on whether or not other YARN deployments experience this issue would be useful, though not easy to come by, as I had to add custom logging to identify NODE_LOCAL requests as the cause of most AM preemptions at these clusters.

> avoid AM preemption caused by RRs for specific nodes or racks
> -------------------------------------------------------------
>
>              Key: YARN-7655
>              URL: https://issues.apache.org/jira/browse/YARN-7655
>          Project: Hadoop YARN
>       Issue Type: Improvement
>       Components: fairscheduler
> Affects Versions: 3.0.0
>         Reporter: Steven Rand
>         Assignee: Steven Rand
>         Priority: Major
>      Attachments: YARN-7655-001.patch
>
> We frequently see AM preemptions when {{starvedApp.getStarvedResourceRequests()}} in {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs that request containers on a specific node. Since this causes us to only consider one node to preempt containers on, the really good work that was done in YARN-5830 doesn't save us from AM preemption. Even though there might be multiple nodes on which we could preempt enough non-AM containers to satisfy the app's starvation, we often wind up preempting one or more AM containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM containers for an RR that specifies a node or rack, then we should instead expand the search space to consider all nodes. That way we take advantage of YARN-5830, and only preempt AMs if there's no alternative. I've attached a patch with an initial implementation of this. We've been running it on a few clusters, and have seen AM preemptions drop from double-digit occurrences on many days to zero.
> Of course, the tradeoff is some loss of locality, since the starved app is less likely to be allocated resources at the most specific locality level that it asked for. My opinion is that this tradeoff is worth it, but interested to hear what others think as well.
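The proposed behavior (falling back from a node-specific request to a cluster-wide search rather than preempting AM containers) can be sketched roughly as follows. This is a toy model, not the actual {{FSPreemptionThread}} code, and every name in it is made up:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model of the proposed preemption change: if a resource request names a
// specific node and that node can only be satisfied by preempting AM containers,
// widen the search to all nodes before resorting to AM preemption.
public class PreemptionSketch {
    record Container(boolean isAM, int memoryMb) {}

    // Find any node whose non-AM containers alone can cover the demand.
    static Optional<String> nodeSatisfiableWithoutAMs(Map<String, List<Container>> nodes, int demandMb) {
        return nodes.entrySet().stream()
                .filter(e -> e.getValue().stream()
                        .filter(c -> !c.isAM())
                        .mapToInt(Container::memoryMb)
                        .sum() >= demandMb)
                .map(Map.Entry::getKey)
                .findFirst();
    }

    static Optional<String> chooseNode(Map<String, List<Container>> nodes, String requestedNode, int demandMb) {
        Map<String, List<Container>> scope = (requestedNode == null)
                ? nodes
                : Map.of(requestedNode, nodes.getOrDefault(requestedNode, List.of()));
        Optional<String> pick = nodeSatisfiableWithoutAMs(scope, demandMb);
        if (pick.isEmpty() && requestedNode != null) {
            // Trade locality for AM safety: consider every node, as in YARN-5830.
            pick = nodeSatisfiableWithoutAMs(nodes, demandMb);
        }
        return pick;
    }
}
```

The fallback step is where the locality tradeoff discussed in the description happens: the request may be satisfied on a different node than the one it asked for, but no AM container is preempted while an alternative exists.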
[jira] [Commented] (YARN-7746) Minor fixes to PlacementProcessor to support app priority
[ https://issues.apache.org/jira/browse/YARN-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325911#comment-16325911 ]

Arun Suresh commented on YARN-7746:
-----------------------------------

Updated the JIRA to track just the fix needed for the processor to respect application priority. The other bug fix will be included in YARN-6619, since it is needed for end-to-end functionality to work, while this JIRA can be tackled independently.

> Minor fixes to PlacementProcessor to support app priority
> ---------------------------------------------------------
>
>          Key: YARN-7746
>          URL: https://issues.apache.org/jira/browse/YARN-7746
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>     Reporter: Arun Suresh
>     Assignee: Arun Suresh
>     Priority: Major
>
> The Threadpools used in the Processor should be modified to take a priority blocking queue that respects application priority.
[jira] [Updated] (YARN-7746) Minor fixes to PlacementProcessor to support app priority
[ https://issues.apache.org/jira/browse/YARN-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh updated YARN-7746:
------------------------------
    Summary: Minor fixes to PlacementProcessor to support app priority  (was: Minor bug fixes to PlacementConstraintUtils and PlacementProcessor to support app priority)

> Minor fixes to PlacementProcessor to support app priority
> ---------------------------------------------------------
>
>          Key: YARN-7746
>          URL: https://issues.apache.org/jira/browse/YARN-7746
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>     Reporter: Arun Suresh
>     Assignee: Arun Suresh
>     Priority: Major
>
> JIRA opened to track 2 minor fixes.
> The PlacementConstraintsUtil does a scope check using object equality and not string equality, which causes some tests to pass, but it really fails in an actual deployment.
> The Threadpools used in the Processor should be modified to take a priority blocking queue that respects application priority.
[jira] [Updated] (YARN-7746) Minor fixes to PlacementProcessor to support app priority
[ https://issues.apache.org/jira/browse/YARN-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh updated YARN-7746:
------------------------------
    Description: The Threadpools used in the Processor should be modified to take a priority blocking queue that respects application priority.  (was: JIRA opened to track 2 minor fixes. The PlacementConstraintsUtil does a scope check using object equality and not string equality, which causes some tests to pass, but it really fails in an actual deployment. The Threadpools used in the Processor should be modified to take a priority blocking queue that respects application priority.)
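The change described, thread pools whose work queues respect application priority, can be sketched with a {{PriorityBlockingQueue}}-backed {{ThreadPoolExecutor}}. This is a generic illustration, not the actual PlacementProcessor code. One known pitfall: tasks must be enqueued via {{execute()}} with a Comparable Runnable, because {{submit()}} wraps tasks in a {{FutureTask}} that is not Comparable and would trigger a {{ClassCastException}} in the priority queue.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Generic sketch: a fixed-size pool whose pending tasks are dequeued by priority
// (higher value first), roughly what "a priority blocking queue that respects
// application priority" means for the processor's thread pools.
public class PriorityPool {
    public static class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        final int priority;
        final Runnable body;

        public PrioritizedTask(int priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override public void run() { body.run(); }

        // Reverse natural order so that higher-priority tasks are dequeued first.
        @Override public int compareTo(PrioritizedTask other) {
            return Integer.compare(other.priority, this.priority);
        }
    }

    public static ThreadPoolExecutor create(int threads) {
        // With an unbounded PriorityBlockingQueue the pool never grows past
        // corePoolSize, so core and max sizes are kept equal.
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new PriorityBlockingQueue<>());
    }
}
```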
[jira] [Comment Edited] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325891#comment-16325891 ]

Arun Suresh edited comment on YARN-6619 at 1/15/18 5:00 AM:
------------------------------------------------------------

Attaching initial patch.

* Modified the {{TestAMRMClient}} testcase to refactor out some common code to a separate class.
* The test uses the async AMRMClient; that way both the async client and the normal client are tested, since the async client uses the normal client internally.
* The general flow is: if the {{PlacementProcessor}} is enabled, the client has to call {{addPlacementConstraint}} before registering with the RM. The client can then use {{addSchedulingRequests(Collection)}} to add a collection of scheduling requests. All scheduling requests in the collection are guaranteed to be sent in the same allocate call.
* The patch also has some minor bug fixes to ensure that things work fine end to end.

was (Author: asuresh):
Attaching initial patch.

* Modified the {{TestAMRMClient}} testcase to refactor out some common code to a separate class.
* The test uses the asyncAMRMClient, since that way both the async client and the normal client will be tested, since the async client uses the normal client internally.
* The general flow is, if the {{PlacementProcessor}} is enabled, the client has to call {{addPlacecmentConstraint}} before registering with the RM. The client can then use {{addSchedulingRequests}} to add a collection of scheduling requests. All scheduling requests in the one call to the above method are guaranteed to be sent in the same allocate call.
* The patch also has some minor bug fixes to ensure end 2 end, things work fine.
> AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
> --------------------------------------------------------------------------------
>
>          Key: YARN-6619
>          URL: https://issues.apache.org/jira/browse/YARN-6619
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>     Reporter: Arun Suresh
>     Assignee: Arun Suresh
>     Priority: Major
>  Attachments: YARN-6619-YARN-6592.001.patch
>
> Opening this JIRA to track changes needed in the AMRMClient to incorporate the PlacementConstraint and SchedulingRequest objects
[jira] [Updated] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects
[ https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh updated YARN-6619:
------------------------------
    Attachment: YARN-6619-YARN-6592.001.patch
[jira] [Comment Edited] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section
[ https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325674#comment-16325674 ] Szilard Nemeth edited comment on YARN-7451 at 1/14/18 7:54 PM: --- The new patch uses reflection calls to ServiceFinder, so we no longer have license issues with the original Jersey ServiceFinder. I looked around and found some questions and issues about migrating to Jersey 2 on Stack Overflow, so I suppose it is not trivial to implement. Anyway, I will create a separate task for the upgrade, since that is the way to go in the long term. UPDATE: the Yetus test result below seems unrelated to my change. was (Author: snemeth): The new patch uses reflection calls to ServiceFinder, so we no longer have license issues with the original Jersey ServiceFinder. I looked around and found some questions and issues about migrating to Jersey 2 on Stack Overflow, so I suppose it is not trivial to implement. Anyway, I will create a separate task for the upgrade, since that is the way to go in the long term. > Resources Types should be visible in the Cluster Apps API "resourceRequests" > section > > > Key: YARN-7451 > URL: https://issues.apache.org/jira/browse/YARN-7451 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, restapi >Affects Versions: 3.0.0 >Reporter: Grant Sohn >Assignee: Szilard Nemeth > Attachments: YARN-7451.001.patch, YARN-7451.002.patch, > YARN-7451.003.patch, YARN-7451.004.patch, YARN-7451.005.patch, > YARN-7451.006.patch, YARN-7451.007.patch, > YARN-7451__Expose_custom_resource_types_on_RM_scheduler_API_as_flattened_map01_02.patch > > > When running jobs that request resource types the RM Cluster Apps API should > include this in the "resourceRequests" object. 
> Additionally, when calling the RM scheduler API it returns: > {noformat} > "childQueues": { > "queue": [ > { > "allocatedContainers": 101, > "amMaxResources": { > "memory": 320390, > "vCores": 192 > }, > "amUsedResources": { > "memory": 1024, > "vCores": 1 > }, > "clusterResources": { > "memory": 640779, > "vCores": 384 > }, > "demandResources": { > "memory": 103424, > "vCores": 101 > }, > "fairResources": { > "memory": 640779, > "vCores": 384 > }, > "maxApps": 2147483647, > "maxResources": { > "memory": 640779, > "vCores": 384 > }, > "minResources": { > "memory": 0, > "vCores": 0 > }, > "numActiveApps": 1, > "numPendingApps": 0, > "preemptable": true, > "queueName": "root.users.systest", > "reservedContainers": 0, > "reservedResources": { > "memory": 0, > "vCores": 0 > }, > "schedulingPolicy": "fair", > "steadyFairResources": { > "memory": 320390, > "vCores": 192 > }, > "type": "fairSchedulerLeafQueueInfo", > "usedResources": { >
[jira] [Commented] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section
[ https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325733#comment-16325733 ] genericqa commented on YARN-7451: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 46s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 34 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 38s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 38s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-client-modules/hadoop-client-minicluster hadoop-client-modules/hadoop-client-check-test-invariants {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 30s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 36s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 57s{color} | {color:orange} root: The patch generated 31 new + 156 unchanged - 19 fixed = 187 total (was 175) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 6s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-project hadoop-client-modules/hadoop-client-minicluster hadoop-client-modules/hadoop-client-check-test-invariants {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 17s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 32s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 16s{color} | {color:green} hadoop-project in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 35s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s{color} | {color:green} hadoop-client-minicluster in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 16s{color} | {color:green}
[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 beta release
[ https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325683#comment-16325683 ] Ted Yu commented on YARN-7346: -- New RC for hbase 2 beta1 has been posted. FYI > Fix compilation errors against hbase2 beta release > -- > > Key: YARN-7346 > URL: https://issues.apache.org/jira/browse/YARN-7346 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Vrushali C > Attachments: YARN-7346.00.patch, YARN-7346.01.patch, > YARN-7346.prelim1.patch, YARN-7346.prelim2.patch, YARN-7581.prelim.patch > > > When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, > I got the following errors: > https://pastebin.com/Ms4jYEVB > This issue is to fix the compilation errors.
[jira] [Comment Edited] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section
[ https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325674#comment-16325674 ] Szilard Nemeth edited comment on YARN-7451 at 1/14/18 4:58 PM: --- The new patch uses reflection calls to ServiceFinder, so we no longer have license issues with the original Jersey ServiceFinder. I looked around and found some questions and issues about migrating to Jersey 2 on Stack Overflow, so I suppose it is not trivial to implement. Anyway, I will create a separate task for the upgrade, since that is the way to go in the long term. was (Author: snemeth): The new patch uses reflection calls to ServiceFinder, so we no longer have license issues with the original Jersey ServiceFinder. > Resources Types should be visible in the Cluster Apps API "resourceRequests" > section > > > Key: YARN-7451 > URL: https://issues.apache.org/jira/browse/YARN-7451 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, restapi >Affects Versions: 3.0.0 >Reporter: Grant Sohn >Assignee: Szilard Nemeth > Attachments: YARN-7451.001.patch, YARN-7451.002.patch, > YARN-7451.003.patch, YARN-7451.004.patch, YARN-7451.005.patch, > YARN-7451.006.patch, YARN-7451.007.patch, > YARN-7451__Expose_custom_resource_types_on_RM_scheduler_API_as_flattened_map01_02.patch > > > When running jobs that request resource types the RM Cluster Apps API should > include this in the "resourceRequests" object. 
> Additionally, when calling the RM scheduler API it returns: > {noformat} > "childQueues": { > "queue": [ > { > "allocatedContainers": 101, > "amMaxResources": { > "memory": 320390, > "vCores": 192 > }, > "amUsedResources": { > "memory": 1024, > "vCores": 1 > }, > "clusterResources": { > "memory": 640779, > "vCores": 384 > }, > "demandResources": { > "memory": 103424, > "vCores": 101 > }, > "fairResources": { > "memory": 640779, > "vCores": 384 > }, > "maxApps": 2147483647, > "maxResources": { > "memory": 640779, > "vCores": 384 > }, > "minResources": { > "memory": 0, > "vCores": 0 > }, > "numActiveApps": 1, > "numPendingApps": 0, > "preemptable": true, > "queueName": "root.users.systest", > "reservedContainers": 0, > "reservedResources": { > "memory": 0, > "vCores": 0 > }, > "schedulingPolicy": "fair", > "steadyFairResources": { > "memory": 320390, > "vCores": 192 > }, > "type": "fairSchedulerLeafQueueInfo", > "usedResources": { > "memory": 103424, > "vCores": 101 > } > } > ] > {noformat} > However, the web UI shows resource types usage.
[jira] [Updated] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section
[ https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-7451: - Attachment: YARN-7451.007.patch The new patch uses reflection calls to ServiceFinder, so we no longer have license issues with the original Jersey ServiceFinder. > Resources Types should be visible in the Cluster Apps API "resourceRequests" > section > > > Key: YARN-7451 > URL: https://issues.apache.org/jira/browse/YARN-7451 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, restapi >Affects Versions: 3.0.0 >Reporter: Grant Sohn >Assignee: Szilard Nemeth > Attachments: YARN-7451.001.patch, YARN-7451.002.patch, > YARN-7451.003.patch, YARN-7451.004.patch, YARN-7451.005.patch, > YARN-7451.006.patch, YARN-7451.007.patch, > YARN-7451__Expose_custom_resource_types_on_RM_scheduler_API_as_flattened_map01_02.patch > > > When running jobs that request resource types the RM Cluster Apps API should > include this in the "resourceRequests" object. 
> Additionally, when calling the RM scheduler API it returns: > {noformat} > "childQueues": { > "queue": [ > { > "allocatedContainers": 101, > "amMaxResources": { > "memory": 320390, > "vCores": 192 > }, > "amUsedResources": { > "memory": 1024, > "vCores": 1 > }, > "clusterResources": { > "memory": 640779, > "vCores": 384 > }, > "demandResources": { > "memory": 103424, > "vCores": 101 > }, > "fairResources": { > "memory": 640779, > "vCores": 384 > }, > "maxApps": 2147483647, > "maxResources": { > "memory": 640779, > "vCores": 384 > }, > "minResources": { > "memory": 0, > "vCores": 0 > }, > "numActiveApps": 1, > "numPendingApps": 0, > "preemptable": true, > "queueName": "root.users.systest", > "reservedContainers": 0, > "reservedResources": { > "memory": 0, > "vCores": 0 > }, > "schedulingPolicy": "fair", > "steadyFairResources": { > "memory": 320390, > "vCores": 192 > }, > "type": "fairSchedulerLeafQueueInfo", > "usedResources": { > "memory": 103424, > "vCores": 101 > } > } > ] > {noformat} > However, the web UI shows resource types usage.
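The reflection approach mentioned in the comments above — calling a class without any compile-time dependency on it, which keeps the dependency (and its license) out of the build — looks roughly like this. This is a generic, self-contained sketch using a JDK class as the target; it is not the patch's actual Jersey ServiceFinder call sites:

```java
import java.lang.reflect.Method;

// Generic sketch of invoking a class purely via reflection, so the caller
// compiles with no direct dependency on it. The patch applies the same idea
// to Jersey's ServiceFinder; java.util.ArrayList stands in here so the
// example is self-contained and runnable.
public class ReflectiveCall {
    public static int sizeViaReflection() {
        try {
            // Load the class by name instead of importing it.
            Class<?> clazz = Class.forName("java.util.ArrayList");
            Object list = clazz.getDeclaredConstructor().newInstance();
            // Look up and invoke methods by name.
            Method add = clazz.getMethod("add", Object.class);
            add.invoke(list, "element");
            Method size = clazz.getMethod("size");
            return (Integer) size.invoke(list);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("reflective call failed", e);
        }
    }
}
```

The trade-off is that reflective call sites lose compile-time type checking, which is part of why a proper Jersey 2 migration was filed as a separate task.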
[jira] [Commented] (YARN-7748) TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed
[ https://issues.apache.org/jira/browse/YARN-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325654#comment-16325654 ] Haibo Chen commented on YARN-7748: -- I suspect the failure has something to do with more than one AppAttemptRemovedSchedulerEvent being generated in the test. There are two log lines: {code} 2018-01-14 09:45:16,228 INFO [main] capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(973)) - Application removed - appId: application_1515923115995_0001 user: user queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0 {code} and {code} 2018-01-14 09:45:16,229 INFO [AsyncDispatcher event handler] capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(973)) - Application removed - appId: application_1515923115995_0001 user: user queue: default #user-pending-applications: -1 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0 {code} > TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted > failed > > > Key: YARN-7748 > URL: https://issues.apache.org/jira/browse/YARN-7748 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 3.0.0 >Reporter: Haibo Chen > > TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted > Failing for the past 1 build (Since Failed#19244 ) > Took 0.4 sec. 
> *Error Message* > expected null, but > was:> *Stacktrace* > {code} > java.lang.AssertionError: expected null, but > was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotNull(Assert.java:664) > at org.junit.Assert.assertNull(Assert.java:646) > at org.junit.Assert.assertNull(Assert.java:656) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted(TestContainerResizing.java:826) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) > {code}
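The suspected bug in the comment above — removeApplicationAttempt running once on the test's main thread and once from the AsyncDispatcher, driving #user-pending-applications from 0 to -1 — is a classic double-dispatch problem. A minimal self-contained sketch of the symptom and an idempotence guard follows; the class and field names are hypothetical, not the real LeafQueue code:

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of the suspected bug: the same "attempt removed" event is
// processed twice, so the pending-applications counter is decremented twice
// and goes negative. Remembering which attempts were already removed makes
// the handler idempotent. Names are hypothetical, not LeafQueue's fields.
public class PendingAppCounter {
    private int pendingApplications;
    private final Set<String> removedAttempts = new HashSet<>();

    public PendingAppCounter(int pending) {
        this.pendingApplications = pending;
    }

    public void removeApplicationAttempt(String attemptId) {
        // Guard: Set.add returns false if the attempt was already removed,
        // so a duplicate event becomes a no-op.
        if (!removedAttempts.add(attemptId)) {
            return;
        }
        pendingApplications--;
    }

    public int getPendingApplications() {
        return pendingApplications;
    }
}
```

An alternative fix, of course, is to stop the test from generating the second AppAttemptRemovedSchedulerEvent in the first place.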
[jira] [Commented] (YARN-7748) TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed
[ https://issues.apache.org/jira/browse/YARN-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325652#comment-16325652 ] Haibo Chen commented on YARN-7748: -- {code} 2018-01-14 09:45:15,849 DEBUG [main] service.AbstractService (AbstractService.java:enterState(453)) - Service: org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager entered state INITED 2018-01-14 09:45:15,865 INFO [main] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2656)) - resource-types.xml not found 2018-01-14 09:45:15,865 INFO [main] resource.ResourceUtils (ResourceUtils.java:addResourcesFileToConf(395)) - Unable to find 'resource-types.xml'. 2018-01-14 09:45:15,865 DEBUG [main] resource.ResourceUtils (ResourceUtils.java:addMandatoryResources(127)) - Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE 2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils (ResourceUtils.java:addMandatoryResources(137)) - Adding resource type - name = vcores, units = , type = COUNTABLE 2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils (ResourceUtils.java:getAllocation(177)) - Mandatory Resource 'yarn.resource-types.memory-mb.minimum-allocation' is not configured in resource-types config file. Setting allocation specified using 'yarn.scheduler.minimum-allocation-mb' 2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils (ResourceUtils.java:getAllocation(177)) - Mandatory Resource 'yarn.resource-types.memory-mb.maximum-allocation' is not configured in resource-types config file. Setting allocation specified using 'yarn.scheduler.maximum-allocation-mb' 2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils (ResourceUtils.java:getAllocation(177)) - Mandatory Resource 'yarn.resource-types.vcores.minimum-allocation' is not configured in resource-types config file. 
Setting allocation specified using 'yarn.scheduler.minimum-allocation-vcores' 2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils (ResourceUtils.java:getAllocation(177)) - Mandatory Resource 'yarn.resource-types.vcores.maximum-allocation' is not configured in resource-types config file. Setting allocation specified using 'yarn.scheduler.maximum-allocation-vcores' 2018-01-14 09:45:15,866 DEBUG [main] service.AbstractService (AbstractService.java:enterState(453)) - Service: ResourceManager entered state INITED 2018-01-14 09:45:15,867 INFO [main] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2659)) - found resource core-site.xml at file:/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml 2018-01-14 09:45:15,875 DEBUG [main] security.JniBasedUnixGroupsMappingWithFallback (JniBasedUnixGroupsMappingWithFallback.java:(45)) - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping 2018-01-14 09:45:15,876 DEBUG [main] security.Groups (Groups.java:(150)) - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=30; warningDeltaMs=5000 2018-01-14 09:45:15,876 INFO [main] security.Groups (Groups.java:refresh(401)) - clearing userToGroupsMap cache 2018-01-14 09:45:15,878 INFO [main] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2659)) - found resource yarn-site.xml at file:/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml 2018-01-14 09:45:15,887 INFO [main] event.AsyncDispatcher (AsyncDispatcher.java:register(223)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher 2018-01-14 09:45:15,887 DEBUG [main] service.CompositeService (CompositeService.java:addService(74)) - Adding 
service Dispatcher 2018-01-14 09:45:15,887 DEBUG [main] service.CompositeService (CompositeService.java:addService(74)) - Adding service org.apache.hadoop.yarn.server.resourcemanager.AdminService 2018-01-14 09:45:15,888 DEBUG [main] service.AbstractService (AbstractService.java:enterState(453)) - Service: RMActiveServices entered state INITED 2018-01-14 09:45:15,888 INFO [main] security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:(75)) - NMTokenKeyRollingInterval: 8640ms and NMTokenKeyActivationDelay: 90ms 2018-01-14 09:45:15,889 INFO [main] security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:(79)) - ContainerTokenKeyRollingInterval: 8640ms and ContainerTokenKeyActivationDelay: 90ms 2018-01-14 09:45:15,890 INFO [main] security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:(94)) - AMRMTokenKeyRollingInterval: 8640ms and AMRMTokenKeyActivationDelay: 90 ms 2018-01-14 09:45:15,891 DEBUG [main] service.CompositeService (CompositeService.java:addService(74)) - Adding service
[jira] [Created] (YARN-7748) TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed
Haibo Chen created YARN-7748: Summary: TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed Key: YARN-7748 URL: https://issues.apache.org/jira/browse/YARN-7748 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0 Reporter: Haibo Chen TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted Failing for the past 1 build (Since Failed#19244 ) Took 0.4 sec. *Error Message* expected null, but was:*Stacktrace* {code} java.lang.AssertionError: expected null, but was: at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotNull(Assert.java:664) at org.junit.Assert.assertNull(Assert.java:646) at org.junit.Assert.assertNull(Assert.java:656) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted(TestContainerResizing.java:826) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407) {code}
[jira] [Commented] (YARN-5473) Expose per-application over-allocation info in the Resource Manager
[ https://issues.apache.org/jira/browse/YARN-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325649#comment-16325649 ] Haibo Chen commented on YARN-5473: -- Updated the patch to address the new findbugs issues as well as some of the checkstyle issues. The unit test failure and license problem are unrelated. > Expose per-application over-allocation info in the Resource Manager > --- > > Key: YARN-5473 > URL: https://issues.apache.org/jira/browse/YARN-5473 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Haibo Chen > Attachments: YARN-5473-YARN-1011.00.patch, > YARN-5473-YARN-1011.01.patch, YARN-5473-YARN-1011.prelim.patch > > > When enabling over-allocation of nodes, the resources in the cluster change. > We need to surface this information for users to understand these changes.
[jira] [Updated] (YARN-5473) Expose per-application over-allocation info in the Resource Manager
[ https://issues.apache.org/jira/browse/YARN-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-5473: - Attachment: YARN-5473-YARN-1011.01.patch > Expose per-application over-allocation info in the Resource Manager > --- > > Key: YARN-5473 > URL: https://issues.apache.org/jira/browse/YARN-5473 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Íñigo Goiri >Assignee: Haibo Chen > Attachments: YARN-5473-YARN-1011.00.patch, > YARN-5473-YARN-1011.01.patch, YARN-5473-YARN-1011.prelim.patch > > > When enabling over-allocation of nodes, the resources in the cluster change. > We need to surface this information for users to understand these changes.
[jira] [Commented] (YARN-5473) Expose per-application over-allocation info in the Resource Manager
[ https://issues.apache.org/jira/browse/YARN-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325540#comment-16325540 ] genericqa commented on YARN-5473:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 23 new or modified test files. |
|| || || || YARN-1011 Compile Tests ||
| 0 | mvndep | 5m 17s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 12s | YARN-1011 passed |
| +1 | compile | 12m 40s | YARN-1011 passed |
| +1 | checkstyle | 2m 21s | YARN-1011 passed |
| +1 | mvnsite | 5m 50s | YARN-1011 passed |
| +1 | shadedclient | 17m 59s | branch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 15s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-1011 has 1 extant Findbugs warning. |
| +1 | javadoc | 4m 41s | YARN-1011 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 14s | the patch passed |
| +1 | compile | 11m 14s | the patch passed |
| +1 | cc | 11m 14s | the patch passed |
| +1 | javac | 11m 14s | the patch passed |
| -0 | checkstyle | 2m 23s | root: The patch generated 10 new + 1770 unchanged - 20 fixed = 1780 total (was 1790) |
| +1 | mvnsite | 5m 50s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 32s | patch has no errors when building and testing our client artifacts. |
| -1 | findbugs | 1m 26s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 4m 39s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 42s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 3m 16s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 2m 19s | hadoop-yarn-server-common in the patch passed. |
| +1 | unit | 3m 47s | hadoop-yarn-server-applicationhistoryservice in the patch passed. |
| -1 | unit | 64m 41s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 23m 50s | hadoop-yarn-client in the patch passed. |
| +1 | unit | 1m 29s | hadoop-yarn-server-router in the
[jira] [Updated] (YARN-7563) Invalid event: FINISH_APPLICATION at NEW may make some application level resource be not cleaned
[ https://issues.apache.org/jira/browse/YARN-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lujie updated YARN-7563: Affects Version/s: 2.8.0

> Invalid event: FINISH_APPLICATION at NEW may make some application level
> resource be not cleaned
> -
>
> Key: YARN-7563
> URL: https://issues.apache.org/jira/browse/YARN-7563
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.8.0, 3.0.0-beta1
> Reporter: lujie
> Assignee: lujie
> Attachments: YARN-7563.png, YARN-7563.txt
>
> I sent a kill command to the application; the nodemanager log shows:
> {code:java}
> 2017-11-25 19:18:48,126 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: couldn't find container container_1511608703018_0001_01_01 while processing FINISH_CONTAINERS event
> 2017-11-25 19:18:48,146 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: FINISH_APPLICATION at NEW
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:627)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:75)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1508)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1501)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>     at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>     at java.lang.Thread.run(Thread.java:745)
> 2017-11-25 19:18:48,151 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Application application_1511608703018_0001 transitioned from NEW to INITING
> {code}
>
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
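The exception in the log above comes from Hadoop's table-driven state machine: ApplicationImpl only registers FINISH_APPLICATION transitions for states the application reaches after NEW, so dispatching the event while still at NEW finds no entry in the transition table and throws. The sketch below illustrates that pattern with a minimal, self-contained state machine; the class, enum, and method names are hypothetical and are not the real StateMachineFactory API.

```java
import java.util.EnumMap;
import java.util.Map;

// Minimal table-driven state machine in the spirit of Hadoop's
// StateMachineFactory (illustrative only; names are made up).
public class AppStateMachine {
    enum State { NEW, INITING, RUNNING, FINISHED }
    enum Event { INIT_APPLICATION, FINISH_APPLICATION }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.NEW;

    public AppStateMachine() {
        addTransition(State.NEW, Event.INIT_APPLICATION, State.INITING);
        addTransition(State.INITING, Event.FINISH_APPLICATION, State.FINISHED);
        addTransition(State.RUNNING, Event.FINISH_APPLICATION, State.FINISHED);
        // Deliberately no (NEW, FINISH_APPLICATION) entry: this is the
        // kind of gap YARN-7563 reports, so that event at NEW throws.
    }

    private void addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, s -> new EnumMap<>(Event.class)).put(on, to);
    }

    // Dispatch an event; throws when the current state has no
    // transition registered for it, mirroring the logged exception.
    public State handle(Event event) {
        Map<Event, State> row = table.get(current);
        if (row == null || !row.containsKey(event)) {
            throw new IllegalStateException(
                "Invalid event: " + event + " at " + current);
        }
        current = row.get(event);
        return current;
    }

    public State getState() {
        return current;
    }
}
```

With this table, FINISH_APPLICATION delivered at NEW raises the exception and leaves the state (and hence any application-level resources tied to it) untouched, which is why the issue argues a missing transition can leave resources uncleaned.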