[jira] [Commented] (YARN-7620) Allow node partition filters on Queues page of new YARN UI

2018-01-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325972#comment-16325972
 ] 

ASF GitHub Bot commented on YARN-7620:
--

Github user skmvasu commented on the issue:

https://github.com/apache/hadoop/pull/310
  
This is closed already


> Allow node partition filters on Queues page of new YARN UI
> --
>
> Key: YARN-7620
> URL: https://issues.apache.org/jira/browse/YARN-7620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7620.001.patch, YARN-7620.002.patch, 
> YARN-7620.003.patch, YARN-7620.004.patch
>
>
> Allow users to filter their queues based on node labels






[jira] [Commented] (YARN-7620) Allow node partition filters on Queues page of new YARN UI

2018-01-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325973#comment-16325973
 ] 

ASF GitHub Bot commented on YARN-7620:
--

Github user skmvasu closed the pull request at:

https://github.com/apache/hadoop/pull/310


> Allow node partition filters on Queues page of new YARN UI
> --
>
> Key: YARN-7620
> URL: https://issues.apache.org/jira/browse/YARN-7620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Vasudevan Skm
>Assignee: Vasudevan Skm
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7620.001.patch, YARN-7620.002.patch, 
> YARN-7620.003.patch, YARN-7620.004.patch
>
>
> Allow users to filter their queues based on node labels






[jira] [Commented] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects

2018-01-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325961#comment-16325961
 ] 

genericqa commented on YARN-6619:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-6592 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
52s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
18s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
20s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} YARN-6592 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} YARN-6592 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
19s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 58s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 28 new + 205 unchanged - 8 fixed = 233 total (was 213) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m  
0s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 24m 33s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.client.api.impl.TestOpportunisticContainerAllocationE2E |
|   | hadoop.yarn.client.api.impl.TestAMRMClientOnRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-6619 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12906060/YARN-6619-YARN-6592.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 23c8f2ebc664 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-7479) TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk

2018-01-14 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325953#comment-16325953
 ] 

Akira Ajisaka commented on YARN-7479:
-

Thanks [~rkanter] for the review. Changed the interval to 500ms.
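For reference, the change concerns how often waitForContainerToFinishOnNM polls for the 
container state before giving up. A minimal sketch of that kind of wait loop with a 
500 ms check interval is below; the class name, the 60-second timeout, and the 
getContainerState lookup are illustrative stand-ins, not the actual patch.

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerState;
import org.junit.Assert;

// Illustrative only, not the actual patch: poll for a container to reach
// COMPLETE, checking every 500 ms until an assumed 60-second timeout.
abstract class ContainerWaitSketch {

  // Stand-in for however the test looks up the container state on the NM.
  abstract ContainerState getContainerState(ContainerId containerId) throws Exception;

  void waitForComplete(ContainerId containerId) throws Exception {
    long deadline = System.currentTimeMillis() + 60_000L;             // assumed timeout
    while (System.currentTimeMillis() < deadline) {
      if (getContainerState(containerId) == ContainerState.COMPLETE) {
        return;                                                        // container finished
      }
      Thread.sleep(500);                                               // the 500 ms interval
    }
    Assert.fail("Container " + containerId
        + " did not reach COMPLETE before the timeout");
  }
}
{code}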

> TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk
> 
>
> Key: YARN-7479
> URL: https://issues.apache.org/jira/browse/YARN-7479
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Botong Huang
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: YARN-7479.001.patch, YARN-7479.002.patch, 
> YARN-7479.003.patch, YARN-7479.004.patch
>
>
> Was waiting for container_1_0001_01_00 to get to state COMPLETE but was 
> in state RUNNING after the timeout
> java.lang.AssertionError: Was waiting for container_1_0001_01_00 to get 
> to state COMPLETE but was in state RUNNING after the timeout
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:431)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:171)
> Pasting some exception messages from the test run here: 
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  Given NMToken for application : appattempt_1_0001_01 seems to have been 
> generated illegally.
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  Given NMToken for application : appattempt_1_0001_01 is not valid for 
> current node manager.expected : localhost:46649 found : InvalidHost:1234
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)






[jira] [Updated] (YARN-7479) TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk

2018-01-14 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7479:

Attachment: YARN-7479.004.patch

> TestContainerManagerSecurity.testContainerManager[Simple] flaky in trunk
> 
>
> Key: YARN-7479
> URL: https://issues.apache.org/jira/browse/YARN-7479
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Botong Huang
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: YARN-7479.001.patch, YARN-7479.002.patch, 
> YARN-7479.003.patch, YARN-7479.004.patch
>
>
> Was waiting for container_1_0001_01_00 to get to state COMPLETE but was 
> in state RUNNING after the timeout
> java.lang.AssertionError: Was waiting for container_1_0001_01_00 to get 
> to state COMPLETE but was in state RUNNING after the timeout
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:431)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360)
>   at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:171)
> Pasting some exception messages from the test run here: 
> org.apache.hadoop.security.AccessControlException: SIMPLE authentication is 
> not enabled.  Available:[TOKEN]
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  Given NMToken for application : appattempt_1_0001_01 seems to have been 
> generated illegally.
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  Given NMToken for application : appattempt_1_0001_01 is not valid for 
> current node manager.expected : localhost:46649 found : InvalidHost:1234
>   at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1491)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)






[jira] [Created] (YARN-7750) Render time in the user's timezone

2018-01-14 Thread Vasudevan Skm (JIRA)
Vasudevan Skm created YARN-7750:
---

 Summary: Render time in the user's timezone
 Key: YARN-7750
 URL: https://issues.apache.org/jira/browse/YARN-7750
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
 Environment: Render time in the user's timezone / a predefined TZ

 
Reporter: Vasudevan Skm
Assignee: Vasudevan Skm









[jira] [Commented] (YARN-7749) [UI2] GPU information tab in left hand side disappears when we click other tabs below

2018-01-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325934#comment-16325934
 ] 

ASF GitHub Bot commented on YARN-7749:
--

Github user skmvasu commented on the issue:

https://github.com/apache/hadoop/pull/327
  
@sunilgovind 


> [UI2] GPU information tab in left hand side disappears when we click other 
> tabs below
> -
>
> Key: YARN-7749
> URL: https://issues.apache.org/jira/browse/YARN-7749
> Project: Hadoop YARN
>  Issue Type: Bug
> Environment: {color:#33} {color}
>Reporter: Sumana Sathish
>Assignee: Vasudevan Skm
>Priority: Major
>
> {color:#33}'GPU Information' tab on the left side of the Node Manager 
> Page disappears when we click 'List of applications' or 'List of Containers' 
> tab.{color}
> {color:#33}Once we click on the 'Node Information' tab, it reappears.{color}






[jira] [Created] (YARN-7749) [UI2] GPU information tab in left hand side disappears when we click other tabs below

2018-01-14 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-7749:
---

 Summary: [UI2] GPU information tab in left hand side disappears 
when we click other tabs below
 Key: YARN-7749
 URL: https://issues.apache.org/jira/browse/YARN-7749
 Project: Hadoop YARN
  Issue Type: Bug
 Environment: {color:#33} {color}
Reporter: Sumana Sathish
Assignee: Vasudevan Skm


{color:#33}'GPU Information' tab on the left side of the Node Manager Page 
disappears when we click 'List of applications' or 'List of Containers' 
tab.{color}
{color:#33}Once we click on the 'Node Information' tab, it reappears.{color}






[jira] [Comment Edited] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks

2018-01-14 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325928#comment-16325928
 ] 

Steven Rand edited comment on YARN-7655 at 1/15/18 6:49 AM:


Thanks [~yufeigu] for taking a look. The cluster sizes and nodes should be 
pretty reasonable – for the three clusters I have in mind, the nodes are AWS 
ec2 instances with around 120 GB of RAM and around 20 vcores. The clusters 
range in size from double-digits to low triple-digits.

That said, there is some configuration in place at these clusters which could 
explain high rates of AM preemption. Specifically:
 * The default max AM share is set to -1. Unfortunately the max AM share 
feature, while totally reasonable as far as I can tell, was causing a good deal 
of confusion when apps would fail to start for no apparent reason upon hitting 
the limit, and we disabled it in the hope that having one less variable would 
make the scheduler's behavior easier to understand.
 * The default fair share preemption threshold is set to 1.0. This was also an 
attempt to reduce confusion, as failure to preempt while below fair share (but 
above fair share * the threshold) was commonly misinterpreted as a bug.
 * The preemption timeouts for fair share and min share are also non-default – 
they're set to one second each.

Possibly the configuration overrides, along with access patterns that include 
apps frequently starting up or increasing their demand via Spark's dynamic 
allocation feature, are the issue here, in which case we don't need to pursue 
this JIRA further. Data on whether or not other YARN deployments experience 
this issue would be useful, though not easy to come by, as I had to add custom 
logging to identify NODE_LOCAL requests as the cause of most AM preemptions at 
these clusters.


was (Author: steven rand):
Thanks [~yufeigu] for taking a look. The cluster sizes and nodes should be 
pretty reasonable -- for the three clusters I have in mind, the nodes are AWS 
ec2 instances with around 120 GB of RAM and around 20 vcores. The clusters 
range in size from double-digits to low triple-digits.

That said, there is some configuration in place at these clusters which could 
explain high rates of AM preemption. Specifically:

* The default max AM share is set to -1. Unfortunately the max AM share 
feature, while totally reasonable as far as I can tell, was causing a good deal 
of confusion when apps would fail to start for no apparently reason upon 
hitting the limit, and we disabled it in the hope that having one less variable 
would make the scheduler's behavior easier to understand.
* The default fair share preemption threshold is set to 1.0. This was also an 
attempt to reduce confusion, as failure to preempt while below fair share (but 
above fair share * the threshold) was commonly misinterpreted as a bug.
* The preemption timeouts for fair share and min share are also non-default -- 
they're set to one second each.

Possibly the configuration overrides, along with access patterns that include 
apps frequently starting up or increasing their demand via Spark's dynamic 
allocation feature, are the issue here, in which case we don't need to pursue 
this JIRA further. Data on whether or not other YARN deployments experience 
this issue would be useful, though not easy to come by, as I had to add custom 
logging to identify NODE_LOCAL requests as the cause of most AM preemptions at 
these clusters.

> avoid AM preemption caused by RRs for specific nodes or racks
> -
>
> Key: YARN-7655
> URL: https://issues.apache.org/jira/browse/YARN-7655
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: YARN-7655-001.patch
>
>
> We frequently see AM preemptions when 
> {{starvedApp.getStarvedResourceRequests()}} in 
> {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs 
> that request containers on a specific node. Since this causes us to only 
> consider one node to preempt containers on, the really good work that was 
> done in YARN-5830 doesn't save us from AM preemption. Even though there might 
> be multiple nodes on which we could preempt enough non-AM containers to 
> satisfy the app's starvation, we often wind up preempting one or more AM 
> containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM 
> containers for an RR that specifies a node or rack, then we should instead 
> expand the search space to consider all nodes. That way we take advantage of 
> YARN-5830, and only preempt AMs if there's no alternative. I've attached a 
> patch with an initial implementation of this.

[jira] [Commented] (YARN-7655) avoid AM preemption caused by RRs for specific nodes or racks

2018-01-14 Thread Steven Rand (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325928#comment-16325928
 ] 

Steven Rand commented on YARN-7655:
---

Thanks [~yufeigu] for taking a look. The cluster sizes and nodes should be 
pretty reasonable -- for the three clusters I have in mind, the nodes are AWS 
ec2 instances with around 120 GB of RAM and around 20 vcores. The clusters 
range in size from double-digits to low triple-digits.

That said, there is some configuration in place at these clusters which could 
explain high rates of AM preemption. Specifically:

* The default max AM share is set to -1. Unfortunately the max AM share 
feature, while totally reasonable as far as I can tell, was causing a good deal 
of confusion when apps would fail to start for no apparent reason upon 
hitting the limit, and we disabled it in the hope that having one less variable 
would make the scheduler's behavior easier to understand.
* The default fair share preemption threshold is set to 1.0. This was also an 
attempt to reduce confusion, as failure to preempt while below fair share (but 
above fair share * the threshold) was commonly misinterpreted as a bug.
* The preemption timeouts for fair share and min share are also non-default -- 
they're set to one second each.

Possibly the configuration overrides, along with access patterns that include 
apps frequently starting up or increasing their demand via Spark's dynamic 
allocation feature, are the issue here, in which case we don't need to pursue 
this JIRA further. Data on whether or not other YARN deployments experience 
this issue would be useful, though not easy to come by, as I had to add custom 
logging to identify NODE_LOCAL requests as the cause of most AM preemptions at 
these clusters.
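
For reference, the overrides described above correspond roughly to the following Fair 
Scheduler allocation-file settings; this is a minimal illustrative fair-scheduler.xml 
fragment, not the actual configuration of those clusters.

{code:xml}
<!-- Illustrative sketch of the non-default settings mentioned above. -->
<allocations>
  <!-- Disable the max AM share check (the default is 0.5; -1.0 disables it). -->
  <queueMaxAMShareDefault>-1.0</queueMaxAMShareDefault>

  <!-- Preempt as soon as usage falls below full fair share. -->
  <defaultFairSharePreemptionThreshold>1.0</defaultFairSharePreemptionThreshold>

  <!-- Aggressive one-second preemption timeouts (values are in seconds). -->
  <defaultFairSharePreemptionTimeout>1</defaultFairSharePreemptionTimeout>
  <defaultMinSharePreemptionTimeout>1</defaultMinSharePreemptionTimeout>
</allocations>
{code}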

> avoid AM preemption caused by RRs for specific nodes or racks
> -
>
> Key: YARN-7655
> URL: https://issues.apache.org/jira/browse/YARN-7655
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: YARN-7655-001.patch
>
>
> We frequently see AM preemptions when 
> {{starvedApp.getStarvedResourceRequests()}} in 
> {{FSPreemptionThread#identifyContainersToPreempt}} includes one or more RRs 
> that request containers on a specific node. Since this causes us to only 
> consider one node to preempt containers on, the really good work that was 
> done in YARN-5830 doesn't save us from AM preemption. Even though there might 
> be multiple nodes on which we could preempt enough non-AM containers to 
> satisfy the app's starvation, we often wind up preempting one or more AM 
> containers on the single node that we're considering.
> A proposed solution is that if we're going to preempt one or more AM 
> containers for an RR that specifies a node or rack, then we should instead 
> expand the search space to consider all nodes. That way we take advantage of 
> YARN-5830, and only preempt AMs if there's no alternative. I've attached a 
> patch with an initial implementation of this. We've been running it on a few 
> clusters, and have seen AM preemptions drop from double-digit occurrences on 
> many days to zero.
> Of course, the tradeoff is some loss of locality, since the starved app is 
> less likely to be allocated resources at the most specific locality level 
> that it asked for. My opinion is that this tradeoff is worth it, but 
> interested to hear what others think as well.
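
To make the proposed change concrete, here is a rough pseudo-Java sketch of the idea; 
the helper names ({{nodesMatching}}, {{selectContainersToPreempt}}, 
{{preemptsAnAmContainer}}, and so on) are hypothetical, and this is not the attached patch.

{code:java}
// Hypothetical sketch of the proposal, not FSPreemptionThread as implemented:
// if satisfying a node- or rack-local request would preempt an AM container,
// retry the search over every node before settling for the AM.
for (ResourceRequest rr : starvedApp.getStarvedResourceRequests()) {
  List<FSSchedulerNode> candidates = nodesMatching(rr);           // node/rack-local candidates
  List<RMContainer> toPreempt = selectContainersToPreempt(rr, candidates);

  if (preemptsAnAmContainer(toPreempt) && isNodeOrRackLocal(rr)) {
    // Proposed change: widen the search space so the YARN-5830 logic gets a
    // chance to find enough non-AM containers on other nodes.
    toPreempt = selectContainersToPreempt(rr, allNodes());
  }
  preemptContainers(toPreempt);
}
{code}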






[jira] [Commented] (YARN-7746) Minor fixes to PlacementProcessor to support app priority

2018-01-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325911#comment-16325911
 ] 

Arun Suresh commented on YARN-7746:
---

Updated the JIRA to track just the fix needed for the processor to respect 
application priority. The other bug fix will be included in YARN-6619, since it 
is needed for end-to-end functionality to work, and this JIRA can be tackled 
independently.

> Minor fixes to PlacementProcessor to support app priority
> -
>
> Key: YARN-7746
> URL: https://issues.apache.org/jira/browse/YARN-7746
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> The Threadpools used in the Processor should be modified to take a priority 
> blocking queue that respects application priority.
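
As a generic illustration of that pattern (not the actual PlacementProcessor code), a 
thread pool can be built around a {{PriorityBlockingQueue}} whose comparator orders 
waiting tasks by application priority:

{code:java}
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Generic illustration only: tasks carry an application priority and the
// pool's work queue serves higher-priority tasks first. The ordering applies
// to tasks waiting in the queue, and execute() is used rather than submit()
// (submit() would wrap tasks in FutureTask and break the comparator's cast).
public class PriorityPoolSketch {

  static final class PlacementTask implements Runnable {
    final int appPriority;   // assumption: a higher value means a more important app
    final Runnable work;

    PlacementTask(int appPriority, Runnable work) {
      this.appPriority = appPriority;
      this.work = work;
    }

    @Override
    public void run() {
      work.run();
    }
  }

  public static void main(String[] args) {
    Comparator<Runnable> byAppPriority = Comparator
        .comparingInt((Runnable r) -> ((PlacementTask) r).appPriority)
        .reversed();

    ExecutorService pool = new ThreadPoolExecutor(
        4, 4, 0L, TimeUnit.MILLISECONDS,
        new PriorityBlockingQueue<>(11, byAppPriority));

    pool.execute(new PlacementTask(10, () -> System.out.println("high-priority app placed")));
    pool.execute(new PlacementTask(1, () -> System.out.println("low-priority app placed")));
    pool.shutdown();
  }
}
{code}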






[jira] [Updated] (YARN-7746) Minor fixes to PlacementProcessor to support app priority

2018-01-14 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-7746:
--
Summary: Minor fixes to PlacementProcessor to support app priority  (was: 
Minor bug fixes to PlacementConstraintUtils and PlacementProcessor to support 
app priority)

> Minor fixes to PlacementProcessor to support app priority
> -
>
> Key: YARN-7746
> URL: https://issues.apache.org/jira/browse/YARN-7746
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> JIRA opened to track 2 minor fixes.
> The PlacementConstraintsUtil does a scope check using object equality and not 
> string equality, which causes some tests to pass, but it really fails in an 
> actual deployment.
> The Threadpools used in the Processor should be modified to take a priority 
> blocking queue that respects application priority.






[jira] [Updated] (YARN-7746) Minor fixes to PlacementProcessor to support app priority

2018-01-14 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-7746:
--
Description: The Threadpools used in the Processor should be modified to 
take a priority blocking queue that respects application priority.  (was: JIRA 
opened to track 2 minor fixes.

The PlacementConstraintsUtil does a scope check using object equality and not 
string equality, which causes some tests to pass, but it really fails in an 
actual deployment.

The Threadpools used in the Processor should be modified to take a priority 
blocking queue that respects application priority.)

> Minor fixes to PlacementProcessor to support app priority
> -
>
> Key: YARN-7746
> URL: https://issues.apache.org/jira/browse/YARN-7746
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> The Threadpools used in the Processor should be modified to take a priority 
> blocking queue that respects application priority.






[jira] [Comment Edited] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects

2018-01-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325891#comment-16325891
 ] 

Arun Suresh edited comment on YARN-6619 at 1/15/18 5:00 AM:


Attaching initial patch.
 * Modified the {{TestAMRMClient}} testcase to refactor out some common code to 
a separate class.
 * The test uses the async AMRMClient so that both the async client and the 
normal client are exercised, since the async client uses the normal client 
internally.
 * The general flow is: if the {{PlacementProcessor}} is enabled, the client has 
to call {{addPlacementConstraint}} before registering with the RM. The client 
can then use {{addSchedulingRequests(Collection)}} to add a collection of 
scheduling requests. All scheduling requests in the collection are guaranteed 
to be sent in the same allocate call (see the sketch below).
 * The patch also includes some minor bug fixes to ensure things work fine end to end.
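
A rough sketch of that client-side flow is below, using the method names from the 
comment above; the parameter types (in particular the shape of the constraint map) are 
assumptions, not the committed API.

{code:java}
// Rough sketch of the flow described above, not the committed API. The method
// names addPlacementConstraint/addSchedulingRequests are taken from the
// comment; the parameter types here are assumptions.
void runPlacementFlow(Configuration conf,
    AMRMClientAsync.AbstractCallbackHandler handler,
    Map<Set<String>, PlacementConstraint> constraints,   // assumed shape: allocation tags -> constraint
    Collection<SchedulingRequest> requests) throws Exception {

  AMRMClientAsync<AMRMClient.ContainerRequest> client =
      AMRMClientAsync.createAMRMClientAsync(1000, handler);
  client.init(conf);
  client.start();

  // 1. Hand the placement constraints to the client before registering with the RM.
  client.addPlacementConstraint(constraints);

  // 2. Register with the RM.
  client.registerApplicationMaster("am-host", 0, "");

  // 3. All SchedulingRequests passed in one call are sent in the same allocate call.
  client.addSchedulingRequests(requests);
}
{code}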


was (Author: asuresh):
Attaching initial patch.
 * Modified the {{TestAMRMClient}} testcase to refactor out some common code to 
a separate class.
 * The test uses the asyncAMRMClient, since that way both the async client and 
the normal client will be tested, since the async client uses the normal client 
internally.
* The general flow is, if the {{PlacementProcessor}} is enabled, the client has 
to call {{addPlacecmentConstraint}} before registering with the RM. The client 
can then use {{addSchedulingRequests}} to add a collection of scheduling 
requests. All scheduling requests the one call to the above method are 
guaranteed to be sent in the same allocate call.
* The patch also has some minor bug fixes to ensure end 2 end, things work fine.

> AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest 
> objects
> 
>
> Key: YARN-6619
> URL: https://issues.apache.org/jira/browse/YARN-6619
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Attachments: YARN-6619-YARN-6592.001.patch
>
>
> Opening this JIRA to track changes needed in the AMRMClient to incorporate 
> the PlacementConstraint and SchedulingRequest objects






[jira] [Updated] (YARN-6619) AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest objects

2018-01-14 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-6619:
--
Attachment: YARN-6619-YARN-6592.001.patch

> AMRMClient Changes to use the PlacementConstraint and SchcedulingRequest 
> objects
> 
>
> Key: YARN-6619
> URL: https://issues.apache.org/jira/browse/YARN-6619
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
> Attachments: YARN-6619-YARN-6592.001.patch
>
>
> Opening this JIRA to track changes needed in the AMRMClient to incorporate 
> the PlacementConstraint and SchedulingRequest objects






[jira] [Comment Edited] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section

2018-01-14 Thread Szilard Nemeth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325674#comment-16325674
 ] 

Szilard Nemeth edited comment on YARN-7451 at 1/14/18 7:54 PM:
---

The new patch calls ServiceFinder via reflection, so we no longer have license 
issues with the original Jersey ServiceFinder.
I looked around and found some questions and issues on Stack Overflow about 
migrating to Jersey 2, so I suppose it's not a trivial thing to implement.
Anyway, I will create a separate task for the upgrade, because in the long term 
that would be the way to go.
UPDATE: the Yetus test result below seems unrelated to my change.
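
For illustration, the reflection-based lookup could look roughly like the snippet 
below; this is a minimal sketch of the approach, assuming Jersey 1's 
{{com.sun.jersey.spi.service.ServiceFinder}}, and not the code in the patch.

{code:java}
// Minimal sketch (not the actual patch): look up providers through Jersey 1's
// ServiceFinder without referencing the class at compile time.
@SuppressWarnings("unchecked")
static <T> Iterable<T> findProviders(Class<T> providerType) throws Exception {
  Class<?> finderClass = Class.forName("com.sun.jersey.spi.service.ServiceFinder");
  // Static factory: ServiceFinder.find(Class<T>) returns a ServiceFinder<T>,
  // which is itself an Iterable<T>.
  Object finder = finderClass.getMethod("find", Class.class).invoke(null, providerType);
  return (Iterable<T>) finder;
}
{code}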


was (Author: snemeth):
New patch contains reflection calls to ServiceFinder, so we are not having 
license issues on the original Jersey ServiceFinder anymore.
I looked around and found some questions and issues about migrating to Jersey 2 
on stackoverflow, so I suppose it's not a trivial thing to implement.
Anyway, I will create a separate task for the upgrade because for the long term 
it would be the way to go.

> Resources Types should be visible in the Cluster Apps API "resourceRequests" 
> section
> 
>
> Key: YARN-7451
> URL: https://issues.apache.org/jira/browse/YARN-7451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, restapi
>Affects Versions: 3.0.0
>Reporter: Grant Sohn
>Assignee: Szilard Nemeth
> Attachments: YARN-7451.001.patch, YARN-7451.002.patch, 
> YARN-7451.003.patch, YARN-7451.004.patch, YARN-7451.005.patch, 
> YARN-7451.006.patch, YARN-7451.007.patch, 
> YARN-7451__Expose_custom_resource_types_on_RM_scheduler_API_as_flattened_map01_02.patch
>
>
> When running jobs that request resource types the RM Cluster Apps API should 
> include this in the "resourceRequests" object.
> Additionally, when calling the RM scheduler API it returns:
> {noformat}
>  "childQueues": {
> "queue": [
> {
> "allocatedContainers": 101,
> "amMaxResources": {
> "memory": 320390,
> "vCores": 192
> },
> "amUsedResources": {
> "memory": 1024,
> "vCores": 1
> },
> "clusterResources": {
> "memory": 640779,
> "vCores": 384
> },
> "demandResources": {
> "memory": 103424,
> "vCores": 101
> },
> "fairResources": {
> "memory": 640779,
> "vCores": 384
> },
> "maxApps": 2147483647,
> "maxResources": {
> "memory": 640779,
> "vCores": 384
> },
> "minResources": {
> "memory": 0,
> "vCores": 0
> },
> "numActiveApps": 1,
> "numPendingApps": 0,
> "preemptable": true,
> "queueName": "root.users.systest",
> "reservedContainers": 0,
> "reservedResources": {
> "memory": 0,
> "vCores": 0
> },
> "schedulingPolicy": "fair",
> "steadyFairResources": {
> "memory": 320390,
> "vCores": 192
> },
> "type": "fairSchedulerLeafQueueInfo",
> "usedResources": {
> "memory": 103424,
> "vCores": 101
> }
> }
> ]
> {noformat}
> However, the web UI shows resource types usage.

[jira] [Commented] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section

2018-01-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325733#comment-16325733
 ] 

genericqa commented on YARN-7451:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 34 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-client-modules/hadoop-client-minicluster 
hadoop-client-modules/hadoop-client-check-test-invariants {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 57s{color} | {color:orange} root: The patch generated 31 new + 156 unchanged 
- 19 fixed = 187 total (was 175) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
6s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-client-modules/hadoop-client-minicluster 
hadoop-client-modules/hadoop-client-check-test-invariants {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
32s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 2 new + 4 unchanged - 0 fixed = 6 total (was 4) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
16s{color} | {color:green} hadoop-project in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 35s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-client-minicluster in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
16s{color} | {color:green} 

[jira] [Commented] (YARN-7346) Fix compilation errors against hbase2 beta release

2018-01-14 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325683#comment-16325683
 ] 

Ted Yu commented on YARN-7346:
--

A new RC for HBase 2 beta1 has been posted.
FYI

> Fix compilation errors against hbase2 beta release
> --
>
> Key: YARN-7346
> URL: https://issues.apache.org/jira/browse/YARN-7346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Vrushali C
> Attachments: YARN-7346.00.patch, YARN-7346.01.patch, 
> YARN-7346.prelim1.patch, YARN-7346.prelim2.patch, YARN-7581.prelim.patch
>
>
> When compiling hadoop-yarn-server-timelineservice-hbase against 2.0.0-alpha3, 
> I got the following errors:
> https://pastebin.com/Ms4jYEVB
> This issue is to fix the compilation errors.






[jira] [Comment Edited] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section

2018-01-14 Thread Szilard Nemeth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325674#comment-16325674
 ] 

Szilard Nemeth edited comment on YARN-7451 at 1/14/18 4:58 PM:
---

The new patch calls ServiceFinder via reflection, so we no longer have license 
issues with the original Jersey ServiceFinder.
I looked around and found some questions and issues on Stack Overflow about 
migrating to Jersey 2, so I suppose it's not a trivial thing to implement.
Anyway, I will create a separate task for the upgrade, because in the long term 
that would be the way to go.


was (Author: snemeth):
New patch contains reflection calls to ServiceFinder, so we are not having 
license issues on the original Jersey ServiceFinder anymore.

> Resources Types should be visible in the Cluster Apps API "resourceRequests" 
> section
> 
>
> Key: YARN-7451
> URL: https://issues.apache.org/jira/browse/YARN-7451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, restapi
>Affects Versions: 3.0.0
>Reporter: Grant Sohn
>Assignee: Szilard Nemeth
> Attachments: YARN-7451.001.patch, YARN-7451.002.patch, 
> YARN-7451.003.patch, YARN-7451.004.patch, YARN-7451.005.patch, 
> YARN-7451.006.patch, YARN-7451.007.patch, 
> YARN-7451__Expose_custom_resource_types_on_RM_scheduler_API_as_flattened_map01_02.patch
>
>
> When running jobs that request resource types the RM Cluster Apps API should 
> include this in the "resourceRequests" object.
> Additionally, when calling the RM scheduler API it returns:
> {noformat}
>  "childQueues": {
> "queue": [
> {
> "allocatedContainers": 101,
> "amMaxResources": {
> "memory": 320390,
> "vCores": 192
> },
> "amUsedResources": {
> "memory": 1024,
> "vCores": 1
> },
> "clusterResources": {
> "memory": 640779,
> "vCores": 384
> },
> "demandResources": {
> "memory": 103424,
> "vCores": 101
> },
> "fairResources": {
> "memory": 640779,
> "vCores": 384
> },
> "maxApps": 2147483647,
> "maxResources": {
> "memory": 640779,
> "vCores": 384
> },
> "minResources": {
> "memory": 0,
> "vCores": 0
> },
> "numActiveApps": 1,
> "numPendingApps": 0,
> "preemptable": true,
> "queueName": "root.users.systest",
> "reservedContainers": 0,
> "reservedResources": {
> "memory": 0,
> "vCores": 0
> },
> "schedulingPolicy": "fair",
> "steadyFairResources": {
> "memory": 320390,
> "vCores": 192
> },
> "type": "fairSchedulerLeafQueueInfo",
> "usedResources": {
> "memory": 103424,
> "vCores": 101
> }
> }
> ]
> {noformat}
> However, the web UI shows resource types usage.



--
This message was sent 

[jira] [Updated] (YARN-7451) Resources Types should be visible in the Cluster Apps API "resourceRequests" section

2018-01-14 Thread Szilard Nemeth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-7451:
-
Attachment: YARN-7451.007.patch

The new patch calls ServiceFinder via reflection, so we no longer have license 
issues with the original Jersey ServiceFinder.

> Resources Types should be visible in the Cluster Apps API "resourceRequests" 
> section
> 
>
> Key: YARN-7451
> URL: https://issues.apache.org/jira/browse/YARN-7451
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, restapi
>Affects Versions: 3.0.0
>Reporter: Grant Sohn
>Assignee: Szilard Nemeth
> Attachments: YARN-7451.001.patch, YARN-7451.002.patch, 
> YARN-7451.003.patch, YARN-7451.004.patch, YARN-7451.005.patch, 
> YARN-7451.006.patch, YARN-7451.007.patch, 
> YARN-7451__Expose_custom_resource_types_on_RM_scheduler_API_as_flattened_map01_02.patch
>
>
> When running jobs that request resource types the RM Cluster Apps API should 
> include this in the "resourceRequests" object.
> Additionally, when calling the RM scheduler API it returns:
> {noformat}
>  "childQueues": {
> "queue": [
> {
> "allocatedContainers": 101,
> "amMaxResources": {
> "memory": 320390,
> "vCores": 192
> },
> "amUsedResources": {
> "memory": 1024,
> "vCores": 1
> },
> "clusterResources": {
> "memory": 640779,
> "vCores": 384
> },
> "demandResources": {
> "memory": 103424,
> "vCores": 101
> },
> "fairResources": {
> "memory": 640779,
> "vCores": 384
> },
> "maxApps": 2147483647,
> "maxResources": {
> "memory": 640779,
> "vCores": 384
> },
> "minResources": {
> "memory": 0,
> "vCores": 0
> },
> "numActiveApps": 1,
> "numPendingApps": 0,
> "preemptable": true,
> "queueName": "root.users.systest",
> "reservedContainers": 0,
> "reservedResources": {
> "memory": 0,
> "vCores": 0
> },
> "schedulingPolicy": "fair",
> "steadyFairResources": {
> "memory": 320390,
> "vCores": 192
> },
> "type": "fairSchedulerLeafQueueInfo",
> "usedResources": {
> "memory": 103424,
> "vCores": 101
> }
> }
> ]
> {noformat}
> However, the web UI shows resource types usage.






[jira] [Commented] (YARN-7748) TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed

2018-01-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325654#comment-16325654
 ] 

Haibo Chen commented on YARN-7748:
--

I suspect the failure has something to do with more than one 
AppAttemptRemovedSchedulerEvent being generated in the test. There are two log 
lines:
{code}
2018-01-14 09:45:16,228 INFO  [main] capacity.LeafQueue 
(LeafQueue.java:removeApplicationAttempt(973)) - Application removed - appId: 
application_1515923115995_0001 user: user queue: default 
#user-pending-applications: 0 #user-active-applications: 0 
#queue-pending-applications: 0 #queue-active-applications: 0
{code}
and 
{code}
2018-01-14 09:45:16,229 INFO  [AsyncDispatcher event handler] 
capacity.LeafQueue (LeafQueue.java:removeApplicationAttempt(973)) - Application 
removed - appId: application_1515923115995_0001 user: user queue: default 
#user-pending-applications: -1 #user-active-applications: 0 
#queue-pending-applications: 0 #queue-active-applications: 0
{code}

> TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted 
> failed
> 
>
> Key: YARN-7748
> URL: https://issues.apache.org/jira/browse/YARN-7748
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.0.0
>Reporter: Haibo Chen
>
> TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted
> Failing for the past 1 build (Since Failed#19244 )
> Took 0.4 sec.
> *Error Message*
> expected null, but 
> was:
> *Stacktrace*
> {code}
> java.lang.AssertionError: expected null, but 
> was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotNull(Assert.java:664)
>   at org.junit.Assert.assertNull(Assert.java:646)
>   at org.junit.Assert.assertNull(Assert.java:656)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted(TestContainerResizing.java:826)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> {code}






[jira] [Commented] (YARN-7748) TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed

2018-01-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325652#comment-16325652
 ] 

Haibo Chen commented on YARN-7748:
--

{code}
2018-01-14 09:45:15,849 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(453)) - Service: 
org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager entered state INITED
2018-01-14 09:45:15,865 INFO  [main] conf.Configuration 
(Configuration.java:getConfResourceAsInputStream(2656)) - resource-types.xml 
not found
2018-01-14 09:45:15,865 INFO  [main] resource.ResourceUtils 
(ResourceUtils.java:addResourcesFileToConf(395)) - Unable to find 
'resource-types.xml'.
2018-01-14 09:45:15,865 DEBUG [main] resource.ResourceUtils 
(ResourceUtils.java:addMandatoryResources(127)) - Adding resource type - name = 
memory-mb, units = Mi, type = COUNTABLE
2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils 
(ResourceUtils.java:addMandatoryResources(137)) - Adding resource type - name = 
vcores, units = , type = COUNTABLE
2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils 
(ResourceUtils.java:getAllocation(177)) - Mandatory Resource 
'yarn.resource-types.memory-mb.minimum-allocation' is not configured in 
resource-types config file. Setting allocation specified using 
'yarn.scheduler.minimum-allocation-mb'
2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils 
(ResourceUtils.java:getAllocation(177)) - Mandatory Resource 
'yarn.resource-types.memory-mb.maximum-allocation' is not configured in 
resource-types config file. Setting allocation specified using 
'yarn.scheduler.maximum-allocation-mb'
2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils 
(ResourceUtils.java:getAllocation(177)) - Mandatory Resource 
'yarn.resource-types.vcores.minimum-allocation' is not configured in 
resource-types config file. Setting allocation specified using 
'yarn.scheduler.minimum-allocation-vcores'
2018-01-14 09:45:15,866 DEBUG [main] resource.ResourceUtils 
(ResourceUtils.java:getAllocation(177)) - Mandatory Resource 
'yarn.resource-types.vcores.maximum-allocation' is not configured in 
resource-types config file. Setting allocation specified using 
'yarn.scheduler.maximum-allocation-vcores'
2018-01-14 09:45:15,866 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(453)) - Service: ResourceManager entered state 
INITED
2018-01-14 09:45:15,867 INFO  [main] conf.Configuration 
(Configuration.java:getConfResourceAsInputStream(2659)) - found resource 
core-site.xml at 
file:/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml
2018-01-14 09:45:15,875 DEBUG [main] 
security.JniBasedUnixGroupsMappingWithFallback 
(JniBasedUnixGroupsMappingWithFallback.java:(45)) - Group mapping 
impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
2018-01-14 09:45:15,876 DEBUG [main] security.Groups (Groups.java:(150)) 
- Group mapping 
impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; 
cacheTimeout=30; warningDeltaMs=5000
2018-01-14 09:45:15,876 INFO  [main] security.Groups (Groups.java:refresh(401)) 
- clearing userToGroupsMap cache
2018-01-14 09:45:15,878 INFO  [main] conf.Configuration 
(Configuration.java:getConfResourceAsInputStream(2659)) - found resource 
yarn-site.xml at 
file:/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml
2018-01-14 09:45:15,887 INFO  [main] event.AsyncDispatcher 
(AsyncDispatcher.java:register(223)) - Registering class 
org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
2018-01-14 09:45:15,887 DEBUG [main] service.CompositeService 
(CompositeService.java:addService(74)) - Adding service Dispatcher
2018-01-14 09:45:15,887 DEBUG [main] service.CompositeService 
(CompositeService.java:addService(74)) - Adding service 
org.apache.hadoop.yarn.server.resourcemanager.AdminService
2018-01-14 09:45:15,888 DEBUG [main] service.AbstractService 
(AbstractService.java:enterState(453)) - Service: RMActiveServices entered 
state INITED
2018-01-14 09:45:15,888 INFO  [main] security.NMTokenSecretManagerInRM 
(NMTokenSecretManagerInRM.java:(75)) - NMTokenKeyRollingInterval: 
8640ms and NMTokenKeyActivationDelay: 90ms
2018-01-14 09:45:15,889 INFO  [main] security.RMContainerTokenSecretManager 
(RMContainerTokenSecretManager.java:(79)) - 
ContainerTokenKeyRollingInterval: 8640ms and 
ContainerTokenKeyActivationDelay: 90ms
2018-01-14 09:45:15,890 INFO  [main] security.AMRMTokenSecretManager 
(AMRMTokenSecretManager.java:(94)) - AMRMTokenKeyRollingInterval: 
8640ms and AMRMTokenKeyActivationDelay: 90 ms
2018-01-14 09:45:15,891 DEBUG [main] service.CompositeService 
(CompositeService.java:addService(74)) - Adding service 

[jira] [Created] (YARN-7748) TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted failed

2018-01-14 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-7748:


 Summary: 
TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted 
failed
 Key: YARN-7748
 URL: https://issues.apache.org/jira/browse/YARN-7748
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 3.0.0
Reporter: Haibo Chen


TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted
Failing for the past 1 build (since Failed#19244).
Took 0.4 sec.

*Error Message*
expected null, but 
was:

*Stacktrace*
{code}
java.lang.AssertionError: expected null, but 
was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotNull(Assert.java:664)
at org.junit.Assert.assertNull(Assert.java:646)
at org.junit.Assert.assertNull(Assert.java:656)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing.testIncreaseContainerUnreservedWhenApplicationCompleted(TestContainerResizing.java:826)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
{code}
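
For readers triaging this: by its name, the test expects the scheduler node to report no reserved container once the application has completed, and the failing call is the assertNull at TestContainerResizing.java:826, so the truncated "was:" in the message above indicates a non-null reservation object was still present. Below is a minimal, hypothetical JUnit 4 sketch of that assertion pattern only; FakeNode and its methods are illustrative stand-ins, not the real CapacityScheduler or test classes.

{code:java}
import static org.junit.Assert.assertNull;

import org.junit.Test;

// Hypothetical sketch: shows the shape of the failing assertion, not the real test.
public class ReservationCleanupSketchTest {

  /** Minimal stand-in for a scheduler node that can hold a single reservation. */
  static class FakeNode {
    private Object reservedContainer;

    void reserve(Object container) {
      reservedContainer = container;
    }

    void releaseReservationsFor(String appId) {
      // Expected behavior: application completion clears any outstanding reservation.
      reservedContainer = null;
    }

    Object getReservedContainer() {
      return reservedContainer;
    }
  }

  @Test
  public void reservationClearedWhenApplicationCompletes() {
    FakeNode node = new FakeNode();
    node.reserve(new Object());            // container-increase request reserves space
    node.releaseReservationsFor("app_1");  // application completes
    // The real test fails at the equivalent of this line: the node still
    // reported a reserved container, so assertNull threw the AssertionError above.
    assertNull(node.getReservedContainer());
  }
}
{code}

In the real failure the reservation evidently survives application completion, or the assertion races with its asynchronous release; either way, that is what the AssertionError above reports.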




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5473) Expose per-application over-allocation info in the Resource Manager

2018-01-14 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325649#comment-16325649
 ] 

Haibo Chen commented on YARN-5473:
--

Updated the patch to address the new findbugs issues as well as some of the 
checkstyle issues. The unit test failure and the license problem are unrelated.


> Expose per-application over-allocation info in the Resource Manager
> ---
>
> Key: YARN-5473
> URL: https://issues.apache.org/jira/browse/YARN-5473
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Haibo Chen
> Attachments: YARN-5473-YARN-1011.00.patch, 
> YARN-5473-YARN-1011.01.patch, YARN-5473-YARN-1011.prelim.patch
>
>
> When enabling over-allocation of nodes, the resources in the cluster change. 
> We need to surface this information for users to understand these changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5473) Expose per-application over-allocation info in the Resource Manager

2018-01-14 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5473:
-
Attachment: YARN-5473-YARN-1011.01.patch

> Expose per-application over-allocation info in the Resource Manager
> ---
>
> Key: YARN-5473
> URL: https://issues.apache.org/jira/browse/YARN-5473
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Haibo Chen
> Attachments: YARN-5473-YARN-1011.00.patch, 
> YARN-5473-YARN-1011.01.patch, YARN-5473-YARN-1011.prelim.patch
>
>
> When enabling over-allocation of nodes, the resources in the cluster change. 
> We need to surface this information for users to understand these changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5473) Expose per-application over-allocation info in the Resource Manager

2018-01-14 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16325540#comment-16325540
 ] 

genericqa commented on YARN-5473:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 23 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-1011 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
12s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
40s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
21s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
50s{color} | {color:green} YARN-1011 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
15s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in 
YARN-1011 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
41s{color} | {color:green} YARN-1011 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 11m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
14s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 23s{color} | {color:orange} root: The patch generated 10 new + 1770 
unchanged - 20 fixed = 1780 total (was 1790) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 32s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
26s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
42s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
16s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
19s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
47s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 41s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 23m 
50s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
29s{color} | {color:green} hadoop-yarn-server-router in the 

[jira] [Updated] (YARN-7563) Invalid event: FINISH_APPLICATION at NEW may leave some application-level resources uncleaned

2018-01-14 Thread lujie (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lujie updated YARN-7563:

Affects Version/s: 2.8.0

> Invalid event: FINISH_APPLICATION at NEW may leave some application-level 
> resources uncleaned
> -
>
> Key: YARN-7563
> URL: https://issues.apache.org/jira/browse/YARN-7563
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.0, 3.0.0-beta1
>Reporter: lujie
>Assignee: lujie
> Attachments: YARN-7563.png, YARN-7563.txt
>
>
> I sent a kill command to the application; the nodemanager log shows:
> {code:java}
> 2017-11-25 19:18:48,126 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
>  couldn't find container container_1511608703018_0001_01_01 while 
> processing FINISH_CONTAINERS event
> 2017-11-25 19:18:48,146 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
>  Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: 
> FINISH_APPLICATION at NEW
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:627)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:75)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1508)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1501)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:745)
> 2017-11-25 19:18:48,151 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
>  Application application_1511608703018_0001 transitioned from NEW to INITING
> {code}
>  
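
For context on the stack trace quoted above: ApplicationImpl's state machine only reacts to events for which a transition is registered in the current state, so a FINISH_APPLICATION that arrives while the application is still NEW is rejected with InvalidStateTransitionException and the cleanup normally tied to finishing never runs. A minimal, self-contained sketch of that transition-table pattern follows; the enums, table, and class names are illustrative only, not YARN's actual StateMachineFactory or ApplicationImpl code.

{code:java}
import java.util.EnumMap;
import java.util.Map;

// Illustrative sketch of a transition-table state machine; not YARN's StateMachineFactory.
public class AppStateSketch {

  enum State { NEW, INITING, FINISHED }
  enum Event { INIT_APPLICATION, FINISH_APPLICATION }

  private State current = State.NEW;

  // Transition table: (state, event) -> next state. A missing entry is an invalid transition.
  private final Map<State, Map<Event, State>> table = new EnumMap<State, Map<Event, State>>(State.class);

  AppStateSketch() {
    table.put(State.NEW, new EnumMap<Event, State>(Event.class));
    table.put(State.INITING, new EnumMap<Event, State>(Event.class));
    table.get(State.NEW).put(Event.INIT_APPLICATION, State.INITING);
    table.get(State.INITING).put(Event.FINISH_APPLICATION, State.FINISHED);
    // Deliberately no (NEW, FINISH_APPLICATION) entry -- the gap this JIRA describes.
  }

  void handle(Event event) {
    Map<Event, State> row = table.get(current);
    State next = (row == null) ? null : row.get(event);
    if (next == null) {
      // In YARN this surfaces as InvalidStateTransitionException; the event is
      // effectively dropped, so cleanup tied to the finishing path never runs.
      throw new IllegalStateException("Invalid event: " + event + " at " + current);
    }
    current = next;
  }

  public static void main(String[] args) {
    new AppStateSketch().handle(Event.FINISH_APPLICATION);
    // -> IllegalStateException: Invalid event: FINISH_APPLICATION at NEW
  }
}
{code}

The direction suggested by the report (and presumably by the attached patch) is to handle FINISH_APPLICATION while still in NEW, or otherwise ensure application-level resources are released when a kill races with initialization.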



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org