[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419321#comment-15419321
 ] 

Varun Saxena commented on YARN-5476:


Committed to trunk, branch-2 and branch-2.8
Thanks [~djp] for your contribution and [~yeshavora] for filing the issue.
Thanks [~rohithsharma] and [~naganarasimha...@apache.org] for the reviews.

> Not existed application reported as ACCEPTED state by YarnClientImpl
> 
>
> Key: YARN-5476
> URL: https://issues.apache.org/jira/browse/YARN-5476
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Yesha Vora
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-5476-branch-2.patch, YARN-5476.patch
>
>
> Steps to reproduce:
> * Create a cluster with RM HA enabled.
> * Start a YARN application.
> * While the application is in the NEW state, trigger an RM failover.
> In this case, the application gets an "ApplicationNotFound" exception from YARN, then moves to the ACCEPTED state and gets stuck.
> At this point, running yarn application -status for the application reports it as being in the ACCEPTED state.
> This state is misleading.
> {code}
> hrt_qa@xxx:/root> yarn application -status application_1470379565464_0001
> 16/08/05 17:24:29 INFO impl.TimelineClientImpl: Timeline service address: https://xxx:8190/ws/v1/timeline/
> 16/08/05 17:24:30 INFO client.AHSProxy: Connecting to Application History server at xxx/xxx:10200
> 16/08/05 17:24:31 WARN retry.RetryInvocationHandler: Exception while invoking ApplicationClientProtocolPBClientImpl.getApplicationReport over rm1. Not retrying because try once and fail.
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1470379565464_0001' doesn't exist in RM.
>   at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:331)
>   at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
>   at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
>   at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
>   at com.sun.proxy.$Proxy18.getApplicationReport(Unknown Source)
>   at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:436)
>   at org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:481)
>   at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:160)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>   
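
To make the reported symptom concrete, here is a rough sketch of the client-side distinction the issue is about: a status query should either return the state the RM actually holds or say that the application is unknown, and should not report ACCEPTED for an application the RM has no record of. Everything in it is a hypothetical stand-in (FakeResourceManager, AppNotFoundException, queryState); it does not use the Hadoop API and is not the change made in YARN-5476.patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for YARN's ApplicationNotFoundException; not the Hadoop class. */
class AppNotFoundException extends Exception {
    AppNotFoundException(String appId) {
        super("Application with id '" + appId + "' doesn't exist in RM.");
    }
}

/** Hypothetical, minimal model of an RM that only knows applications registered with it. */
class FakeResourceManager {
    private final Map<String, String> states = new HashMap<>();

    void register(String appId, String state) {
        states.put(appId, state);
    }

    String getApplicationState(String appId) throws AppNotFoundException {
        String state = states.get(appId);
        if (state == null) {
            // Mirrors the message in the quoted log: the RM has no record of the app.
            throw new AppNotFoundException(appId);
        }
        return state;
    }
}

class StatusCheckSketch {
    /** Returns the state the RM reports, or empty when the RM does not know the app. */
    static Optional<String> queryState(FakeResourceManager rm, String appId) {
        try {
            return Optional.of(rm.getApplicationState(appId));
        } catch (AppNotFoundException e) {
            // Surface "not found" instead of inventing a state such as ACCEPTED.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Models the newly active RM after failover: the app was never persisted.
        FakeResourceManager newActiveRm = new FakeResourceManager();
        String appId = "application_1470379565464_0001";
        System.out.println(queryState(newActiveRm, appId).orElse("NOT FOUND"));
    }
}
```

Returning an empty Optional keeps "unknown to the RM" separate from every real application state, which is the property the reporter found missing.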

[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419272#comment-15419272
 ] 

Hadoop QA commented on YARN-5476:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 18m 48s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 9m 26s | branch-2 passed |
| +1 | compile | 0m 30s | branch-2 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 33s | branch-2 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 0m 25s | branch-2 passed |
| +1 | mvnsite | 0m 40s | branch-2 passed |
| +1 | mvneclipse | 0m 21s | branch-2 passed |
| +1 | findbugs | 1m 13s | branch-2 passed |
| +1 | javadoc | 0m 21s | branch-2 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 24s | branch-2 passed with JDK v1.7.0_101 |
| +1 | mvninstall | 0m 32s | the patch passed |
| +1 | compile | 0m 25s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 25s | the patch passed |
| +1 | compile | 0m 29s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 0m 29s | the patch passed |
| +1 | checkstyle | 0m 20s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 133 unchanged - 8 fixed = 133 total (was 141) |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | findbugs | 1m 16s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 21s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 37m 0s | hadoop-yarn-server-resourcemanager in the patch passed with JDK v1.8.0_101. |
| +1 | unit | 38m 28s | hadoop-yarn-server-resourcemanager in the patch passed with JDK v1.7.0_101. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 113m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:b59b8b7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12823499/YARN-5476-branch-2.patch |
| JIRA Issue | YARN-5476 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 4deea647854e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed

[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419098#comment-15419098
 ] 

Junping Du commented on YARN-5476:
--

Sure. I just put up a patch for branch-2. Thanks!


[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419008#comment-15419008
 ] 

Varun Saxena commented on YARN-5476:


[~djp], can you provide a patch for branch-2? The current patch doesn't apply cleanly.


[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418989#comment-15418989
 ] 

Hudson commented on YARN-5476:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10269 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10269/])
YARN-5476. Non existent application reported as ACCEPTED by (varunsaxena: rev 23c6e3c4e41fecc61d062542cb61e68898235006)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java



[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418918#comment-15418918
 ] 

Naganarasimha G R commented on YARN-5476:
-

Thanks for the patch [~djp],
 +1 patch LGTM


[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-12 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418853#comment-15418853
 ] 

Rohith Sharma K S commented on YARN-5476:
-

+1 LGTM too, thanks Junping for the patch


[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417957#comment-15417957
 ] 

Varun Saxena commented on YARN-5476:


Thanks Junping for the patch. Changes look good to me.
Test failure is not related (tracked by YARN-5491).

Will commit it later today unless there are further comments.


[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417942#comment-15417942
 ] 

Hadoop QA commented on YARN-5476:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 24s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 0 new + 141 unchanged - 8 fixed = 141 total (was 149) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 37m 22s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 3s {color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12823335/YARN-5476.patch |
| JIRA Issue | YARN-5476 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 562fb68b6ce6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 874577a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/12747/artifact/patchprocess/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/12747/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-YARN-Build/12747/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/12747/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-serv

[jira] [Commented] (YARN-5476) Not existed application reported as ACCEPTED state by YarnClientImpl

2016-08-09 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414043#comment-15414043
 ] 

Junping Du commented on YARN-5476:
--

After discussing with Yesha, we found the root cause:
1. The yarn client loops in submit application until it gets ACCEPTED status
from getApplicationReport(). If getApplicationReport() throws an
ApplicationNotFound exception, it goes ahead and resubmits the application.
2. The call to getApplicationReport() first checks the RM. If the RM returns
ApplicationNotFound, the RM has no info about this application. There are
basically two possibilities: a. the app has finished and the RM no longer
tracks it; b. the app info had not been persisted to RMStateStore before the
RM failed over/restarted. This issue is case b.
3. Although the app info had not been persisted to RMStateStore yet, the app
event had already been sent to ATS, so ATS recorded this app and its initial
state, ACCEPTED. getApplicationReport() therefore returns ACCEPTED, and the
yarn client quits the loop in submit application even though the RM has
already forgotten the app.

As a quick solution, we should delay the RM's notification to ATS until at
least the NEW_SAVING state, so that the RM state store has already persisted
the application.
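
To make the ordering problem concrete, here is a minimal toy model of the three
steps above. It is only a sketch: ToyResourceManager, the maps, and every method
name are hypothetical stand-ins, not Hadoop classes, and the fallback from RM to
ATS is modelled as a plain map lookup rather than the real client code path.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only: every class and method here is a hypothetical stand-in,
// not a Hadoop API.
class AtsOrderingSketch {

    /** Simplified RM: in-memory apps, a persistent state store, a link to ATS. */
    static class ToyResourceManager {
        final Map<String, String> liveApps = new HashMap<>();
        final Map<String, String> stateStore = new HashMap<>();
        final Map<String, String> timeline;  // stands in for ATS
        final boolean notifyAfterSave;       // false = ordering described in the comment

        ToyResourceManager(Map<String, String> timeline, boolean notifyAfterSave) {
            this.timeline = timeline;
            this.notifyAfterSave = notifyAfterSave;
        }

        /** App enters NEW; with the old ordering ATS is told right away. */
        void submit(String appId) {
            liveApps.put(appId, "NEW");
            if (!notifyAfterSave) {
                timeline.put(appId, "ACCEPTED");
            }
        }

        /** App reaches NEW_SAVING: it is now in the state store. */
        void persist(String appId) {
            stateStore.put(appId, "NEW_SAVING");
            liveApps.put(appId, "NEW_SAVING");
            if (notifyAfterSave) {
                timeline.put(appId, "ACCEPTED");
            }
        }

        /** Failover: only what was persisted survives. */
        void failover() {
            liveApps.clear();
            liveApps.putAll(stateStore);
        }
    }

    /** Client-side lookup: ask the RM first, fall back to ATS if the RM has nothing. */
    static Optional<String> getApplicationReport(ToyResourceManager rm, String appId) {
        String state = rm.liveApps.get(appId);
        if (state != null) {
            return Optional.of(state);
        }
        return Optional.ofNullable(rm.timeline.get(appId));
    }

    /** Submit an app, fail over while it is still NEW, then ask for its report. */
    static Optional<String> reportAfterFailoverInNew(boolean notifyAfterSave) {
        ToyResourceManager rm = new ToyResourceManager(new HashMap<>(), notifyAfterSave);
        rm.submit("app_1");
        rm.failover();  // happens before persist() was ever called
        return getApplicationReport(rm, "app_1");
    }

    public static void main(String[] args) {
        System.out.println("notify ATS on submit:     " + reportAfterFailoverInNew(false));
        System.out.println("notify ATS after persist: " + reportAfterFailoverInNew(true));
    }
}
```

In this model, notifying ATS on submit leaves a stale ACCEPTED record that the
client picks up after failover, while notifying only after the app is persisted
leaves nothing for the fallback to find, so the client would see the app as not
found and could resubmit.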

> Not existed application reported as ACCEPTED state by YarnClientImpl
> 
>
> Key: YARN-5476
> URL: https://issues.apache.org/jira/browse/YARN-5476
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Yesha Vora
>Assignee: Junping Du
>Priority: Critical
>
> Steps To reproduce: 
> * Create a cluster with RM HA enabled
> * Start a yarn application
> * When yarn application is in NEW state, do RM failover. 
> In this case, the application gets "ApplicationNotFound" exception from YARN. 
> and it goes to accepted state and gets stuck. 
> At this point, if yarn application -status  is run, it says that 
> application is in ACCEPTED state. 
> This state is misleading. 
> {code}
> hrt_qa@xxx:/root> yarn application -status application_1470379565464_0001
> 16/08/05 17:24:29 INFO impl.TimelineClientImpl: Timeline service address: 
> https://xxx:8190/ws/v1/timeline/
> 16/08/05 17:24:30 INFO client.AHSProxy: Connecting to Application History 
> server at xxx/xxx:10200
> 16/08/05 17:24:31 WARN retry.RetryInvocationHandler: Exception while invoking 
> ApplicationClientProtocolPBClientImpl.getApplicationReport over rm1. Not 
> retrying because try once and fail.
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1470379565464_0001' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:331)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache