[ 
https://issues.apache.org/jira/browse/YARN-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-8290:
-------------------------------

             Assignee: Eric Yang
    Affects Version/s: 3.1.1

[~leftnoteasy] Following your suggestion: the ACL information is set too late, 
so killing the AM before the ACL information has propagated can cause RM 
recovery to load a partial application record.  The suggested change is to move 
the ACL setup into ApplicationToSchedulerTransition, and the patch moves the 
block of code accordingly.  Let me know if this is the correct fix.  Thanks.
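
To illustrate the ordering problem described above, here is a minimal toy 
sketch.  It is not the patch and does not use any Hadoop classes: the class 
AclOrderingSketch, the Step enum, the record fields, and the assumption that 
the ACL was previously set in a later step (LAUNCH_ATTEMPT) are all 
hypothetical, chosen only to show why setting the ACL earlier avoids a partial 
record when the application is killed mid-way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these names are Hadoop classes or fields.
class AclOrderingSketch {

    // Hypothetical lifecycle steps, processed in declaration order.
    enum Step { ADD_TO_SCHEDULER, LAUNCH_ATTEMPT }

    /**
     * Runs the toy lifecycle and returns the record as it would look if the
     * application were killed right after the given step.
     *
     * @param aclInSchedulerStep true models the proposed ordering (ACL set
     *                           while adding the app to the scheduler);
     *                           false models an assumed later ACL setup.
     */
    static Map<String, String> runUntilKilled(boolean aclInSchedulerStep,
                                              Step killAfter) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "app_1");
        for (Step step : Step.values()) {
            if (step == Step.ADD_TO_SCHEDULER) {
                record.put("queue", "default");
                if (aclInSchedulerStep) {
                    record.put("viewAcl", "alice"); // early ACL setup
                }
            } else if (step == Step.LAUNCH_ATTEMPT) {
                if (!aclInSchedulerStep) {
                    record.put("viewAcl", "alice"); // late ACL setup
                }
                record.put("attempt", "1");
            }
            if (step == killAfter) {
                break; // the kill arrives here; later steps never run
            }
        }
        return record;
    }

    public static void main(String[] args) {
        // Late ordering: a kill after ADD_TO_SCHEDULER leaves no ACL behind.
        System.out.println(runUntilKilled(false, Step.ADD_TO_SCHEDULER));
        // Early ordering: the same kill still leaves the ACL in the record.
        System.out.println(runUntilKilled(true, Step.ADD_TO_SCHEDULER));
    }
}
```

In this toy model, a record captured after an early kill contains the ACL 
only when the ACL is set in the scheduler step, which is the effect the patch 
is aiming for.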

> Yarn application failed to recover with "Error Launching job : User is not 
> set in the application report" error after RM restart
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-8290
>                 URL: https://issues.apache.org/jira/browse/YARN-8290
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Yesha Vora
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN-8290.001.patch
>
>
> Scenario:
> 1) Start 5 streaming applications in the background
> 2) Kill the active RM to cause an RM failover
> After the RM failover, the application failed with the error below.
> {code}18/02/01 21:24:29 WARN client.RequestHedgingRMFailoverProxyProvider: Invocation returned exception on [rm2] : org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1517520038847_0003' doesn't exist in RM. Please check that the job submission was successful.
>       at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:338)
>       at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
>       at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
>       at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
> , so propagating back to caller.
> 18/02/01 21:24:29 INFO impl.YarnClientImpl: Submitted application application_1517520038847_0003
> 18/02/01 21:24:30 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/hrt_qa/.staging/job_1517520038847_0003
> 18/02/01 21:24:30 ERROR streaming.StreamJob: Error Launching job : User is not set in the application report
> Streaming Command Failed!{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
