[ https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805697#comment-13805697 ]
Omkar Vinit Joshi commented on YARN-674:
----------------------------------------

Thanks [~zjshen] for reviewing my patch.

bq. I think the exception needs to be thrown, which is missing in your patch. The exception will notice the client that the app submission fails; otherwise, the client will think the submission succeeds?

Yes, I removed the exception on purpose. Here are my thoughts:
* Once the client submits the application, it should check the app status, and it will learn from there that the app has failed:
** either when parsing credentials fails,
** or when the initial token renewal fails.

bq. Since DelegationTokenRenewer#addApplication becomes asynchronous, what will the impact of that the application is already accepted and starts its life cycle, while DelegationTokenRenewer is so slow to DelegationTokenRenewerAppSubmitEvent. Will the application fail somewhere else due to the fresh token unavailable?

The logic here is modified a bit. The app is submitted to the scheduler only if token renewal succeeds, not before that. That is the case today too. The only problem today is that we hold the client request while doing this. With the change this becomes asynchronous.

bq. I noticed testConncurrentAddApplication has been removed. Does the change affect the current app submission?

No. There is now no problem with concurrent app submission, as we are funneling it through the event handler anyway. This test is no longer required, so I removed it completely.

* Fixing findbugs warnings.
* Fixing the failed test case.

> Slow or failing DelegationToken renewals on submission itself make RM unavailable
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-674
>                 URL: https://issues.apache.org/jira/browse/YARN-674
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Omkar Vinit Joshi
>         Attachments: YARN-674.1.patch
>
>
> This was caused by YARN-280. A slow or a down NameNode will make it look like RM is unavailable as it may run out of RPC handlers due to blocked client submissions.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
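
For readers following the thread, the asynchronous flow described in the comment above can be sketched roughly as below. This is only an illustration and not the code in YARN-674.1.patch: the class `AsyncSubmissionSketch`, the `AppState` enum, and the `tokenRenewal` predicate are all hypothetical names, and a single-threaded executor stands in for the RM event handler. It shows the two points made in the reply: `submit` returns without throwing even if renewal later fails, and the app only reaches the accepted state after renewal succeeds, so the client has to poll the status.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

public class AsyncSubmissionSketch {
    /** Hypothetical, simplified app states (not the real RMAppState values). */
    public enum AppState { NEW, ACCEPTED, FAILED }

    private final Map<String, AppState> apps = new ConcurrentHashMap<>();
    // A single worker thread stands in for the event handler that all
    // submissions are funneled through, so renewals never run concurrently.
    private final ExecutorService renewer = Executors.newSingleThreadExecutor();
    // Hypothetical stand-in for credential parsing plus initial token renewal.
    private final Predicate<String> tokenRenewal;

    public AsyncSubmissionSketch(Predicate<String> tokenRenewal) {
        this.tokenRenewal = tokenRenewal;
    }

    /** Returns immediately; a slow or failing renewal never blocks the caller. */
    public void submit(String appId) {
        apps.put(appId, AppState.NEW);
        renewer.execute(() -> {
            boolean renewed;
            try {
                renewed = tokenRenewal.test(appId);
            } catch (RuntimeException e) {
                renewed = false; // failure is recorded in the app state, not thrown
            }
            // Only a successful renewal lets the app proceed to the scheduler.
            apps.put(appId, renewed ? AppState.ACCEPTED : AppState.FAILED);
        });
    }

    /** The client learns about failures by checking the app status. */
    public AppState status(String appId) {
        return apps.get(appId);
    }

    /** Drains queued renewals; used here only to make the demo deterministic. */
    public void shutdownAndWait() {
        renewer.shutdown();
        try {
            renewer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncSubmissionSketch rm =
            new AsyncSubmissionSketch(id -> !id.startsWith("bad"));
        rm.submit("app_1");     // renewal succeeds
        rm.submit("bad_app_2"); // renewal fails, but submit() does not throw
        rm.shutdownAndWait();
        System.out.println(rm.status("app_1"));
        System.out.println(rm.status("bad_app_2"));
    }
}
```

The trade-off is the one raised in the review: the submitting client gets no exception, so a failed renewal is visible only through a later status check.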