[ 
https://issues.apache.org/jira/browse/YARN-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645288#comment-13645288
 ] 

Vinod Kumar Vavilapalli commented on YARN-599:
----------------------------------------------

Hm, it isn't straightforward to figure out that failures during 
RMAppManager.submitApplication() are properly put in the audit logs. But they 
are; I just verified.

The latest patch looks good to me. +1, checking it in..
                
> Refactoring submitApplication in ClientRMService and RMAppManager
> -----------------------------------------------------------------
>
>                 Key: YARN-599
>                 URL: https://issues.apache.org/jira/browse/YARN-599
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: YARN-599.1.patch, YARN-599.2.patch
>
>
> Currently, ClientRMService#submitApplication calls RMAppManager#handle, and 
> consequently calls RMAppManager#submitApplication directly, though the code 
> looks like it is scheduling an APP_SUBMIT event.
> In addition, the validation code before creating an RMApp instance is not 
> well organized. Ideally, the dynamic validation, which depends on the RM's 
> configuration, should be put in RMAppManager#submitApplication. 
> RMAppManager#submitApplication is called by 
> ClientRMService#submitApplication and RMAppManager#recover. Since the 
> configuration may be changed after the RM restarts, the validation needs to 
> be done again even in recovery mode. Therefore, resource request validation, 
> which is based on min/max resource limits, should be moved from 
> ClientRMService#submitApplication to RMAppManager#submitApplication. On the 
> other hand, the static validation, which is independent of the RM's 
> configuration, should be put in ClientRMService#submitApplication, because it 
> only needs to be done once, during the first submission.
> Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw. 
> RMAppManager#submitApplication is not synchronized. If two application 
> submissions with the same application ID enter the function, and one 
> progresses to the completion of RMApp instantiation while the other 
> progresses to the completion of putting the RMApp instance into rmContext, 
> the slower submission will cause an exception due to the duplicate 
> application ID. However, with the current code flow, the exception will cause 
> the RMApp instance already in rmContext (which belongs to the faster 
> submission) to be rejected.
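
The duplicate-ID race in the last paragraph of the description can be illustrated with a minimal Java sketch. It is not the YARN code and not necessarily what the attached patches do: SubmissionSketch, submit, and get are hypothetical names, and a plain String stands in for an RMApp instance. The assumption is that the registry is a ConcurrentMap, so an atomic putIfAbsent lets the slower submission detect the duplicate without touching the entry already registered by the faster one.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for the RM's application registry; not YARN code. */
class SubmissionSketch {
    // Maps an application ID to its (placeholder) application instance.
    private final ConcurrentMap<String, String> apps = new ConcurrentHashMap<>();

    /**
     * Registers app under appId.
     * Returns true if this call registered it, false if the ID was already
     * taken. A duplicate never disturbs the entry that is already present.
     */
    boolean submit(String appId, String app) {
        // putIfAbsent is atomic: of two racing callers, exactly one sees null.
        String existing = apps.putIfAbsent(appId, app);
        return existing == null;
    }

    /** Returns the registered instance for appId, or null if there is none. */
    String get(String appId) {
        return apps.get(appId);
    }

    public static void main(String[] args) {
        SubmissionSketch rm = new SubmissionSketch();
        boolean fast = rm.submit("application_1", "fast submission");
        boolean slow = rm.submit("application_1", "slow submission");
        // The duplicate is refused and the first registration is untouched.
        System.out.println(fast + " " + slow + " " + rm.get("application_1"));
    }
}
```

In this sketch the caller that receives false can report the duplicate to its own client only, so the failure handling for the slower submission never reaches the instance that belongs to the faster one.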

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
