[ https://issues.apache.org/jira/browse/YARN-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680725#comment-16680725 ]
Wilfred Spiegelenburg commented on YARN-8990:
---------------------------------------------

Thank you [~haibochen] for the quick review and check-in.
[~sunilg] should this be added to 3.2, or is 3.2.1 good enough?

> Fix fair scheduler race condition in app submit and queue cleanup
> -----------------------------------------------------------------
>
>                 Key: YARN-8990
>                 URL: https://issues.apache.org/jira/browse/YARN-8990
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.2.0
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Blocker
>             Fix For: 3.3.0
>
>         Attachments: YARN-8990.001.patch, YARN-8990.002.patch
>
>
> With the introduction of dynamic queue deletion in YARN-8191, a race
> condition was introduced that can cause a queue to be removed while an
> application submit is in progress.
> The issue occurs in {{FairScheduler.addApplication()}} when an application is
> submitted to a dynamic queue which is empty, or when the queue does not exist yet.
> If the {{AllocationFileLoaderService}} kicks off an update while the
> application submit is being processed, the queue cleanup is run first as
> part of that update. The application submit first creates the queue and gets a
> reference back to the queue.
> Other checks are then performed and, as the last action before getting ready to
> generate an AppAttempt, the queue is updated to show the submitted application
> ID.
> The time between the queue creation and the queue update that shows the submit
> is long enough for the queue to be removed. The application, however, is lost
> and will never get any resources assigned.
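
The window described in the issue can be illustrated with a small, self-contained sketch. This is not the FairScheduler code and not the attached patch: `Queue`, `QueueManager`, `getOrCreate`, `removeEmptyQueues` and `getOrCreateAndAssign` are hypothetical names made up for the illustration, and the interleaving is replayed sequentially so that the outcome is deterministic. The second method shows one possible way to close the window (recording the application on the queue under the same lock that the cleanup takes); the actual fix in YARN-8990 may differ.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class QueueRaceSketch {

    /** Hypothetical stand-in for a dynamic leaf queue. */
    static final class Queue {
        final String name;
        final Set<String> apps = new HashSet<>();

        Queue(String name) {
            this.name = name;
        }

        boolean isEmpty() {
            return apps.isEmpty();
        }
    }

    /** Hypothetical stand-in for the component that owns the queues. */
    static final class QueueManager {
        private final Map<String, Queue> queues = new ConcurrentHashMap<>();

        // Racy first step of a submit: the queue exists, but no app is recorded yet.
        Queue getOrCreate(String name) {
            return queues.computeIfAbsent(name, Queue::new);
        }

        // Cleanup as triggered by a configuration reload: drop queues without apps.
        synchronized void removeEmptyQueues() {
            queues.values().removeIf(Queue::isEmpty);
        }

        // Assumed remedy: create the queue and record the app in one step,
        // holding the same lock as the cleanup.
        synchronized Queue getOrCreateAndAssign(String name, String appId) {
            Queue queue = queues.computeIfAbsent(name, Queue::new);
            queue.apps.add(appId);
            return queue;
        }

        boolean exists(String name) {
            return queues.containsKey(name);
        }
    }

    /** Replays the problematic interleaving; returns true if the queue was lost. */
    static boolean racySubmitLosesQueue() {
        QueueManager manager = new QueueManager();
        Queue queue = manager.getOrCreate("root.dynamic"); // submit: queue created
        manager.removeEmptyQueues();                       // reload: cleanup hits the window
        queue.apps.add("app_1");                           // submit: app lands on a removed queue
        return !manager.exists("root.dynamic");
    }

    /** Same sequence with the atomic step; returns true if the queue survived. */
    static boolean atomicSubmitKeepsQueue() {
        QueueManager manager = new QueueManager();
        manager.getOrCreateAndAssign("root.dynamic", "app_1");
        manager.removeEmptyQueues();
        return manager.exists("root.dynamic");
    }

    public static void main(String[] args) {
        System.out.println("racy submit lost the queue: " + racySubmitLosesQueue());
        System.out.println("atomic submit kept the queue: " + atomicSubmitKeepsQueue());
    }
}
```

In the racy variant the application ends up attached to a queue object that the manager no longer tracks, which matches the "application is lost" symptom in the description.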