[ https://issues.apache.org/jira/browse/YARN-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Akira Ajisaka resolved YARN-8990.
---------------------------------
    Fix Version/s: 3.2.3
       Resolution: Fixed

Backported to branch-3.2.

> Fix fair scheduler race condition in app submit and queue cleanup
> -----------------------------------------------------------------
>
>                 Key: YARN-8990
>                 URL: https://issues.apache.org/jira/browse/YARN-8990
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.2.0
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Blocker
>              Labels: pull-request-available, release-blocker
>             Fix For: 3.2.3, 3.3.0, 3.2.0
>
>         Attachments: YARN-8990.001.patch, YARN-8990.002.patch
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> With the introduction of dynamic queue deletion in YARN-8191, a race
> condition was introduced that can cause a queue to be removed while an
> application submit is in progress.
> The issue occurs in {{FairScheduler.addApplication()}} when an application is
> submitted to a dynamic queue that is empty or does not exist yet.
> If the {{AllocationFileLoaderService}} kicks off an update while the
> application submit is being processed, the queue cleanup is run first as
> part of that update. The application submit first creates the queue and gets
> a reference back to the queue.
> Other checks are then performed and, as the last action before getting ready
> to generate an AppAttempt, the queue is updated to show the submitted
> application ID.
> The time between the queue creation and the queue update that shows the
> submit is long enough for the queue to be removed. The application, however,
> is lost and will never get any resources assigned.
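
The interleaving described in the issue can be replayed deterministically in a single thread. The following is a minimal Java sketch, not the FairScheduler code: the class QueueCleanupRaceSketch, the nested Queue and QueueManager types, and all of their methods are hypothetical stand-ins invented for illustration, and they assume only what the description states (the queue is created first, cleanup removes empty dynamic queues, and the application ID is recorded last).

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class QueueCleanupRaceSketch {

    /** Hypothetical stand-in for a dynamic leaf queue (not the real YARN class). */
    static final class Queue {
        final String name;
        final Set<String> apps = new HashSet<>();

        Queue(String name) {
            this.name = name;
        }

        boolean isEmpty() {
            return apps.isEmpty();
        }
    }

    /** Hypothetical stand-in for the scheduler's queue bookkeeping. */
    static final class QueueManager {
        private final Map<String, Queue> queues = new ConcurrentHashMap<>();

        /** Returns the existing queue or creates an empty one. */
        Queue getOrCreate(String name) {
            return queues.computeIfAbsent(name, Queue::new);
        }

        /** Models the cleanup pass: drops every queue that has no applications. */
        void removeEmptyDynamicQueues() {
            queues.values().removeIf(Queue::isEmpty);
        }

        boolean exists(String name) {
            return queues.containsKey(name);
        }
    }

    /**
     * Replays the problematic ordering in one thread so the outcome is repeatable.
     * Returns true if the application was recorded on a queue that the manager
     * no longer tracks, i.e. the application is "lost".
     */
    static boolean submitWithCleanupInBetween(QueueManager qm, String queueName, String appId) {
        // Step 1: the submit creates the queue and keeps a reference to it.
        Queue queue = qm.getOrCreate(queueName);

        // Step 2: a configuration reload runs the cleanup while the queue is still empty.
        qm.removeEmptyDynamicQueues();

        // Step 3: the submit finally records the application on its stale reference.
        queue.apps.add(appId);

        return !qm.exists(queueName);
    }

    public static void main(String[] args) {
        QueueManager qm = new QueueManager();
        boolean lost = submitWithCleanupInBetween(qm, "root.dynamic", "application_0001");
        System.out.println("application lost: " + lost); // prints "application lost: true"
    }
}
```

In this sketch the application ends up attached to a queue object that is no longer reachable from the manager, which mirrors the "lost" application in the report. How the attached patches close the window is not covered here.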