[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267111#comment-17267111 ]
zhuqi commented on YARN-10532:
------------------------------

In the latest patch I double-checked the following requirement:

"An additional requirement we should keep in mind:

Scenario A:
{code:java}
- At time T0, policy signals scheduler to delete queue A (an auto created queue).
- Before the signal arrives to scheduler, an app submitted to scheduler (T1). T1 > T0
- When at T2 (T2 > T1), the signal arrived at scheduler, scheduler should avoid removing the queue A because now it is used.{code}
Scenario B:
{code:java}
- At time T0, policy signals scheduler to delete queue A (an auto created queue).
- At T1 (T1 > T0), scheduler got the signal and deleted the queue.
- At T2 (T2 > T1), an app submitted to scheduler.

Scheduler should immediately recreate the queue, in another word, deleting an dynamic queue should NEVER fail a submitted application.{code}
"

Neither failure can happen with the patch.

Scenario A is handled by a double check before deletion: the caller passes in the last submitted time it observed, and just before removal the queue's current value is read again and compared with it. All of this happens under the queue write lock.
{code:java}
// Double check that lastSubmittedTime has really expired,
// in case a new app has been submitted in the meantime.
if (queue instanceof LeafQueue && ((LeafQueue) queue).isDynamicQueue()) {
  LeafQueue underDeleted = (LeafQueue) queue;
  if (underDeleted.getLastSubmittedTimestamp() != lastSubmittedTime) {
    throw new SchedulerDynamicEditException("This should not happen, "
        + "trying to remove queue= " + childQueuePath
        + ", however the queue has new submitted apps.");
  }
} else {
  throw new SchedulerDynamicEditException(
      "This should not happen, can't remove queue= " + childQueuePath
      + " is not a leafQueue or not a dynamic queue.");
}

// Now we can do remove and update
this.childQueues.remove(queue);
this.scheduler.getCapacitySchedulerQueueManager()
    .removeQueue(queue.getQueuePath());
{code}
Application submission also updates this timestamp under the write lock:
{code:java}
@Override
public void submitApplication(ApplicationId applicationId, String userName,
    String queue) throws AccessControlException {
  // Careful! Locking order is important!
  validateSubmitApplication(applicationId, userName, queue);

  // Signal to queue submit time in dynamic queue
  if (this.isDynamicQueue()) {
    signalToSubmitToQueue();
  }

  // Inform the parent queue
  try {
    getParent().submitApplication(applicationId, userName, queue);
  } catch (AccessControlException ace) {
    LOG.info("Failed to submit application to parent-queue: "
        + getParent().getQueuePath(), ace);
    throw ace;
  }
}

// Mark the queue as recently used, so it won't be removed
// because of the idle timeout.
public void signalToSubmitToQueue() {
  writeLock.lock();
  try {
    this.lastSubmittedTimestamp = System.currentTimeMillis();
  } finally {
    writeLock.unlock();
  }
}
{code}
Scenario B is handled in addApplication and addApplicationOnRecovery:
{code:java}
// The queue may already have been deleted because it expired:
// - At time T0, policy signals scheduler to delete queue A (an auto created queue).
// - At T1 (T1 > T0), scheduler got the signal and deleted the queue.
// - At T2 (T2 > T1), an app submitted to scheduler.
//
// Scheduler should immediately recreate the queue; in other words,
// deleting a dynamic queue should NEVER fail a submitted application.
// In this case the queue may be null later,
// so take the queue write lock here.
try {
  ((AbstractCSQueue) queue).writeLock.lock();
}...{code}

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-10532
>                 URL: https://issues.apache.org/jira/browse/YARN-10532
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: YARN-10532.001.patch, YARN-10532.002.patch, YARN-10532.003.patch
>
>
> It's better if we can delete auto-created queues when they are not in use for
> a period of time (like 5 mins). It will be helpful when we have a large
> number of auto-created queues (e.g. from 500 users), but only a small subset
> of queues are actively used.
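
For readers who want to try the two scenarios from the comment in isolation, here is a minimal, self-contained sketch of the same idea: compare the observed timestamp again under a write lock before removing (Scenario A), and recreate a missing queue on submission (Scenario B). The classes SketchQueue and SketchQueueManager and their methods are hypothetical stand-ins, not the real LeafQueue or CapacitySchedulerQueueManager, and unlike the patch the sketch returns false instead of throwing SchedulerDynamicEditException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a dynamic leaf queue (not the real LeafQueue). */
class SketchQueue {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long lastSubmittedTimestamp;

  SketchQueue(long createdAt) {
    this.lastSubmittedTimestamp = createdAt;
  }

  /** Records a submission time under the queue's write lock. */
  void signalSubmit(long now) {
    lock.writeLock().lock();
    try {
      lastSubmittedTimestamp = now;
    } finally {
      lock.writeLock().unlock();
    }
  }

  long getLastSubmittedTimestamp() {
    lock.readLock().lock();
    try {
      return lastSubmittedTimestamp;
    } finally {
      lock.readLock().unlock();
    }
  }
}

/** Hypothetical stand-in for the parent queue / queue manager. */
class SketchQueueManager {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, SketchQueue> queues = new HashMap<>();

  /** Scenario B: a submission recreates the queue if it was deleted. */
  SketchQueue submit(String path, long now) {
    lock.writeLock().lock();
    try {
      SketchQueue q = queues.computeIfAbsent(path, p -> new SketchQueue(now));
      q.signalSubmit(now);
      return q;
    } finally {
      lock.writeLock().unlock();
    }
  }

  /**
   * Scenario A: remove only if the timestamp the policy observed is still
   * the current one; otherwise an app arrived in between and we keep the queue.
   */
  boolean removeIfIdle(String path, long observedTimestamp) {
    lock.writeLock().lock();
    try {
      SketchQueue q = queues.get(path);
      if (q == null || q.getLastSubmittedTimestamp() != observedTimestamp) {
        return false;
      }
      queues.remove(path);
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  boolean contains(String path) {
    lock.readLock().lock();
    try {
      return queues.containsKey(path);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

In this sketch both submit and removeIfIdle take the manager lock first and the queue lock second, so the lock order is consistent. Timestamps are passed in explicitly rather than read from System.currentTimeMillis() so the interleavings are easy to reproduce.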