[ https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138388#comment-17138388 ]
Peter Bacsko commented on YARN-9930:
------------------------------------

So after talking about the problems over voice chat, here is our conclusion:

_"So AFAIU it is absolutely normal that some queue is above its limit if the configurations have been changed. Doesn't it need some special attention in your algorithm when you recursively update the parents to search for queues where new apps could be submitted?"_

No. I tested this case manually: at first, 4 apps were allowed to run, but no more. Then it went down to 3, then to 2. After that, it stayed at 2 running apps and everything else was accepted. Functionality was consistent during the test run.

_"I'd prefer your solution as it's clearer, but since we already have the existing logic, the question arises: why do we need a separate enforcer object? Couldn't it be implemented similarly? Or am I missing something here?"_

Yes, this approach is different from the max-applications calculation. In theory, having a consistent implementation across a module is often desirable, but this patch duplicates a battle-tested algorithm from {{MaxRunningAppsEnforcer}}, which was then adapted to CS. So this class can be trusted. Rewriting the current patch would take a lot of time. I'm being very practical here, but I don't think it's a huge violation of coding principles (apart from the duplication, but that was also necessary IMO).

_"The existing implementation for max apps (that considers both running and pending ones) calls {{OrderingPolicy#getNumSchedulableEntities()}} and compares it to the limit inside {{LeafQueue}}"_

This could be a bug! Apps that were marked as non-runnable are actually missing from {{schedulableEntities}} (precisely to prevent them from being scheduled), so they would not be counted against the max-apps limit. Looks like this needs a little change plus a unit test.

[~bteke]'s comments are also valid. I'll address these issues and upload patch v5 soon.
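To make the suspected miscount in the last point concrete, here is a minimal, self-contained Java sketch. {{ToyLeafQueue}} and all of its members are hypothetical stand-ins written for illustration; this is not the actual {{LeafQueue}} or {{OrderingPolicy}} code, and the exact comparison used by the real check is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a leaf queue. Hypothetical; not the real CapacityScheduler LeafQueue. */
final class ToyLeafQueue {
    private final int maxApplications;   // assumed limit on running + pending apps
    private final int maxParallelApps;   // assumed limit on concurrently running apps
    private final List<String> schedulable = new ArrayList<>();  // runnable apps only
    private final List<String> nonRunnable = new ArrayList<>();  // held back by the running limit

    ToyLeafQueue(int maxApplications, int maxParallelApps) {
        this.maxApplications = maxApplications;
        this.maxParallelApps = maxParallelApps;
    }

    /** Adds an app; it becomes non-runnable once the running limit is reached. */
    void add(String appId) {
        if (schedulable.size() < maxParallelApps) {
            schedulable.add(appId);
        } else {
            nonRunnable.add(appId);
        }
    }

    int schedulableCount() {
        return schedulable.size();
    }

    int totalCount() {
        return schedulable.size() + nonRunnable.size();
    }

    /** Suspected behaviour: only schedulable apps are compared to the limit. */
    boolean canAcceptCountingSchedulableOnly() {
        return schedulableCount() < maxApplications;
    }

    /** Possible fix: non-runnable apps count against the limit as well. */
    boolean canAcceptCountingAll() {
        return totalCount() < maxApplications;
    }
}

final class MaxAppsCountDemo {
    public static void main(String[] args) {
        // At most 3 apps in the queue, at most 2 of them running.
        ToyLeafQueue queue = new ToyLeafQueue(3, 2);
        queue.add("app-1");
        queue.add("app-2");
        queue.add("app-3"); // running limit reached, so this one is non-runnable

        System.out.println("schedulable-only check accepts a 4th app: "
            + queue.canAcceptCountingSchedulableOnly());
        System.out.println("check counting all apps accepts a 4th app: "
            + queue.canAcceptCountingAll());
    }
}
```

With 3 apps already in a queue limited to 3, the schedulable-only check would still admit a fourth app because the non-runnable one is invisible to it. A unit test for the real fix could follow the same shape: fill the queue past the running limit, then assert that the submission check sees the non-runnable apps.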
> Support max running app logic for CapacityScheduler
> ---------------------------------------------------
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler, capacityscheduler
> Affects Versions: 3.1.0, 3.1.1
> Reporter: zhoukang
> Assignee: Peter Bacsko
> Priority: Major
> Attachments: YARN-9930-001.patch, YARN-9930-002.patch,
> YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-POC01.patch,
> YARN-9930-POC02.patch, YARN-9930-POC03.patch, YARN-9930-POC04.patch,
> YARN-9930-POC05.patch, screenshot-1.png
>
>
> FairScheduler has a limit on the maximum number of running applications;
> applications beyond that limit are left pending.
> CapacityScheduler has no such max running apps feature. It only has a
> max apps limit, and jobs beyond it are rejected directly on the client.
> In this jira I want to implement these semantics for CapacityScheduler.