[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138388#comment-17138388
 ] 

Peter Bacsko commented on YARN-9930:
------------------------------------

So after talking over the problems on voice chat, here is our conclusion:

_"So AFAIU it is absolutely normal that some queue is above its limit if the 
configurations have been changed. Doesn't it need some special attention in 
your algorithm when you recursively update the parents to search for queues 
where new apps could be submitted?"_

No. I tested this case manually: at first, 4 running apps were allowed to run, 
but no more. Then it went down to 3, then to 2. After that, it stayed at 2 
running apps and everything else was accepted. The behavior was consistent 
throughout the test run.
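
To make the scenario concrete, here is a minimal sketch of one reading of that test, assuming the limit was lowered to 2 while 4 apps were already running. The class {{MaxRunningSketch}} and all of its methods are hypothetical names used only for illustration; this is not the code in the patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration only; not the CapacityScheduler implementation. */
public class MaxRunningSketch {
    private int maxRunning;
    private int running;
    private final Deque<String> pending = new ArrayDeque<>();

    public MaxRunningSketch(int maxRunning) {
        this.maxRunning = maxRunning;
    }

    /** Simulates a configuration change; apps that already run are never killed. */
    public void setMaxRunning(int maxRunning) {
        this.maxRunning = maxRunning;
    }

    /** Every submission is accepted; returns true only if the app starts at once. */
    public boolean submit(String appId) {
        if (running < maxRunning) {
            running++;
            return true;
        }
        pending.addLast(appId);
        return false;
    }

    /** One running app finishes; pending apps start only once we are below the limit. */
    public void finishOne() {
        if (running == 0) {
            throw new IllegalStateException("no running app");
        }
        running--;
        while (running < maxRunning && !pending.isEmpty()) {
            pending.removeFirst();
            running++;
        }
    }

    public int getRunning() {
        return running;
    }

    public int getPending() {
        return pending.size();
    }
}
```

With 4 apps running and the limit lowered to 2, a new submission is accepted but stays pending, the running count drains 4, 3, 2 as apps finish, and only the next completion lets a pending app start, so the count stays at 2.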

 

_"I'd prefer your solution as it's clearer, but since we already have the 
existing logic, the question arises: why do we need a separate enforcer 
object? Couldn't it be implemented similarly? Or am I missing something here?"_

Yes, this approach is different from the max-applications calculation. In 
theory, having a consistent implementation across a module is often desirable, 
but this patch duplicates a battle-tested algorithm from 
{{MaxRunningAppsEnforcer}}, which was then adapted to CS. So this class can be 
trusted. Rewriting the current patch would take a lot of time. I'm being very 
practical here, but I don't think it's a huge violation of coding principles 
(apart from the duplication, but that was also necessary IMO).

 

_"The existing implementation for max apps (that considers both running and 
pending ones) calls {{OrderingPolicy#getNumSchedulableEntities()}} and 
compares it to the limit inside {{LeafQueue}}"_

This could be a bug! Apps that were marked as non-runnable are actually missing 
from {{schedulableEntities}} (precisely to prevent them from being scheduled). 
Looks like this needs a little change plus a unit test.
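
A rough sketch of why this could undercount: if the max-apps check only looks at the schedulable entities, non-runnable apps do not count towards the limit. The class {{MaxAppsCountSketch}} and its methods are hypothetical and only illustrate the suspected problem; they are not the actual {{LeafQueue}} code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the actual LeafQueue logic. */
public class MaxAppsCountSketch {
    // Stand-in for the ordering policy's schedulable entities.
    private final List<String> schedulable = new ArrayList<>();
    // Apps held back by the max-running limit; deliberately kept out of scheduling.
    private final List<String> nonRunnable = new ArrayList<>();

    public void addApp(String appId, boolean runnable) {
        if (runnable) {
            schedulable.add(appId);
        } else {
            nonRunnable.add(appId);
        }
    }

    /** Count as seen by a check that only asks for the schedulable entities. */
    public int countSchedulableOnly() {
        return schedulable.size();
    }

    /** Count that also includes the non-runnable apps. */
    public int countAll() {
        return schedulable.size() + nonRunnable.size();
    }

    /** A submission is accepted while the current count is below the limit. */
    public static boolean wouldAccept(int currentCount, int maxApplications) {
        return currentCount < maxApplications;
    }
}
```

For example, with 2 runnable and 3 non-runnable apps and a max-applications limit of 4, the schedulable-only count (2) would still accept a new app, while the full count (5) would reject it.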

[~bteke]'s comments are also valid.

I'll address these issues and upload patch v5 soon.

> Support max running app logic for CapacityScheduler
> ---------------------------------------------------
>
>                 Key: YARN-9930
>                 URL: https://issues.apache.org/jira/browse/YARN-9930
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 3.1.0, 3.1.1
>            Reporter: zhoukang
>            Assignee: Peter Bacsko
>            Priority: Major
>         Attachments: YARN-9930-001.patch, YARN-9930-002.patch, 
> YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-POC01.patch, 
> YARN-9930-POC02.patch, YARN-9930-POC03.patch, YARN-9930-POC04.patch, 
> YARN-9930-POC05.patch, screenshot-1.png
>
>
> FairScheduler has a limit on max running apps, which keeps excess 
> applications pending.
> CapacityScheduler has no feature like max running apps. It only has max 
> apps, and jobs over that limit are rejected directly on the client.
> In this jira I want to implement these semantics for CapacityScheduler.



