[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4606:
-----------------------------
    Summary: CapacityScheduler: applications could get starved because 
#activeUsers considers pending apps  (was: Sometimes Fairness inconjuncttions 
with UserLimitPercent and UserLimitFactor in queue leads to situation where it 
appears that applications in queue are getting starved or stuck)

> CapacityScheduler: applications could get starved because #activeUsers 
> considers pending apps
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Wangda Tan
>
> Encountered while studying fairness behaviour with UserLimitPercent and 
> UserLimitFactor during the following test:
> Ran GridMix with queue settings: Capacity=10, MaxCap=80, UserLimit=25, 
> UserLimitFactor=32, FairOrderingPolicy only. Encountered an application 
> starvation situation where 33 applications were running (190 apps completed 
> out of 761 apps; the queue can hold 345 containers) with a total of only 45 
> containers running. The 12 extra containers all belonged to a single app 
> (which had around 18000 tasks); every other app had only its AM running, and 
> no other containers were given to any app. After that app finished, there 
> were 32 AMs that kept running without any task containers being launched.
> GridMix was run with the following settings:
> gridmix.client.pending.queue.depth=10, gridmix.job-submission.policy=REPLAY, 
> gridmix.client.submit.threads=5, gridmix.submit.multiplier=0.0001, 
> gridmix.job.type=SLEEPJOB, mapreduce.framework.name=yarn, 
> mapreduce.job.queuename=hive1, mapred.job.queue.name=hive1, 
> gridmix.sleep.max-map-time=5000, gridmix.sleep.max-reduce-time=5000, 
> gridmix.user.resolve.class=org.apache.hadoop.mapred.gridmix.RoundRobinUserResolver
> The users file for RoundRobinUserResolver contained 4 users.
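
To illustrate the mechanism named in the new summary, here is a rough sketch of how a per-user limit shrinks when the active-user count includes users who only have pending apps. This is a simplified model and not the actual CapacityScheduler code: the function name `user_limit`, its parameters, and the 100-container queue are assumptions made for illustration, and the model ignores rounding to the minimum allocation and usage above the queue's configured capacity.

```python
def user_limit(queue_capacity, active_users, min_user_limit_percent,
               user_limit_factor):
    """Simplified per-user limit, in containers (hypothetical model).

    The limit is the larger of an equal split among active users and the
    configured minimum percentage, capped by capacity * user_limit_factor.
    """
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    equal_split = queue_capacity / active_users
    guaranteed = queue_capacity * min_user_limit_percent / 100.0
    limit = max(equal_split, guaranteed)
    return min(limit, queue_capacity * user_limit_factor)


# Hypothetical queue of 100 containers, UserLimit=25, UserLimitFactor=32.
# Case 1: only the 2 users with runnable apps are counted as active.
runnable_only = user_limit(100, 2, 25, 32)

# Case 2: 2 more users whose apps are all still pending are counted too.
with_pending = user_limit(100, 4, 25, 32)

print(runnable_only, with_pending)
```

In this model the two users who can actually run containers are each capped at a quarter of the queue in the second case, so half of the queue would sit idle even though demand exists.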



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
