On Mon, Jul 16, 2018 at 07:08:41AM +0000, Xiexiangyou wrote:
> Virtual machine has cgroup hierarchies as follow:
>
>                 root
>                   |
>                 vm_tg
>                (cfs_rq)
>                /      \
>             (se)      (se)
>             tg_A      tg_B
>           (cfs_rq)  (cfs_rq)
>              /          \
>            (se)         (se)
>             a            b
>
> A and B are two vcpus of the VM.
>
> We set cfs quota on vm_tg, and the schedule latency of vcpu(a/b) may become
> very large, up to more than 2S.
>
> Shows Perf sched test result:
>
> Task             | Runtime ms | Switches | Average delay ms | Maximum delay ms  | Maximum delay at        |
> -----------------------------------------------------------------------------------------------------------------
> CPU 0/KVM:49609  | 260.261 ms | 50       | avg: 82.017 ms   | max: 2510.990 ms  | max at: 43335.555886 s
> .....
>
> We add some trace points, found the sequence as follows will lead to the
> issue:
>
> - 'a' is only task of tg_A, when 'a' go to sleep, tg_A is dequeued,
>   and tg_A->se->load.weight = MIN_SHARES.
> - 'b' continue running, then trigger throttle.
>   tg_A->cfs_rq->throttle_count=1
> - some task wakeup process 'a', When enqueue tg_A,
>   tg_A->se->load.weight can't be updated because tg_A->cfs_rq->throttle_count=1
> - after one cfs quota period, vm_tg is unthrottled
> - 'a' is running
> - after one tick, when update tg_A->se's vruntime,
>   tg_A->se->load.weight is still MIN_SHARES, lead tg_A->se's vruntime has grown
>   a large value.
> - That will cause 'a' to have a large schedule latancy.
>
> The fix patch as follows:
>
> Signed-off-by: Xiangyou Xie
> <[email protected]<mailto:[email protected]>>

The above Changelog violates just about every formatting rule ever
invented. Also you got your email format wrong.

The patch might be OK, but at this point I really can't do anything with
it anyway.

> ---
>  kernel/sched/fair.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2f0a0be..348ccd6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3016,9 +3016,6 @@ static void update_cfs_group(struct sched_entity *se)
>  	if (!gcfs_rq)
>  		return;
>
> -	if (throttled_hierarchy(gcfs_rq))
> -		return;
> -
>  #ifndef CONFIG_SMP
>  	runnable = shares = READ_ONCE(gcfs_rq->tg->shares);
>
> --
> 1.8.3.1
>
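
For illustration only, here is a rough model of the last steps in the
quoted sequence: why a group entity stuck at MIN_SHARES sees its vruntime
jump after a single tick. This is not code from the patch or from the
kernel. The real accounting lives in calc_delta_fair() and uses
fixed-point inverse weights and load scaling, which this sketch ignores.
The sketch_* names, the constant values (1024 for a nice-0 weight, 2 for
the minimum share) and the 4 ms tick are all assumptions made for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, unscaled values chosen for this sketch only. */
#define SKETCH_NICE_0_LOAD 1024ULL /* weight of a nice-0 entity */
#define SKETCH_MIN_SHARES  2ULL    /* floor for a group entity's weight */

/*
 * Simplified model of weighted vruntime accounting:
 * vruntime advances by delta_exec * NICE_0_LOAD / weight,
 * so a smaller weight makes vruntime grow faster for the same runtime.
 */
uint64_t sketch_vruntime_delta(uint64_t delta_exec_ns, uint64_t weight)
{
	assert(weight != 0); /* a zero weight is not modelled here */
	return delta_exec_ns * SKETCH_NICE_0_LOAD / weight;
}
```

Under these assumptions, 4 ms of runtime at weight 1024 advances vruntime
by 4 ms, while the same 4 ms at weight 2 advances it by 2048 ms. That is
in the same range as the 2.5 s maximum delay in the quoted perf output,
though the sketch does not prove that this is the exact mechanism.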

