Cong Wang <xiyou.wangc...@gmail.com> writes: > On Tue, Jul 31, 2018 at 10:13 AM <bseg...@google.com> wrote: >> >> Xunlei Pang <xlp...@linux.alibaba.com> writes: >> >> > On 7/31/18 1:55 AM, Cong Wang wrote: >> >> On Sun, Jul 29, 2018 at 10:29 PM Xunlei Pang <xlp...@linux.alibaba.com> >> >> wrote: >> >>> >> >>> Hi Cong, >> >>> >> >>> On 7/28/18 8:24 AM, Cong Wang wrote: >> >>>> Each time we sync cfs_rq->runtime_expires with cfs_b->runtime_expires, >> >>>> we should sync its ->expires_seq too. However it is missing >> >>>> for distribute_cfs_runtime(), especially the slack timer call path. >> >>> >> >>> I don't think it's a problem, as expires_seq will get synced in >> >>> assign_cfs_rq_runtime(). >> >> >> >> Sure, but there is a small window during which they are not synced. >> >> Why do you want to wait until the next assign_cfs_rq_runtime() when >> >> you already know runtime_expires is synced? >> >> >> >> Also, expire_cfs_rq_runtime() is called before assign_cfs_rq_runtime() >> >> inside __account_cfs_rq_runtime(), which means the check of >> >> cfs_rq->expires_seq is not accurate for unthrottling case if the clock >> >> drift happens soon enough? >> >> >> > >> > expire_cfs_rq_runtime(): >> > if (cfs_rq->expires_seq == cfs_b->expires_seq) { >> > /* extend local deadline, drift is bounded above by 2 ticks */ >> > cfs_rq->runtime_expires += TICK_NSEC; >> > } else { >> > /* global deadline is ahead, expiration has passed */ >> > cfs_rq->runtime_remaining = 0; >> > } >> > >> > So if clock drift happens soon, then expires_seq decides the correct >> > thing we should do: if cfs_b->expires_seq advanced, then clear the stale >> > cfs_rq->runtime_remaining from the slack timer of the past period, then >> > assign_cfs_rq_runtime() will refresh them afterwards, otherwise it is a >> > real clock drift. I am still not getting where the race is? > > But expires_seq is supposed to be the same here, after > distribute_cfs_runtime(), therefore runtime_remaining is not supposed > to be cleared. > > Which part do I misunderstand? expires_seq should not be same here? > Or you are saying a wrongly clear of runtime_remaning is fine? > > >> >> Nothing /important/ goes wrong because distribute_cfs_runtime only fills >> runtime_remaining up to 1, not a real amount. > > No, runtime_remaining is updated right before expire_cfs_rq_runtime(): > > static void __account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec) > { > /* dock delta_exec before expiring quota (as it could span periods) */ > cfs_rq->runtime_remaining -= delta_exec; > expire_cfs_rq_runtime(cfs_rq); > > so almost certainly it can't be 1.
Yes, in practice what's actually going to happen is that the runtime_remaining will be put to 1 by distribute, the cfs_rq will be unthrottled, and then when it runs it will go negative immediately and hit the negative check in expires, so expires_seq being wrong will not actually matter. In addition, the worst thing that will happen if one of the account_cfs_rq_runtime(cfs_rq, 0) paths is hit first is that it will lose 1ns of quota, which also doesn't really matter.