Hi Morten,

On 5 May 2016 at 13:13, Morten Rasmussen <morten.rasmus...@arm.com> wrote:
> On Wed, May 04, 2016 at 04:02:44AM +0800, Yuyang Du wrote:
>> In sched average update, a period is about 1ms, so a 32-bit unsigned
>> integer can approximately hold a maximum of 49 (=2^32/1000/3600/24)
>> days.
>>
>> For usual cases, 32bit is big enough and 64bit is needless. But if
>> a task sleeps longer than that, there can be two outcomes:
>>
>> Consider a task that sleeps some m milliseconds, then n = (u32)m.
>
> Maybe make it crystal clear that you mean: For any m > UINT_MAX?
>
>> 1. If n >= 32*64, then the task's sched avgs will surely be decayed
>> to 0. In this case, it really doesn't matter that 32bit is not
>> big enough to hold m. In other words, a task that sleeps 2 secs and one
>> that sleeps 50 days are the same from the sched average point of view.
>>
>> 2. If m < 32*64, the chance to be here is very very low, but if so,
>
> Should that be: n < 32*64 ?
>
> Talking about 32*64, I don't get why we don't use LOAD_AVG_MAX_N. I had
> a patch ready to post for that:
>
> From 5055e5f82c8d207880035c2ec4ecf1ac1e7f9e91 Mon Sep 17 00:00:00 2001
> From: Morten Rasmussen <morten.rasmus...@arm.com>
> Date: Mon, 11 Apr 2016 15:41:37 +0100
> Subject: [PATCH] sched/fair: Fix decay to zero period in decay_load()
>
> In __compute_runnable_contrib() we are happy with returning LOAD_AVG_MAX
> when the update period n >= LOAD_AVG_MAX_N (=345), so we should be happy
> with returning zero for n >= LOAD_AVG_MAX_N when decaying in
> decay_load() as well, instead of only returning zero for n >
> LOAD_AVG_PERIOD * 63 (=2016).
>
> As both conditions are unlikely() the impact is quite likely very small,
> but at least it makes the rounding errors for load accumulation and
> decay symmetrical.
>
> Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 56b7d4b..42515b6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2527,7 +2527,7 @@ static __always_inline u64 decay_load(u64 val, u64 n)
>
>  	if (!n)
>  		return val;
> -	else if (unlikely(n > LOAD_AVG_PERIOD * 63))
> +	else if (unlikely(n > LOAD_AVG_MAX_N))
I had the same question about this 63 and, IIUC, the 63 comes from 64 bits - 1.
The load can be so high that all 64 bits are used. We shift right once every
LOAD_AVG_PERIOD periods, so after shifting across all 64 bits we can be sure
the value will be 0.

>  		return 0;
>
>  	/* after bounds checking we can collapse to 32-bit */
> --
> 1.9.1
>
>
>> the task's sched avgs MAY NOT be decayed to 0, depending on how
>> big its sums are, and the chance to reach 0 is still good as load_sum
>> is way less than ~0ULL and util_sum way less than ~0U.
>
> I don't get the last bit about load_sum < ~0ULL and util_sum < ~0U.
> Whether you get to zero depends on the sums (as you say) and the actual
> value of 'n'. It is true that you might get to zero even if n <
> LOAD_AVG_MAX_N if the sums are small.
>
>> Nevertheless, what really matters is what happens in the worst-case
>> scenario, which is when (u32)m = 0. In that case, it would be as if,
>> after so long a sleep, we treat the task as though it never slept, and it
>> has the same sched averages as before it slept. That is actually not bad,
>> or nothing to worry about; think of it as the same as how we treat
>> a newborn task.
>
> There is a subtle but important difference between not decaying a task
> correctly and adding a new task: the sleeping task is already accounted
> for in the cfs_rq.avg.{load,util}_sum. The sleeping task's contribution
> to cfs_rq.avg has been decayed correctly in the meantime, which means
> that by not guaranteeing a decay of the se.avg at wake-up you introduce
> a discrepancy between the task's own view of its contribution (se.avg)
> and the cfs_rq view (cfs_rq.avg). That might lead to trouble if the task
> is migrated at wake-up, as we remove se.avg amount of contribution from
> the previous cpu's cfs_rq.avg. In other words, you remove too much from
> the cfs_rq.avg.
>
> The discrepancy will decay and disappear after LOAD_AVG_MAX_N ms, which
> might be acceptable, but it is not a totally harmless change IMHO.