June 21, 2016 14:13 "Yannis Aribaud" <b...@d6bell.net> wrote:
> Hi everyone,
> 
> I recently hit this bug in the kernel using a vanilla 4.6.2 release.
> It seems that a division by 0 occurs somewhere in the load average
> calculation (see the stack trace at the end).
>
> [snipped]
> 
> I'm not an expert at all, but I suspect that is the origin of the issue.
> Shouldn't the function cfs_rq_load_avg use an atomic_long_read() to
> avoid this?

After digging a bit more, this can't be the problem, as this function obviously
cannot return a negative value.

I found that it may instead come from the update_cfs_rq_load_avg function, in
the following block:

        if (atomic_long_read(&cfs_rq->removed_load_avg)) {
                s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
                sa->load_avg = max_t(long, sa->load_avg - r, 0);
                sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
                removed_load = 1;
        }

The expression max_t(long, sa->load_avg - r, 0) may produce a wrong value: if
the result of the subtraction wraps around when treated as a long, max_t keeps
the wrapped value, which could then cause a division by zero in the
task_h_load function.
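
To make the concern above concrete, here is a minimal user-space sketch, not kernel code. The max_t macro below is a simplified stand-in for the kernel one, and the helpers clamp_sub_wide and clamp_sub_narrow, the fixed-width types, and the input values are all assumptions chosen for illustration. It only shows when a subtract-and-clamp of this shape can let a wrapped value through; whether that can actually happen in update_cfs_rq_load_avg depends on the real types and values.

```c
#include <stdint.h>

/* Simplified user-space stand-in for the kernel's max_t(): both operands
 * are cast to 'type' before being compared. Unlike the kernel macro, this
 * version evaluates its arguments more than once. */
#define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/* Hypothetical helper: the clamp type is as wide as the difference.
 * load_avg - r is computed as uint64_t; casting it back to int64_t
 * recovers the negative value on gcc/clang (the conversion is
 * implementation-defined in ISO C), so the clamp yields 0. */
int64_t clamp_sub_wide(uint64_t load_avg, int64_t r)
{
    return max_t(int64_t, load_avg - r, 0);
}

/* Hypothetical helper: the difference is 64-bit but the clamp type is
 * only 32-bit. The cast inside max_t() truncates the difference (again
 * relying on gcc/clang behaviour), so a very large r can wrap into a
 * small positive value that the clamp does not catch. */
int32_t clamp_sub_narrow(uint32_t load_avg, int64_t r)
{
    return max_t(int32_t, (int64_t)load_avg - r, 0);
}
```

With these helpers, clamp_sub_wide(100, 150) is clamped to 0 as intended, while clamp_sub_narrow(100, 4294967346) returns 50 rather than 0, because the 64-bit difference is truncated to 32 bits before the comparison. In neither case does the clamp itself return a negative number; the risk illustrated here is a wrapped value being kept.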

Best regards,
--
Yannis Aribaud
