On Mon, 1 Dec 2014 17:10:34 +0100
Frederic Weisbecker <[email protected]> wrote:

> Speaking about the degradation in s390:
> 
> s390 is really a special case. And it would be a shame if we prevent from a
> real core cleanup just for this special case especially as it's fairly possible
> to keep a specific treatment for s390 in order not to impact its performances
> and time precision. We could simply accumulate the cputime in per-cpu values:
> 
> struct s390_cputime {
>        cputime_t user, sys, softirq, hardirq, steal;
> };
> 
> DEFINE_PER_CPU(struct s390_cputime, s390_cputime);
> 
> Then on irq entry/exit, just add the accumulated time to the relevant buffer
> and account for real (through any account_...time() functions) only on tick
> and task switch. There the costly operations (unit conversion and call to
> account_...._time() functions) are deferred to a rarer yet periodic enough
> event. This is what s390 does already for user/system time and kernel
> boundaries.
> 
> This way we should even improve the situation compared to what we have
> upstream. It's going to be faster because calling the accounting functions
> can be costlier than simple per-cpu ops. And also we keep the cputime_t
> granularity. For archs like s390 which have a granularity higher than nsecs,
> we can have:
> 
>    u64 cputime_to_nsecs(cputime_t time, u64 *rem);
> 
> And to avoid remainder losses, we can do that from the tick:
> 
>     delta_cputime = this_cpu_read(s390_cputime.hardirq);
>     delta_nsec = cputime_to_nsecs(delta_cputime, &rem);
>     account_system_time(delta_nsec, HARDIRQ_OFFSET);
>     this_cpu_write(s390_cputime.hardirq, rem);
> 
> Although I doubt that remainders below one nsec lost each tick matter that
> much. But if it does, it's fairly possible to handle like above.
 
To make that work we would have to move some of the logic from
account_system_time to the architecture code. The decision whether a system
time delta is guest time, irq time, softirq time or simply system time is
currently made in kernel/sched/cputime.c.

As the conversion + the accounting is delayed to a regular tick we would have
to split the accounting code into decision functions that determine which
bucket a system time delta should go to, and introduce new functions to
account to the different buckets.

Instead of a single account_system_time we would have account_guest_time,
account_system_time, account_system_time_irq and account_system_time_softirq.

In principle not a bad idea: that would make the interrupt path for s390
faster, as we would not have to call account_system_time, only the decision
function, which could be an inline function.
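
A rough user-space sketch of how that split could look, purely as an
illustration and not kernel code. All names here (cpu_acct,
system_time_bucket, acct_add, acct_flush, cputime_to_nsecs_rem) are made up
for the example, and it assumes a cputime_t unit of 1/4096 microsecond, so
that 512 units are exactly 125 ns. The decision function only loosely mirrors
the checks done in account_system_time today.

```c
#include <stdint.h>

/* Hypothetical model; none of these names exist in the kernel. */
typedef uint64_t cputime_t;	/* assumed unit: 1/4096 microsecond */

enum bucket { B_SYSTEM, B_GUEST, B_HARDIRQ, B_SOFTIRQ, B_MAX };

struct cpu_acct {
	cputime_t pending[B_MAX];	/* accumulated, not yet accounted */
	uint64_t accounted_ns[B_MAX];	/* stand-in for the real counters */
};

/* Decision function: cheap enough to be inlined in the interrupt path. */
static inline enum bucket system_time_bucket(int in_hardirq, int in_softirq,
					     int is_guest)
{
	if (in_hardirq)
		return B_HARDIRQ;
	if (in_softirq)
		return B_SOFTIRQ;
	if (is_guest)
		return B_GUEST;
	return B_SYSTEM;
}

/* irq entry/exit: only add the delta to a bucket, no unit conversion. */
static inline void acct_add(struct cpu_acct *a, enum bucket b, cputime_t delta)
{
	a->pending[b] += delta;
}

/*
 * Convert whole 512-unit chunks (125 ns each under the assumed unit) and
 * hand back the unconverted rest in cputime units via *rem.
 */
static uint64_t cputime_to_nsecs_rem(cputime_t ct, cputime_t *rem)
{
	*rem = ct % 512;
	return (ct / 512) * 125;
}

/* Tick or task switch: convert and account, keep the remainder pending. */
static void acct_flush(struct cpu_acct *a)
{
	for (int b = 0; b < B_MAX; b++) {
		cputime_t rem;

		a->accounted_ns[b] += cputime_to_nsecs_rem(a->pending[b], &rem);
		a->pending[b] = rem;
	}
}
```

The interrupt path would then only do
acct_add(a, system_time_bucket(...), delta), while the conversion and the
per-bucket accounting run from acct_flush() on the tick. Converting in whole
chunks keeps the remainder exact at the cost of deferring up to 125 ns per
bucket to the next flush.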

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
