Re: [PATCH 06/27] nohz: Basic full dynticks interface
2012/12/31 Li Zhong:
> On Sat, 2012-12-29 at 17:42 +0100, Frederic Weisbecker wrote:
>> Start with a very simple interface to define full dynticks CPU:
>> use a boot time option defined cpumask through the "full_nohz="
>> kernel parameter.
>>
>> Make sure you keep at least one CPU outside this range to handle
>> the timekeeping.
>>
>> Also full_nohz= must match rcu_nocb= value.
>>
>> Suggested-by: Paul E. McKenney
>> Signed-off-by: Frederic Weisbecker
>> Cc: Alessio Igor Bogani
>> Cc: Andrew Morton
>> Cc: Chris Metcalf
>> Cc: Christoph Lameter
>> Cc: Geoff Levand
>> Cc: Gilad Ben Yossef
>> Cc: Hakan Akkan
>> Cc: Ingo Molnar
>> Cc: Paul E. McKenney
>> Cc: Paul Gortmaker
>> Cc: Peter Zijlstra
>> Cc: Steven Rostedt
>> Cc: Thomas Gleixner
>> ---
>>  include/linux/tick.h     |    7 +++++++
>>  kernel/time/Kconfig      |    9 +++++++++
>>  kernel/time/tick-sched.c |   23 +++++++++++++++++++++++
>>  3 files changed, 39 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/tick.h b/include/linux/tick.h
>> index 553272e..2d4f6f0 100644
>> --- a/include/linux/tick.h
>> +++ b/include/linux/tick.h
>> @@ -157,6 +157,13 @@ static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
>>  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
>>  # endif /* !NO_HZ */
>>
>> +#ifdef CONFIG_NO_HZ_FULL
>> +int tick_nohz_full_cpu(int cpu);
>> +#else
>> +static inline int tick_nohz_full_cpu(int cpu) { return 0; }
>> +#endif
>> +
>> +
>>  # ifdef CONFIG_CPU_IDLE_GOV_MENU
>>  extern void menu_hrtimer_cancel(void);
>>  # else
>> diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
>> index 8601f0d..dc6381d 100644
>> --- a/kernel/time/Kconfig
>> +++ b/kernel/time/Kconfig
>> @@ -70,6 +70,15 @@ config NO_HZ
>>  	  only trigger on an as-needed basis both when the system is
>>  	  busy and when the system is idle.
>>
>> +config NO_HZ_FULL
>> +	bool "Full tickless system"
>> +	depends on NO_HZ && RCU_USER_QS && VIRT_CPU_ACCOUNTING_GEN && RCU_NOCB_CPU && SMP
>
> Does that mean for archs like PPC64, which HAVE_VIRT_CPU_ACCOUNTING, to
> get NO_HZ_FULL supported, we need to use VIRT_CPU_ACCOUNTING_GEN instead
> of VIRT_CPU_ACCOUNTING_NATIVE? (I think the two, *_NATIVE and *_GEN,
> shouldn't be both enabled at the same time?)

Indeed! This sounds silly in the first place, but _GEN does context
tracking that _NATIVE doesn't perform. This context tracking must also
be well ordered and serialized against the cputime snapshots. This is
important when we remotely fix up the time from the read side.

I.e., if we read the cputime of a task that has been running tickless
for some time, we need to know where it runs (user or kernel), then pick
either tsk->utime or tsk->stime accordingly and add to it the delta of
time it has been running tickless. This fixup is performed in
task_cputime() using a seqlock for ordering/serializing. The write side
uses seqlocks too, from the vtime accounting APIs. But this is not
handled by _NATIVE.

>
> When I tried it on a ppc64 machine, it seems that after I select
> VIRT_CPU_ACCOUNTING, VIRT_CPU_ACCOUNTING_NATIVE is automatically
> selected. And I have no way to enable VIRT_CPU_ACCOUNTING_GEN, or disable
> VIRT_CPU_ACCOUNTING_NATIVE. It seems that's because these two don't have
> a configuration name (input prompt).

Yeah, I need to fix that. The user should be able to choose between
VIRT_CPU_ACCOUNTING_GEN and VIRT_CPU_ACCOUNTING_NATIVE. I'll fix that
for the next release.

>
>> +	select CONTEXT_TRACKING_FORCE
>> +	help
>> +	  Try to be tickless everywhere, not just in idle. (You need
>> +	  to fill up the full_nohz_mask boot parameter).
>
> Maybe it is better to use the name of the boot parameter full_nohz here
> than the name of the mask variable used in the code?
>

Right!

Thanks for your reviews!
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 06/27] nohz: Basic full dynticks interface
On Sat, 2012-12-29 at 17:42 +0100, Frederic Weisbecker wrote:
> Start with a very simple interface to define full dynticks CPU:
> use a boot time option defined cpumask through the "full_nohz="
> kernel parameter.
>
> Make sure you keep at least one CPU outside this range to handle
> the timekeeping.
>
> Also full_nohz= must match rcu_nocb= value.
>
> Suggested-by: Paul E. McKenney
> Signed-off-by: Frederic Weisbecker
> Cc: Alessio Igor Bogani
> Cc: Andrew Morton
> Cc: Chris Metcalf
> Cc: Christoph Lameter
> Cc: Geoff Levand
> Cc: Gilad Ben Yossef
> Cc: Hakan Akkan
> Cc: Ingo Molnar
> Cc: Paul E. McKenney
> Cc: Paul Gortmaker
> Cc: Peter Zijlstra
> Cc: Steven Rostedt
> Cc: Thomas Gleixner
> ---
>  include/linux/tick.h     |    7 +++++++
>  kernel/time/Kconfig      |    9 +++++++++
>  kernel/time/tick-sched.c |   23 +++++++++++++++++++++++
>  3 files changed, 39 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 553272e..2d4f6f0 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -157,6 +157,13 @@ static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
>  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
>  # endif /* !NO_HZ */
>
> +#ifdef CONFIG_NO_HZ_FULL
> +int tick_nohz_full_cpu(int cpu);
> +#else
> +static inline int tick_nohz_full_cpu(int cpu) { return 0; }
> +#endif
> +
> +
>  # ifdef CONFIG_CPU_IDLE_GOV_MENU
>  extern void menu_hrtimer_cancel(void);
>  # else
> diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
> index 8601f0d..dc6381d 100644
> --- a/kernel/time/Kconfig
> +++ b/kernel/time/Kconfig
> @@ -70,6 +70,15 @@ config NO_HZ
>  	  only trigger on an as-needed basis both when the system is
>  	  busy and when the system is idle.
>
> +config NO_HZ_FULL
> +	bool "Full tickless system"
> +	depends on NO_HZ && RCU_USER_QS && VIRT_CPU_ACCOUNTING_GEN && RCU_NOCB_CPU && SMP

Does that mean for archs like PPC64, which HAVE_VIRT_CPU_ACCOUNTING, to
get NO_HZ_FULL supported, we need to use VIRT_CPU_ACCOUNTING_GEN instead
of VIRT_CPU_ACCOUNTING_NATIVE? (I think the two, *_NATIVE and *_GEN,
shouldn't be both enabled at the same time?)

When I tried it on a ppc64 machine, it seems that after I select
VIRT_CPU_ACCOUNTING, VIRT_CPU_ACCOUNTING_NATIVE is automatically
selected. And I have no way to enable VIRT_CPU_ACCOUNTING_GEN, or disable
VIRT_CPU_ACCOUNTING_NATIVE. It seems that's because these two don't have
a configuration name (input prompt).

> +	select CONTEXT_TRACKING_FORCE
> +	help
> +	  Try to be tickless everywhere, not just in idle. (You need
> +	  to fill up the full_nohz_mask boot parameter).

Maybe it is better to use the name of the boot parameter full_nohz here
than the name of the mask variable used in the code?

Thanks, Zhong

> +
> +
>  config HIGH_RES_TIMERS
>  	bool "High Resolution Timer Support"
>  	depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index ad0e6fa..fac9ba4 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -142,6 +142,29 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
>  	profile_tick(CPU_PROFILING);
>  }
>
> +#ifdef CONFIG_NO_HZ_FULL
> +static cpumask_var_t full_nohz_mask;
> +bool have_full_nohz_mask;
> +
> +int tick_nohz_full_cpu(int cpu)
> +{
> +	if (!have_full_nohz_mask)
> +		return 0;
> +
> +	return cpumask_test_cpu(cpu, full_nohz_mask);
> +}
> +
> +/* Parse the boot-time nohz CPU list from the kernel parameters. */
> +static int __init tick_nohz_full_setup(char *str)
> +{
> +	alloc_bootmem_cpumask_var(&full_nohz_mask);
> +	have_full_nohz_mask = true;
> +	cpulist_parse(str, full_nohz_mask);
> +	return 1;
> +}
> +__setup("full_nohz=", tick_nohz_full_setup);
> +#endif
> +
>  /*
>   * NOHZ - aka dynamic tick functionality
>   */
[PATCH 06/27] nohz: Basic full dynticks interface
Start with a very simple interface to define full dynticks CPU:
use a boot time option defined cpumask through the "full_nohz="
kernel parameter.

Make sure you keep at least one CPU outside this range to handle
the timekeeping.

Also full_nohz= must match rcu_nocb= value.

Suggested-by: Paul E. McKenney
Signed-off-by: Frederic Weisbecker
Cc: Alessio Igor Bogani
Cc: Andrew Morton
Cc: Chris Metcalf
Cc: Christoph Lameter
Cc: Geoff Levand
Cc: Gilad Ben Yossef
Cc: Hakan Akkan
Cc: Ingo Molnar
Cc: Paul E. McKenney
Cc: Paul Gortmaker
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Thomas Gleixner
---
 include/linux/tick.h     |    7 +++++++
 kernel/time/Kconfig      |    9 +++++++++
 kernel/time/tick-sched.c |   23 +++++++++++++++++++++++
 3 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 553272e..2d4f6f0 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -157,6 +157,13 @@ static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
 static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 # endif /* !NO_HZ */
 
+#ifdef CONFIG_NO_HZ_FULL
+int tick_nohz_full_cpu(int cpu);
+#else
+static inline int tick_nohz_full_cpu(int cpu) { return 0; }
+#endif
+
+
 # ifdef CONFIG_CPU_IDLE_GOV_MENU
 extern void menu_hrtimer_cancel(void);
 # else
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 8601f0d..dc6381d 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -70,6 +70,15 @@ config NO_HZ
 	  only trigger on an as-needed basis both when the system is
 	  busy and when the system is idle.
 
+config NO_HZ_FULL
+	bool "Full tickless system"
+	depends on NO_HZ && RCU_USER_QS && VIRT_CPU_ACCOUNTING_GEN && RCU_NOCB_CPU && SMP
+	select CONTEXT_TRACKING_FORCE
+	help
+	  Try to be tickless everywhere, not just in idle. (You need
+	  to fill up the full_nohz_mask boot parameter).
+
+
 config HIGH_RES_TIMERS
 	bool "High Resolution Timer Support"
 	depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index ad0e6fa..fac9ba4 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -142,6 +142,29 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 	profile_tick(CPU_PROFILING);
 }
 
+#ifdef CONFIG_NO_HZ_FULL
+static cpumask_var_t full_nohz_mask;
+bool have_full_nohz_mask;
+
+int tick_nohz_full_cpu(int cpu)
+{
+	if (!have_full_nohz_mask)
+		return 0;
+
+	return cpumask_test_cpu(cpu, full_nohz_mask);
+}
+
+/* Parse the boot-time nohz CPU list from the kernel parameters. */
+static int __init tick_nohz_full_setup(char *str)
+{
+	alloc_bootmem_cpumask_var(&full_nohz_mask);
+	have_full_nohz_mask = true;
+	cpulist_parse(str, full_nohz_mask);
+	return 1;
+}
+__setup("full_nohz=", tick_nohz_full_setup);
+#endif
+
 /*
  * NOHZ - aka dynamic tick functionality
  */
-- 
1.7.5.4