Re: [PATCH 06/27] nohz: Basic full dynticks interface

2013-01-04 Thread Frederic Weisbecker
2012/12/31 Li Zhong:
> On Sat, 2012-12-29 at 17:42 +0100, Frederic Weisbecker wrote:
>> Start with a very simple interface to define full dynticks CPU:
>> use a boot time option defined cpumask through the "full_nohz="
>> kernel parameter.
>>
>> Make sure you keep at least one CPU outside this range to handle
>> the timekeeping.
>>
>> Also full_nohz= must match rcu_nocb= value.
>>
>> Suggested-by: Paul E. McKenney 
>> Signed-off-by: Frederic Weisbecker 
>> Cc: Alessio Igor Bogani 
>> Cc: Andrew Morton 
>> Cc: Chris Metcalf 
>> Cc: Christoph Lameter 
>> Cc: Geoff Levand 
>> Cc: Gilad Ben Yossef 
>> Cc: Hakan Akkan 
>> Cc: Ingo Molnar 
>> Cc: Paul E. McKenney 
>> Cc: Paul Gortmaker 
>> Cc: Peter Zijlstra 
>> Cc: Steven Rostedt 
>> Cc: Thomas Gleixner 
>> ---
>>  include/linux/tick.h |7 +++
>>  kernel/time/Kconfig  |9 +
>>  kernel/time/tick-sched.c |   23 +++
>>  3 files changed, 39 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/tick.h b/include/linux/tick.h
>> index 553272e..2d4f6f0 100644
>> --- a/include/linux/tick.h
>> +++ b/include/linux/tick.h
>> @@ -157,6 +157,13 @@ static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
>>  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
>>  # endif /* !NO_HZ */
>>
>> +#ifdef CONFIG_NO_HZ_FULL
>> +int tick_nohz_full_cpu(int cpu);
>> +#else
>> +static inline int tick_nohz_full_cpu(int cpu) { return 0; }
>> +#endif
>> +
>> +
>>  # ifdef CONFIG_CPU_IDLE_GOV_MENU
>>  extern void menu_hrtimer_cancel(void);
>>  # else
>> diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
>> index 8601f0d..dc6381d 100644
>> --- a/kernel/time/Kconfig
>> +++ b/kernel/time/Kconfig
>> @@ -70,6 +70,15 @@ config NO_HZ
>> only trigger on an as-needed basis both when the system is
>> busy and when the system is idle.
>>
>> +config NO_HZ_FULL
>> +   bool "Full tickless system"
>> +   depends on NO_HZ && RCU_USER_QS && VIRT_CPU_ACCOUNTING_GEN && RCU_NOCB_CPU && SMP
>
> Does that mean for archs like PPC64, which HAVE_VIRT_CPU_ACCOUNTING, to
> get NO_HZ_FULL supported, we need to use VIRT_CPU_ACCOUNTING_GEN instead
> of VIRT_CPU_ACCOUNTING_NATIVE? ( I think the two, *_NATIVE and *_GEN,
> shouldn't be both enabled at the same time? )

Indeed! This sounds silly at first, but _GEN does context tracking
that _NATIVE doesn't perform. And this context tracking must also be
well ordered and serialized against the cputime snapshots. This is
important when we remotely fix up the time from the read side, i.e.
if we read the cputime of a task that has been running tickless for
some time, we need to know where it runs (user or kernel), then pick
either tsk->utime or tsk->stime accordingly and add to it the delta
of time it has been running tickless.

This fixup is performed in task_cputime() using a seqlock for
ordering/serializing. The write side uses seqlocks too, from the vtime
accounting APIs. But this is not handled by _NATIVE.
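
To illustrate the read-side fixup described above, here is a minimal
userspace sketch, not the kernel code. It models a sequence counter with
C11 atomics: the writer accounts elapsed time on every user/kernel
transition, and the reader retries until it sees a stable snapshot, then
adds the not-yet-accounted delta to either utime or stime. All names
(struct task_model, task_model_init, account_transition, read_cputime)
are hypothetical and exist only for this example; the real
task_cputime() and vtime accounting code differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum ctx { CTX_USER, CTX_KERNEL };

struct task_model {
	atomic_uint seq;   /* even: stable, odd: writer in progress */
	uint64_t utime;    /* user time accounted so far */
	uint64_t stime;    /* kernel time accounted so far */
	uint64_t snap;     /* timestamp of the last accounting point */
	enum ctx where;    /* context the task has been in since 'snap' */
};

static void task_model_init(struct task_model *t, uint64_t now, enum ctx c)
{
	atomic_init(&t->seq, 0);
	t->utime = 0;
	t->stime = 0;
	t->snap = now;
	t->where = c;
}

/* Write side: called on each user<->kernel transition of the task. */
static void account_transition(struct task_model *t, uint64_t now, enum ctx next)
{
	assert(now >= t->snap);
	atomic_fetch_add(&t->seq, 1);          /* becomes odd: update begins */
	if (t->where == CTX_USER)
		t->utime += now - t->snap;
	else
		t->stime += now - t->snap;
	t->snap = now;
	t->where = next;
	atomic_fetch_add(&t->seq, 1);          /* becomes even: update done */
}

/*
 * Read side: take a consistent snapshot, then add the time the task has
 * spent in its current context since the last accounting point.
 * Note: with a truly concurrent writer the plain field reads below would
 * also need atomic accesses; they are kept plain to keep the sketch short.
 */
static void read_cputime(struct task_model *t, uint64_t now,
			 uint64_t *ut, uint64_t *st)
{
	unsigned int start;

	do {
		/* Wait until no writer is in the middle of an update. */
		while ((start = atomic_load(&t->seq)) & 1)
			;
		*ut = t->utime;
		*st = t->stime;
		if (t->where == CTX_USER)
			*ut += now - t->snap;
		else
			*st += now - t->snap;
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load(&t->seq) != start); /* retry if a writer ran */
}
```

The point of the retry loop is that the reader must see 'where', 'snap'
and the accumulated times from the same accounting point; picking the
context from one update and the snapshot from another would attribute
the tickless delta to the wrong counter.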

>
> When I tried it on a ppc64 machine, it seems that after I select
> VIRT_CPU_ACCOUNTING, VIRT_CPU_ACCOUNTING_NATIVE is automatically
> selected. And I have no way to enable VIRT_CPU_ACCOUNTING_GEN, or disable
> VIRT_CPU_ACCOUNTING_NATIVE. It seems that's because these two don't have
> a configuration name (input prompt).

Yeah I need to fix that. The user should be able to choose between
VIRT_CPU_ACCOUNTING_GEN and VIRT_CPU_ACCOUNTING_NATIVE.

I'll fix that for the next release.

>
>> +   select CONTEXT_TRACKING_FORCE
>> +   help
>> + Try to be tickless everywhere, not just in idle. (You need
>> +  to fill up the full_nohz_mask boot parameter).
>
> Maybe it is better to use the name of the boot parameter full_nohz here
> than the name of the mask variable used in the code?
>

Right!

Thanks for your reviews!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 06/27] nohz: Basic full dynticks interface

2012-12-30 Thread Li Zhong
On Sat, 2012-12-29 at 17:42 +0100, Frederic Weisbecker wrote:
> Start with a very simple interface to define full dynticks CPU:
> use a boot time option defined cpumask through the "full_nohz="
> kernel parameter.
> 
> Make sure you keep at least one CPU outside this range to handle
> the timekeeping.
> 
> Also full_nohz= must match rcu_nocb= value.
> 
> Suggested-by: Paul E. McKenney 
> Signed-off-by: Frederic Weisbecker 
> Cc: Alessio Igor Bogani 
> Cc: Andrew Morton 
> Cc: Chris Metcalf 
> Cc: Christoph Lameter 
> Cc: Geoff Levand 
> Cc: Gilad Ben Yossef 
> Cc: Hakan Akkan 
> Cc: Ingo Molnar 
> Cc: Paul E. McKenney 
> Cc: Paul Gortmaker 
> Cc: Peter Zijlstra 
> Cc: Steven Rostedt 
> Cc: Thomas Gleixner 
> ---
>  include/linux/tick.h |7 +++
>  kernel/time/Kconfig  |9 +
>  kernel/time/tick-sched.c |   23 +++
>  3 files changed, 39 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 553272e..2d4f6f0 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -157,6 +157,13 @@ static inline u64 get_cpu_idle_time_us(int cpu, u64 *unused) { return -1; }
>  static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
>  # endif /* !NO_HZ */
> 
> +#ifdef CONFIG_NO_HZ_FULL
> +int tick_nohz_full_cpu(int cpu);
> +#else
> +static inline int tick_nohz_full_cpu(int cpu) { return 0; }
> +#endif
> +
> +
>  # ifdef CONFIG_CPU_IDLE_GOV_MENU
>  extern void menu_hrtimer_cancel(void);
>  # else
> diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
> index 8601f0d..dc6381d 100644
> --- a/kernel/time/Kconfig
> +++ b/kernel/time/Kconfig
> @@ -70,6 +70,15 @@ config NO_HZ
> only trigger on an as-needed basis both when the system is
> busy and when the system is idle.
> 
> +config NO_HZ_FULL
> +   bool "Full tickless system"
> +   depends on NO_HZ && RCU_USER_QS && VIRT_CPU_ACCOUNTING_GEN && RCU_NOCB_CPU && SMP

Does that mean for archs like PPC64, which HAVE_VIRT_CPU_ACCOUNTING, to
get NO_HZ_FULL supported, we need to use VIRT_CPU_ACCOUNTING_GEN instead
of VIRT_CPU_ACCOUNTING_NATIVE? ( I think the two, *_NATIVE and *_GEN,
shouldn't be both enabled at the same time? )

When I tried it on a ppc64 machine, it seems that after I select
VIRT_CPU_ACCOUNTING, VIRT_CPU_ACCOUNTING_NATIVE is automatically
selected. And I have no way to enable VIRT_CPU_ACCOUNTING_GEN, or disable
VIRT_CPU_ACCOUNTING_NATIVE. It seems that's because these two don't have
a configuration name (input prompt).

> +   select CONTEXT_TRACKING_FORCE
> +   help
> + Try to be tickless everywhere, not just in idle. (You need
> +  to fill up the full_nohz_mask boot parameter).

Maybe it is better to use the name of the boot parameter full_nohz here
than the name of the mask variable used in the code? 

Thanks, Zhong

> +
> +
>  config HIGH_RES_TIMERS
>   bool "High Resolution Timer Support"
>   depends on !ARCH_USES_GETTIMEOFFSET && GENERIC_CLOCKEVENTS
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index ad0e6fa..fac9ba4 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -142,6 +142,29 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
>   profile_tick(CPU_PROFILING);
>  }
> 
> +#ifdef CONFIG_NO_HZ_FULL
> +static cpumask_var_t full_nohz_mask;
> +bool have_full_nohz_mask;
> +
> +int tick_nohz_full_cpu(int cpu)
> +{
> +	if (!have_full_nohz_mask)
> +		return 0;
> +
> +	return cpumask_test_cpu(cpu, full_nohz_mask);
> +}
> +
> +/* Parse the boot-time nohz CPU list from the kernel parameters. */
> +static int __init tick_nohz_full_setup(char *str)
> +{
> +	alloc_bootmem_cpumask_var(&full_nohz_mask);
> +	have_full_nohz_mask = true;
> +	cpulist_parse(str, full_nohz_mask);
> +	return 1;
> +}
> +}
> +__setup("full_nohz=", tick_nohz_full_setup);
> +#endif
> +
>  /*
>   * NOHZ - aka dynamic tick functionality
>   */
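
For readers unfamiliar with CPU-list boot parameters, the following
userspace sketch models what the quoted tick_nohz_full_setup() and
tick_nohz_full_cpu() do with a string such as "1-3,5" (as in
full_nohz=1-3,5). It is only an approximation under stated assumptions:
the mask is a plain 64-bit integer, MODEL_NR_CPUS and every model_*
function are hypothetical names, and the list syntax handled here is
limited to comma-separated numbers and ranges. The last helper checks
the rule from the commit message that at least one CPU must stay outside
the range for timekeeping; the patch as quoted contains no such check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 64 /* hypothetical limit: one bit per CPU */

/*
 * Parse a list such as "0,2-4" into a bitmask.
 * Returns 0 on success, -1 on malformed input or out-of-range CPUs.
 * '*mask' is only written on success.
 */
static int model_cpulist_parse(const char *str, uint64_t *mask)
{
	uint64_t out = 0;
	const char *p = str;

	assert(str != NULL && mask != NULL);

	while (*p) {
		char *end;
		unsigned long lo, hi;

		if (*p < '0' || *p > '9')
			return -1;
		lo = strtoul(p, &end, 10);
		hi = lo;
		if (*end == '-') {               /* range "lo-hi" */
			p = end + 1;
			if (*p < '0' || *p > '9')
				return -1;
			hi = strtoul(p, &end, 10);
		}
		if (hi < lo || hi >= MODEL_NR_CPUS)
			return -1;
		for (unsigned long c = lo; c <= hi; c++)
			out |= UINT64_C(1) << c;
		if (*end == ',') {
			end++;
			if (*end == '\0')        /* trailing comma */
				return -1;
		} else if (*end != '\0') {
			return -1;
		}
		p = end;
	}
	*mask = out;
	return 0;
}

/* Model of tick_nohz_full_cpu(): false when no mask was ever set up. */
static bool model_full_nohz_cpu(unsigned int cpu, bool have_mask, uint64_t mask)
{
	if (!have_mask || cpu >= MODEL_NR_CPUS)
		return false;
	return (mask >> cpu) & 1;
}

/* True if at least one of the first nr_cpus CPUs is outside the mask. */
static bool model_leaves_timekeeper(uint64_t mask, unsigned int nr_cpus)
{
	uint64_t all = nr_cpus >= 64 ? UINT64_MAX
				     : (UINT64_C(1) << nr_cpus) - 1;

	return (all & ~mask) != 0;
}
```

On a 4-CPU machine, "1-3" would pass the timekeeper check because CPU 0
stays outside the mask, while "0-3" would not.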



