[RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-09 Thread Nitesh Narayan Lal
Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
the number of housekeeping CPUs by reading an atomic variable
__num_housekeeping_cpus. This variable is set from housekeeping_setup().

This API is introduced for the purpose of drivers that were previously
relying only on num_online_cpus() to determine the number of MSIX vectors
to create. In an RT environment with many isolated CPUs but only a few
housekeeping CPUs, this led to a situation where an attempt to move all
of the vectors corresponding to isolated CPUs to the housekeeping CPUs
failed because of the per-CPU vector limit.

If there are no isolated CPUs specified then the API returns the number
of all online CPUs.

Signed-off-by: Nitesh Narayan Lal 
---
 include/linux/sched/isolation.h |  7 +++
 kernel/sched/isolation.c        | 23 +++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index cc9f393e2a70..94c25d956d8a 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
 extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
 extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
 extern void __init housekeeping_init(void);
+extern unsigned int num_housekeeping_cpus(void);
 
 #else
 
@@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags flags)
 static inline void housekeeping_affine(struct task_struct *t,
   enum hk_flags flags) { }
 static inline void housekeeping_init(void) { }
+
+static unsigned int num_housekeeping_cpus(void)
+{
+	return num_online_cpus();
+}
+
 #endif /* CONFIG_CPU_ISOLATION */
 
 static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 5a6ea03f9882..7024298390b7 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
 EXPORT_SYMBOL_GPL(housekeeping_overridden);
 static cpumask_var_t housekeeping_mask;
 static unsigned int housekeeping_flags;
+static atomic_t __num_housekeeping_cpus __read_mostly;
 
 bool housekeeping_enabled(enum hk_flags flags)
 {
@@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
 }
 EXPORT_SYMBOL_GPL(housekeeping_enabled);
 
+/*
+ * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
+ *
+ * This function returns the number of available housekeeping CPUs
+ * based on __num_housekeeping_cpus which is of type atomic_t
+ * and is initialized at the time of the housekeeping setup.
+ */
+unsigned int num_housekeeping_cpus(void)
+{
+	unsigned int cpus;
+
+	if (static_branch_unlikely(&housekeeping_overridden)) {
+		cpus = atomic_read(&__num_housekeeping_cpus);
+		/* We should always have at least one housekeeping CPU */
+		BUG_ON(!cpus);
+		return cpus;
+	}
+	return num_online_cpus();
+}
+EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
+
 int housekeeping_any_cpu(enum hk_flags flags)
 {
 	int cpu;
@@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum hk_flags flags)
 
 	housekeeping_flags |= flags;
 
+	atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));
 	free_bootmem_cpumask_var(non_housekeeping_mask);
 
 	return 1;
-- 
2.27.0
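
The commit message describes drivers sizing their MSI-X vector count from
num_online_cpus(). A minimal userspace sketch of the intended change, under
stated assumptions: pick_msix_vectors() and its parameters are hypothetical
names, hk_cpus stands in for the value num_housekeeping_cpus() would return,
and max_vecs for whatever limit the device advertises. It is not taken from
any driver in the series.

```c
/*
 * Userspace model, not kernel code.
 *
 * hk_cpus:  stand-in for the result of num_housekeeping_cpus()
 * max_vecs: stand-in for the device's own vector limit
 * Both parameter names are hypothetical.
 */
static unsigned int pick_msix_vectors(unsigned int hk_cpus,
				      unsigned int max_vecs)
{
	/* At most one vector per housekeeping CPU, bounded by the device. */
	return hk_cpus < max_vecs ? hk_cpus : max_vecs;
}
```

With, say, 4 housekeeping CPUs and a device offering 64 vectors, the sketch
requests 4 vectors rather than one per online CPU, so fewer vectors have to
be moved onto the housekeeping CPUs later.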



Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-17 Thread Jesse Brandeburg
Nitesh Narayan Lal wrote:

> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
> the number of housekeeping CPUs by reading an atomic variable
> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
> 
> This API is introduced for the purpose of drivers that were previously
> relying only on num_online_cpus() to determine the number of MSIX vectors
> to create. In an RT environment with many isolated CPUs but only a few
> housekeeping CPUs, this led to a situation where an attempt to move all
> of the vectors corresponding to isolated CPUs to the housekeeping CPUs
> failed because of the per-CPU vector limit.
> 
> If there are no isolated CPUs specified then the API returns the number
> of all online CPUs.
> 
> Signed-off-by: Nitesh Narayan Lal 
> ---
>  include/linux/sched/isolation.h |  7 +++
>  kernel/sched/isolation.c| 23 +++
>  2 files changed, 30 insertions(+)

I'm not a scheduler expert, but a couple comments follow.

> 
> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
> index cc9f393e2a70..94c25d956d8a 100644
> --- a/include/linux/sched/isolation.h
> +++ b/include/linux/sched/isolation.h
> @@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
>  extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
>  extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
>  extern void __init housekeeping_init(void);
> +extern unsigned int num_housekeeping_cpus(void);
>  
>  #else
>  
> @@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags 
> flags)
>  static inline void housekeeping_affine(struct task_struct *t,
>  enum hk_flags flags) { }
>  static inline void housekeeping_init(void) { }
> +
> +static unsigned int num_housekeeping_cpus(void)
> +{
> + return num_online_cpus();
> +}
> +
>  #endif /* CONFIG_CPU_ISOLATION */
>  
>  static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 5a6ea03f9882..7024298390b7 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>  EXPORT_SYMBOL_GPL(housekeeping_overridden);
>  static cpumask_var_t housekeeping_mask;
>  static unsigned int housekeeping_flags;
> +static atomic_t __num_housekeeping_cpus __read_mostly;
>  
>  bool housekeeping_enabled(enum hk_flags flags)
>  {
> @@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>  
> +/*

use correct kdoc style, and you get free documentation from your source
(you're so close!)

should be (note the first line and the function title line change to
remove parens):
/**
 * num_housekeeping_cpus - Read the number of housekeeping CPUs.
 *
 * This function returns the number of available housekeeping CPUs
 * based on __num_housekeeping_cpus which is of type atomic_t
 * and is initialized at the time of the housekeeping setup.
 */

> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
> + *
> + * This function returns the number of available housekeeping CPUs
> + * based on __num_housekeeping_cpus which is of type atomic_t
> + * and is initialized at the time of the housekeeping setup.
> + */
> +unsigned int num_housekeeping_cpus(void)
> +{
> + unsigned int cpus;
> +
> + if (static_branch_unlikely(&housekeeping_overridden)) {
> + cpus = atomic_read(&__num_housekeeping_cpus);
> + /* We should always have at least one housekeeping CPU */
> + BUG_ON(!cpus);

you need to crash the kernel because of this? maybe a WARN_ON? How did
the global even get set to the bad value? It's going to blame the poor
caller for this in the trace, but the caller likely had nothing to do
with setting the value incorrectly!

> + return cpus;
> + }
> + return num_online_cpus();
> +}
> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
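
The BUG_ON versus WARN_ON question above can be illustrated with a small
userspace model. This is only a sketch: warn_on() is a hypothetical stand-in
for the kernel's WARN_ON(), and falling back to the online CPU count when the
stored value is zero is an assumption made here, not something the patch or
the review specifies.

```c
#include <stdio.h>

/* Hypothetical stand-in for WARN_ON(): report the problem and carry on. */
static int warn_on(int cond, const char *msg)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", msg);
	return cond;
}

/*
 * Model of num_housekeeping_cpus() with a recoverable check.
 *
 * overridden: models the housekeeping_overridden static key
 * hk_count:   models the value stored in __num_housekeeping_cpus
 * online:     models num_online_cpus()
 */
static unsigned int num_hk_cpus_checked(int overridden, unsigned int hk_count,
					unsigned int online)
{
	if (overridden) {
		/* A zero count is reported, then the fallback below is used. */
		if (!warn_on(hk_count == 0, "no housekeeping CPUs recorded"))
			return hk_count;
	}
	return online;
}
```

The caller still gets a usable number when the stored count is bad, and the
warning records that the setup path, not the caller, left the value at zero.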



Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-17 Thread Nitesh Narayan Lal

On 9/17/20 2:18 PM, Jesse Brandeburg wrote:
> Nitesh Narayan Lal wrote:
>
>> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
>> the number of housekeeping CPUs by reading an atomic variable
>> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
>>
>> This API is introduced for the purpose of drivers that were previously
>> relying only on num_online_cpus() to determine the number of MSIX vectors
>> to create. In an RT environment with many isolated CPUs but only a few
>> housekeeping CPUs, this led to a situation where an attempt to move all
>> of the vectors corresponding to isolated CPUs to the housekeeping CPUs
>> failed because of the per-CPU vector limit.
>>
>> If there are no isolated CPUs specified then the API returns the number
>> of all online CPUs.
>>
>> Signed-off-by: Nitesh Narayan Lal 
>> ---
>>  include/linux/sched/isolation.h |  7 +++
>>  kernel/sched/isolation.c| 23 +++
>>  2 files changed, 30 insertions(+)
> I'm not a scheduler expert, but a couple comments follow.
>
>> diff --git a/include/linux/sched/isolation.h 
>> b/include/linux/sched/isolation.h
>> index cc9f393e2a70..94c25d956d8a 100644
>> --- a/include/linux/sched/isolation.h
>> +++ b/include/linux/sched/isolation.h
>> @@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
>>  extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
>>  extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
>>  extern void __init housekeeping_init(void);
>> +extern unsigned int num_housekeeping_cpus(void);
>>  
>>  #else
>>  
>> @@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags 
>> flags)
>>  static inline void housekeeping_affine(struct task_struct *t,
>> enum hk_flags flags) { }
>>  static inline void housekeeping_init(void) { }
>> +
>> +static unsigned int num_housekeeping_cpus(void)
>> +{
>> +return num_online_cpus();
>> +}
>> +
>>  #endif /* CONFIG_CPU_ISOLATION */
>>  
>>  static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>> index 5a6ea03f9882..7024298390b7 100644
>> --- a/kernel/sched/isolation.c
>> +++ b/kernel/sched/isolation.c
>> @@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>>  EXPORT_SYMBOL_GPL(housekeeping_overridden);
>>  static cpumask_var_t housekeeping_mask;
>>  static unsigned int housekeeping_flags;
>> +static atomic_t __num_housekeeping_cpus __read_mostly;
>>  
>>  bool housekeeping_enabled(enum hk_flags flags)
>>  {
>> @@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
>>  }
>>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>>  
>> +/*
> use correct kdoc style, and you get free documentation from your source
> (you're so close!)
>
> should be (note the first line and the function title line change to
> remove parens):
> /**
>  * num_housekeeping_cpus - Read the number of housekeeping CPUs.
>  *
>  * This function returns the number of available housekeeping CPUs
>  * based on __num_housekeeping_cpus which is of type atomic_t
>  * and is initialized at the time of the housekeeping setup.
>  */

My bad, I missed that.
Thanks for pointing it out.

>
>> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
>> + *
>> + * This function returns the number of available housekeeping CPUs
>> + * based on __num_housekeeping_cpus which is of type atomic_t
>> + * and is initialized at the time of the housekeeping setup.
>> + */
>> +unsigned int num_housekeeping_cpus(void)
>> +{
>> +unsigned int cpus;
>> +
>> +if (static_branch_unlikely(&housekeeping_overridden)) {
>> +cpus = atomic_read(&__num_housekeeping_cpus);
>> +/* We should always have at least one housekeeping CPU */
>> +BUG_ON(!cpus);
> you need to crash the kernel because of this? maybe a WARN_ON? How did
> the global even get set to the bad value? It's going to blame the poor
> caller for this in the trace, but the caller likely had nothing to do
> with setting the value incorrectly!

Yes, ideally this should not be triggered, but if somehow it does then we have
a bug and that needs to be fixed. That's probably the only reason why I chose
BUG_ON.
But, I am not entirely against the usage of WARN_ON either, because we get a
stack trace anyways.
I will see if anyone else has any other concerns on this patch and then I can
post the next version.

>
>> +return cpus;
>> +}
>> +return num_online_cpus();
>> +}
>> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
-- 
Thanks
Nitesh





Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-17 Thread Bjorn Helgaas
[+cc Ingo, Peter, Juri, Vincent (scheduler maintainers)]

s/hosekeeping/housekeeping/ (in subject)

On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
> the number of housekeeping CPUs by reading an atomic variable
> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
> 
> This API is introduced for the purpose of drivers that were previously
> relying only on num_online_cpus() to determine the number of MSIX vectors
> to create. In an RT environment with many isolated CPUs but only a few
> housekeeping CPUs, this led to a situation where an attempt to move all
> of the vectors corresponding to isolated CPUs to the housekeeping CPUs
> failed because of the per-CPU vector limit.

Totally kibitzing here, but AFAICT the concepts of "isolated CPU" and
"housekeeping CPU" are not currently exposed to drivers, and it's not
completely clear to me that they should be.

We have carefully constructed notions of possible, present, online,
active CPUs, and it seems like whatever we do here should be
somehow integrated with those.

> If there are no isolated CPUs specified then the API returns the number
> of all online CPUs.
> 
> Signed-off-by: Nitesh Narayan Lal 
> ---
>  include/linux/sched/isolation.h |  7 +++
>  kernel/sched/isolation.c| 23 +++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
> index cc9f393e2a70..94c25d956d8a 100644
> --- a/include/linux/sched/isolation.h
> +++ b/include/linux/sched/isolation.h
> @@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
>  extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
>  extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
>  extern void __init housekeeping_init(void);
> +extern unsigned int num_housekeeping_cpus(void);
>  
>  #else
>  
> @@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags 
> flags)
>  static inline void housekeeping_affine(struct task_struct *t,
>  enum hk_flags flags) { }
>  static inline void housekeeping_init(void) { }
> +
> +static unsigned int num_housekeeping_cpus(void)
> +{
> + return num_online_cpus();
> +}
> +
>  #endif /* CONFIG_CPU_ISOLATION */
>  
>  static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 5a6ea03f9882..7024298390b7 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>  EXPORT_SYMBOL_GPL(housekeeping_overridden);
>  static cpumask_var_t housekeeping_mask;
>  static unsigned int housekeeping_flags;
> +static atomic_t __num_housekeeping_cpus __read_mostly;
>  
>  bool housekeeping_enabled(enum hk_flags flags)
>  {
> @@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>  
> +/*
> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
> + *
> + * This function returns the number of available housekeeping CPUs
> + * based on __num_housekeeping_cpus which is of type atomic_t
> + * and is initialized at the time of the housekeeping setup.
> + */
> +unsigned int num_housekeeping_cpus(void)
> +{
> + unsigned int cpus;
> +
> + if (static_branch_unlikely(&housekeeping_overridden)) {
> + cpus = atomic_read(&__num_housekeeping_cpus);
> + /* We should always have at least one housekeeping CPU */
> + BUG_ON(!cpus);
> + return cpus;
> + }
> + return num_online_cpus();
> +}
> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
> +
>  int housekeeping_any_cpu(enum hk_flags flags)
>  {
>   int cpu;
> @@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum 
> hk_flags flags)
>  
>   housekeeping_flags |= flags;
>  
> + atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));
>   free_bootmem_cpumask_var(non_housekeeping_mask);
>  
>   return 1;
> -- 
> 2.27.0
> 


Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-17 Thread Jacob Keller



On 9/17/2020 1:11 PM, Bjorn Helgaas wrote:
> [+cc Ingo, Peter, Juri, Vincent (scheduler maintainers)]
> 
> s/hosekeeping/housekeeping/ (in subject)
> 
> On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
>> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
>> the number of housekeeping CPUs by reading an atomic variable
>> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
>>
>> This API is introduced for the purpose of drivers that were previously
>> relying only on num_online_cpus() to determine the number of MSIX vectors
>> to create. In an RT environment with many isolated CPUs but only a few
>> housekeeping CPUs, this led to a situation where an attempt to move all
>> of the vectors corresponding to isolated CPUs to the housekeeping CPUs
>> failed because of the per-CPU vector limit.
> 
> Totally kibitzing here, but AFAICT the concepts of "isolated CPU" and
> "housekeeping CPU" are not currently exposed to drivers, and it's not
> completely clear to me that they should be.
> 
> We have carefully constructed notions of possible, present, online,
> active CPUs, and it seems like whatever we do here should be
> somehow integrated with those.
> 

Perhaps "active" CPUs could be separated to not include the isolated CPUs?


Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-17 Thread Nitesh Narayan Lal

On 9/17/20 4:11 PM, Bjorn Helgaas wrote:
> [+cc Ingo, Peter, Juri, Vincent (scheduler maintainers)]
>
> s/hosekeeping/housekeeping/ (in subject)
>
> On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
>> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
>> the number of housekeeping CPUs by reading an atomic variable
>> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
>>
>> This API is introduced for the purpose of drivers that were previously
>> relying only on num_online_cpus() to determine the number of MSIX vectors
>> to create. In an RT environment with many isolated CPUs but only a few
>> housekeeping CPUs, this led to a situation where an attempt to move all
>> of the vectors corresponding to isolated CPUs to the housekeeping CPUs
>> failed because of the per-CPU vector limit.
> Totally kibitzing here, but AFAICT the concepts of "isolated CPU" and
> "housekeeping CPU" are not currently exposed to drivers, and it's not
> completely clear to me that they should be.
>
> We have carefully constructed notions of possible, present, online,
> active CPUs, and it seems like whatever we do here should be
> somehow integrated with those.

At one point I thought about tweaking num_online_cpus(), but then I quickly
moved away from that just because it is extensively used in the kernel and we
don't have to modify the behavior at all those places.

Thank you for including Peter and Vincent as well.
I would be happy to discuss/explore other options.

>
>> If there are no isolated CPUs specified then the API returns the number
>> of all online CPUs.
>>
>> Signed-off-by: Nitesh Narayan Lal 
>> ---
>>  include/linux/sched/isolation.h |  7 +++
>>  kernel/sched/isolation.c| 23 +++
>>  2 files changed, 30 insertions(+)
>>
>> diff --git a/include/linux/sched/isolation.h 
>> b/include/linux/sched/isolation.h
>> index cc9f393e2a70..94c25d956d8a 100644
>> --- a/include/linux/sched/isolation.h
>> +++ b/include/linux/sched/isolation.h
>> @@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
>>  extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
>>  extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
>>  extern void __init housekeeping_init(void);
>> +extern unsigned int num_housekeeping_cpus(void);
>>  
>>  #else
>>  
>> @@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags 
>> flags)
>>  static inline void housekeeping_affine(struct task_struct *t,
>> enum hk_flags flags) { }
>>  static inline void housekeeping_init(void) { }
>> +
>> +static unsigned int num_housekeeping_cpus(void)
>> +{
>> +return num_online_cpus();
>> +}
>> +
>>  #endif /* CONFIG_CPU_ISOLATION */
>>  
>>  static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>> index 5a6ea03f9882..7024298390b7 100644
>> --- a/kernel/sched/isolation.c
>> +++ b/kernel/sched/isolation.c
>> @@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>>  EXPORT_SYMBOL_GPL(housekeeping_overridden);
>>  static cpumask_var_t housekeeping_mask;
>>  static unsigned int housekeeping_flags;
>> +static atomic_t __num_housekeeping_cpus __read_mostly;
>>  
>>  bool housekeeping_enabled(enum hk_flags flags)
>>  {
>> @@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
>>  }
>>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>>  
>> +/*
>> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
>> + *
>> + * This function returns the number of available housekeeping CPUs
>> + * based on __num_housekeeping_cpus which is of type atomic_t
>> + * and is initialized at the time of the housekeeping setup.
>> + */
>> +unsigned int num_housekeeping_cpus(void)
>> +{
>> +unsigned int cpus;
>> +
>> +if (static_branch_unlikely(&housekeeping_overridden)) {
>> +cpus = atomic_read(&__num_housekeeping_cpus);
>> +/* We should always have at least one housekeeping CPU */
>> +BUG_ON(!cpus);
>> +return cpus;
>> +}
>> +return num_online_cpus();
>> +}
>> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
>> +
>>  int housekeeping_any_cpu(enum hk_flags flags)
>>  {
>>  int cpu;
>> @@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum 
>> hk_flags flags)
>>  
>>  housekeeping_flags |= flags;
>>  
>> +atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));
>>  free_bootmem_cpumask_var(non_housekeeping_mask);
>>  
>>  return 1;
>> -- 
>> 2.27.0
>>
-- 
Nitesh





Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-21 Thread Frederic Weisbecker
On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
> +/*
> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
> + *
> + * This function returns the number of available housekeeping CPUs
> + * based on __num_housekeeping_cpus which is of type atomic_t
> + * and is initialized at the time of the housekeeping setup.
> + */
> +unsigned int num_housekeeping_cpus(void)
> +{
> + unsigned int cpus;
> +
> + if (static_branch_unlikely(&housekeeping_overridden)) {
> + cpus = atomic_read(&__num_housekeeping_cpus);
> + /* We should always have at least one housekeeping CPU */
> + BUG_ON(!cpus);
> + return cpus;
> + }
> + return num_online_cpus();
> +}
> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
> +
>  int housekeeping_any_cpu(enum hk_flags flags)
>  {
>   int cpu;
> @@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum 
> hk_flags flags)
>  
>   housekeeping_flags |= flags;
>  
> + atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));

So the problem here is that it takes the whole cpumask weight but you're only
interested in the housekeepers who take the managed irq duties I guess
(HK_FLAG_MANAGED_IRQ ?).

>   free_bootmem_cpumask_var(non_housekeeping_mask);
>  
>   return 1;
> -- 
> 2.27.0
> 


Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-21 Thread Nitesh Narayan Lal

On 9/21/20 7:40 PM, Frederic Weisbecker wrote:
> On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
>> +/*
>> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
>> + *
>> + * This function returns the number of available housekeeping CPUs
>> + * based on __num_housekeeping_cpus which is of type atomic_t
>> + * and is initialized at the time of the housekeeping setup.
>> + */
>> +unsigned int num_housekeeping_cpus(void)
>> +{
>> +unsigned int cpus;
>> +
>> +if (static_branch_unlikely(&housekeeping_overridden)) {
>> +cpus = atomic_read(&__num_housekeeping_cpus);
>> +/* We should always have at least one housekeeping CPU */
>> +BUG_ON(!cpus);
>> +return cpus;
>> +}
>> +return num_online_cpus();
>> +}
>> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
>> +
>>  int housekeeping_any_cpu(enum hk_flags flags)
>>  {
>>  int cpu;
>> @@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum 
>> hk_flags flags)
>>  
>>  housekeeping_flags |= flags;
>>  
>> +atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));
> So the problem here is that it takes the whole cpumask weight but you're only
> interested in the housekeepers who take the managed irq duties I guess
> (HK_FLAG_MANAGED_IRQ ?).

IMHO we should also consider the cases where we only have nohz_full.
Otherwise, we may run into the same situation on those setups, do you agree?

>
>>  free_bootmem_cpumask_var(non_housekeeping_mask);
>>  
>>  return 1;
>> -- 
>> 2.27.0
>>
-- 
Thanks
Nitesh





Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-22 Thread Frederic Weisbecker
On Mon, Sep 21, 2020 at 11:16:51PM -0400, Nitesh Narayan Lal wrote:
> 
> On 9/21/20 7:40 PM, Frederic Weisbecker wrote:
> > On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
> >> +/*
> >> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
> >> + *
> >> + * This function returns the number of available housekeeping CPUs
> >> + * based on __num_housekeeping_cpus which is of type atomic_t
> >> + * and is initialized at the time of the housekeeping setup.
> >> + */
> >> +unsigned int num_housekeeping_cpus(void)
> >> +{
> >> +  unsigned int cpus;
> >> +
> >> +  if (static_branch_unlikely(&housekeeping_overridden)) {
> >> +  cpus = atomic_read(&__num_housekeeping_cpus);
> >> +  /* We should always have at least one housekeeping CPU */
> >> +  BUG_ON(!cpus);
> >> +  return cpus;
> >> +  }
> >> +  return num_online_cpus();
> >> +}
> >> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
> >> +
> >>  int housekeeping_any_cpu(enum hk_flags flags)
> >>  {
> >>int cpu;
> >> @@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum 
> >> hk_flags flags)
> >>  
> >>housekeeping_flags |= flags;
> >>  
> >> +  atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));
> > So the problem here is that it takes the whole cpumask weight but you're 
> > only
> > interested in the housekeepers who take the managed irq duties I guess
> > (HK_FLAG_MANAGED_IRQ ?).
> 
> IMHO we should also consider the cases where we only have nohz_full.
> Otherwise, we may run into the same situation on those setups, do you agree?

I guess it's up to the user to gather the tick and managed irq housekeeping
together?

Of course that makes the implementation more complicated. But if this is
called only on drivers initialization for now, this could be just a function
that does:

cpumask_weight(cpu_online_mask | housekeeping_cpumask(HK_FLAG_MANAGED_IRQ))

And then can we rename it to housekeeping_num_online()?

Thanks.

> >
> >>free_bootmem_cpumask_var(non_housekeeping_mask);
> >>  
> >>return 1;
> >> -- 
> >> 2.27.0
> >>
> -- 
> Thanks
> Nitesh
> 
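
The one-liner suggested above can be modelled in userspace with plain
bitmasks. This is a sketch under assumptions: a uint64_t stands in for a
struct cpumask (so at most 64 CPUs), mask_weight() stands in for
cpumask_weight(), and the `|` in the mail is read as shorthand for combining
the two masks by intersection, since only CPUs that are both online and
housekeeping for managed IRQs should be counted. The function names are
illustrative only.

```c
#include <stdint.h>

/* Count the set bits in a mask; stands in for cpumask_weight(). */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Model of the proposed helper.
 *
 * online:         models cpu_online_mask
 * hk_managed_irq: models housekeeping_cpumask(HK_FLAG_MANAGED_IRQ)
 */
static unsigned int hk_num_online_cpus_model(uint64_t online,
					     uint64_t hk_managed_irq)
{
	/* Only CPUs present in both masks are counted. */
	return mask_weight(online & hk_managed_irq);
}
```

For example, with CPUs 0-7 online and CPUs 0-1 handling managed IRQs, the
model counts 2; if CPU 1 then goes offline, it counts 1.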





Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-22 Thread Nitesh Narayan Lal

On 9/22/20 6:08 AM, Frederic Weisbecker wrote:
> On Mon, Sep 21, 2020 at 11:16:51PM -0400, Nitesh Narayan Lal wrote:
>> On 9/21/20 7:40 PM, Frederic Weisbecker wrote:
>>> On Wed, Sep 09, 2020 at 11:08:16AM -0400, Nitesh Narayan Lal wrote:
>>>> +/*
>>>> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
>>>> + *
>>>> + * This function returns the number of available housekeeping CPUs
>>>> + * based on __num_housekeeping_cpus which is of type atomic_t
>>>> + * and is initialized at the time of the housekeeping setup.
>>>> + */
>>>> +unsigned int num_housekeeping_cpus(void)
>>>> +{
>>>> +	unsigned int cpus;
>>>> +
>>>> +	if (static_branch_unlikely(&housekeeping_overridden)) {
>>>> +		cpus = atomic_read(&__num_housekeeping_cpus);
>>>> +		/* We should always have at least one housekeeping CPU */
>>>> +		BUG_ON(!cpus);
>>>> +		return cpus;
>>>> +	}
>>>> +	return num_online_cpus();
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
>>>> +
>>>>  int housekeeping_any_cpu(enum hk_flags flags)
>>>>  {
>>>>  	int cpu;
>>>> @@ -131,6 +153,7 @@ static int __init housekeeping_setup(char *str, enum hk_flags flags)
>>>>  
>>>>  	housekeeping_flags |= flags;
>>>>  
>>>> +	atomic_set(&__num_housekeeping_cpus, cpumask_weight(housekeeping_mask));
>>> So the problem here is that it takes the whole cpumask weight but you're 
>>> only
>>> interested in the housekeepers who take the managed irq duties I guess
>>> (HK_FLAG_MANAGED_IRQ ?).
>> IMHO we should also consider the cases where we only have nohz_full.
>> Otherwise, we may run into the same situation on those setups, do you agree?
> I guess it's up to the user to gather the tick and managed irq housekeeping
> together?

TBH I don't have a very strong case here at the moment.
But still, IMHO, this will force the user to have both managed irqs and
nohz_full in their environments to avoid these kinds of issues. Is that how
we would like to proceed?

The reason why I want to get this clarity is that going forward for any RT
related work I can form my thoughts based on this discussion.

>
> Of course that makes the implementation more complicated. But if this is
> called only on drivers initialization for now, this could be just a function
> that does:
>
> cpumask_weight(cpu_online_mask | housekeeping_cpumask(HK_FLAG_MANAGED_IRQ))

Ack, this makes more sense.

>
> And then can we rename it to housekeeping_num_online()?

It could be just me, but does something like hk_num_online_cpus() make more
sense here?

>
> Thanks.
>
>>>>  	free_bootmem_cpumask_var(non_housekeeping_mask);
>>>>  
>>>>  	return 1;
>>>> -- 
>>>> 2.27.0

>> -- 
>> Thanks
>> Nitesh
>>
>
>
-- 
Thanks
Nitesh





Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-22 Thread Frederic Weisbecker
On Tue, Sep 22, 2020 at 09:50:55AM -0400, Nitesh Narayan Lal wrote:
> On 9/22/20 6:08 AM, Frederic Weisbecker wrote:
> TBH I don't have a very strong case here at the moment.
> But still, IMHO, this will force the user to have both managed irqs and
> nohz_full in their environments to avoid these kinds of issues. Is that how
> we would like to proceed?

Yep that sounds good to me. I never know how much we want to split each and any
of the isolation features but I'd rather stay cautious to separate HK_FLAG_TICK
from the rest, just in case running in nohz_full mode ever becomes interesting
alone for performance and not just latency/isolation.

But look what you can do as well:

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 5a6ea03f9882..9df9598a9e39 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -141,7 +141,7 @@ static int __init housekeeping_nohz_full_setup(char *str)
unsigned int flags;
 
flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
-   HK_FLAG_MISC | HK_FLAG_KTHREAD;
+   HK_FLAG_MISC | HK_FLAG_KTHREAD | HK_FLAG_MANAGED_IRQ;
 
return housekeeping_setup(str, flags);
 }


"nohz_full=" has historically gathered most wanted isolation features. It can
as well isolate managed irqs.


> > And then can we rename it to housekeeping_num_online()?
> 
> It could be just me, but does something like hk_num_online_cpus() make more
> sense here?

Sure, that works as well.

Thanks.
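
The effect of the one-line diff above is that the flag set applied by
"nohz_full=" also covers managed IRQs. A small userspace model of that flag
composition follows. It is a sketch only: the HKM_* names and their bit
values are illustrative and are not the kernel's enum hk_flags values.

```c
/* Model flag bits; names and values are illustrative, not the kernel's. */
enum hk_flags_model {
	HKM_TICK	= 1 << 0,
	HKM_WQ		= 1 << 1,
	HKM_TIMER	= 1 << 2,
	HKM_RCU		= 1 << 3,
	HKM_MISC	= 1 << 4,
	HKM_KTHREAD	= 1 << 5,
	HKM_MANAGED_IRQ	= 1 << 6,
	HKM_DOMAIN	= 1 << 7,
};

/* Flag set that "nohz_full=" would apply after the proposed change. */
static unsigned int nohz_full_flags_model(void)
{
	return HKM_TICK | HKM_WQ | HKM_TIMER | HKM_RCU |
	       HKM_MISC | HKM_KTHREAD | HKM_MANAGED_IRQ;
}

/* Non-zero if the given isolation feature is part of the flag set. */
static int hk_flag_covered(unsigned int flags, unsigned int flag)
{
	return (flags & flag) != 0;
}
```

In this model a setup that passes only "nohz_full=" has HKM_MANAGED_IRQ
covered, while a feature outside the set, such as HKM_DOMAIN, is not.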


Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-22 Thread Nitesh Narayan Lal

On 9/22/20 4:58 PM, Frederic Weisbecker wrote:
> On Tue, Sep 22, 2020 at 09:50:55AM -0400, Nitesh Narayan Lal wrote:
>> On 9/22/20 6:08 AM, Frederic Weisbecker wrote:
>> TBH I don't have a very strong case here at the moment.
>> But still, IMHO, this will force the user to have both managed irqs and
>> nohz_full in their environments to avoid these kinds of issues. Is that how
>> we would like to proceed?
> Yep that sounds good to me. I never know how much we want to split each and 
> any
> of the isolation features but I'd rather stay cautious to separate 
> HK_FLAG_TICK
> from the rest, just in case running in nohz_full mode ever becomes interesting
> alone for performance and not just latency/isolation.

Fair point.

>
> But look what you can do as well:
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 5a6ea03f9882..9df9598a9e39 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -141,7 +141,7 @@ static int __init housekeeping_nohz_full_setup(char *str)
>   unsigned int flags;
>  
>   flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
> - HK_FLAG_MISC | HK_FLAG_KTHREAD;
> + HK_FLAG_MISC | HK_FLAG_KTHREAD | HK_FLAG_MANAGED_IRQ;
>  
>   return housekeeping_setup(str, flags);
>  }
>
>
> "nohz_full=" has historically gathered most wanted isolation features. It can
> as well isolate managed irqs.

Nice, yep, this will work.

>
>
>>> And then can we rename it to housekeeping_num_online()?
>> It could be just me, but does something like hk_num_online_cpus() make more
>> sense here?
> Sure, that works as well.

Thanks a lot for all the help.

>
> Thanks.
>
-- 
Nitesh





Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-22 Thread Andrew Lunn
> Subject: Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of 
> hosekeeping CPUs

Hosekeeping? Are these CPUs out gardening in the weeds?

 Andrew


Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of hosekeeping CPUs

2020-09-22 Thread Nitesh Narayan Lal

On 9/22/20 5:26 PM, Andrew Lunn wrote:
>> Subject: Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of 
>> hosekeeping CPUs
> Hosekeeping? Are these CPUs out gardening in the weeds?

Bjorn has already highlighted the typo, so I will be fixing it in the next
version.
Do you find the commit message and body of this patch unclear?

>
>Andrew
>
-- 
Nitesh


