Re: [PATCH v3 0/8] support for bitmap (and hence CPU) list "N" abbreviation

2021-01-27 Thread Paul Gortmaker
[Re: [PATCH v3 0/8] support for bitmap (and hence CPU) list "N" abbreviation] 
On 26/01/2021 (Tue 14:27) Yury Norov wrote:

> On Tue, Jan 26, 2021 at 9:12 AM Paul Gortmaker
>  wrote:
> >
> > This was on a 16 core machine with CONFIG_NR_CPUS=16 in .config file.
> >
> > Note that "N" is a dynamic quantity, and can change scope if the bitmap
> > is changed in size.  So at the risk of stating the obvious, don't use it
> > for "burn_eFuse=128-N" or "secure_erase_firmware=32-N" type stuff.
> 
> I think it's worth moving this sentence to the Documentation. Another

Dynamic nature comment added to Documentation

> caveat with
> N is that users' config may surprisingly become invalid, like if user
> says 32-N, and
> on some machine with a smaller bitmap this config fails to boot.

Updated example to indicate that "16-N" becomes invalid if moved from 32
core system to quad core.  I'm not currently able to think of an example
where boot will fail -- vs. a subsystem getting -EINVAL from bitmap code
and printing a subsystem error instead.

> It doesn't mean of course that I'm against 'N'. I think it's very
> useful especially in
> such common cases like "N", "0-N", "1-N".
> 
> Would it make sense to treat the mask "32-N" when N < 32 as N-N,
> and bark something in dmesg?

I don't think so.  For the same reasons you used to convince me -- that N
should be treated as just another number and not have special rules.

If I boot now, with "important_cpu="32-3" on a quad core then I get what
I get for being stupid.   We don't special case that and subsitute in a
"3-3" (which would then be "3") -- and nor should we!

Sticking to the CPU example, we have no idea what the caller's use case
is -- we don't know if NUMA stuff might be present and whether having
the single CPU #3 in that set is better or worse than EINVAL and no CPUs
in the set.   Expand that to bitmaps in general and we have no idea what
the "right" reaction to garbage input is.

The context of the caller could be simply test_bitmap.c itself -- which
would be expecting the EINVAL, and not some kind of "hot patching" of
the region in order to make it valid.

The only sane option is for the bitmap code to return EINVAL and let the
calling subsystem (with the appropriate context/info) make the decision
as to what to do next.  Which is what the series does now.

Paul.
--

> 
> > Paul.
> > ---
> >
> > [v1: 
> > https://lore.kernel.org/lkml/20210106004850.GA11682@paulmck-ThinkPad-P72/
> >
> > [v2: push code down from cpu subsys to core bitmap code as per
> >  Yury's comments.  Change "last" to simply be "N" as per PeterZ.]
> >  
> > https://lore.kernel.org/lkml/20210121223355.59780-1-paul.gortma...@windriver.com/
> >
> > [v3: Allow "N" to be used anywhere in the region spec, i.e. "N-N:N/N" vs.
> >  just being allowed at end of range like "0-N".  Add new self-tests.  Drop
> >  "all" and "none" aliases as redundant and not worth the extra 
> > complication. ]
> >
> > Cc: Li Zefan 
> > Cc: Ingo Molnar 
> > Cc: Yury Norov 
> > Cc: Thomas Gleixner 
> > Cc: Josh Triplett 
> > Cc: Peter Zijlstra 
> > Cc: "Paul E. McKenney" 
> > Cc: Frederic Weisbecker 
> > Cc: Rasmus Villemoes 
> > Cc: Andy Shevchenko 
> >
> > ---
> >
> > Paul Gortmaker (8):
> >   lib: test_bitmap: clearly separate ERANGE from EINVAL tests.
> >   lib: test_bitmap: add more start-end:offset/len tests
> >   lib: bitmap: fold nbits into region struct
> >   lib: bitmap: move ERANGE check from set_region to check_region
> >   lib: bitmap_getnum: separate arg into region and field
> >   lib: bitmap: support "N" as an alias for size of bitmap
> >   lib: test_bitmap: add tests for "N" alias
> >   rcu: deprecate "all" option to rcu_nocbs=
> >
> >  .../admin-guide/kernel-parameters.rst |  2 +
> >  .../admin-guide/kernel-parameters.txt |  4 +-
> >  kernel/rcu/tree_plugin.h  |  6 +--
> >  lib/bitmap.c  | 46 ++
> >  lib/test_bitmap.c | 48 ---
> >  5 files changed, 72 insertions(+), 34 deletions(-)
> >
> > --
> > 2.17.1
> >


Re: [PATCH v3 0/8] support for bitmap (and hence CPU) list "N" abbreviation

2021-01-26 Thread Yury Norov
On Tue, Jan 26, 2021 at 9:12 AM Paul Gortmaker
 wrote:
>
> The basic objective here was to add support for "nohz_full=8-N" and/or
> "rcu_nocbs="4-N" -- essentially introduce "N" as a portable reference
> to the last core, evaluated at boot for anything using a CPU list.
>
> The thinking behind this, is that people carve off a few early CPUs to
> support housekeeping tasks, and perhaps dedicate one to a busy I/O
> peripheral, and then the remaining pool of CPUs out to the end are a
> part of a commonly configured pool used for the real work the user
> cares about.
>
> Extend that logic out to a fleet of machines - some new, and some
> nearing EOL, and you've probably got a wide range of core counts to
> contend with - even though the early number of cores dedicated to the
> system overhead probably doesn't vary.
>
> This change would enable sysadmins to have a common bootarg across all
> such systems, and would also avoid any off-by-one fencepost errors that
> happen for users who might briefly forget that core counts start at zero.
>
> Originally I did this at the CPU subsys level, but Yury suggested it
> be moved down further to bitmap level itself, which made the core
> implementation [6/8] smaller and less complex, but the series longer.
>
> New self tests are added to better exercise what bitmap range/region
> currently supports, and new tests are added for the new "N" support.
>
> Also tested boot arg and the post-boot cgroup use case as per below:
>
>root@hackbox:~# cat /proc/cmdline
>BOOT_IMAGE=/boot/bzImage root=/dev/sda1 rcu_nocbs=2,3,8-N:1/2
>root@hackbox:~# dmesg|grep Offl
>rcu: Offload RCU callbacks from CPUs: 2-3,8,10,12,14.
>
>root@hackbox:/sys/fs/cgroup/cpuset/foo# cat cpuset.cpus
>
>root@hackbox:/sys/fs/cgroup/cpuset/foo# /bin/echo 10-N > cpuset.cpus
>root@hackbox:/sys/fs/cgroup/cpuset/foo# cat cpuset.cpus
>10-15
>root@hackbox:/sys/fs/cgroup/cpuset/foo# /bin/echo N-N:N/N > cpuset.cpus
>root@hackbox:/sys/fs/cgroup/cpuset/foo# cat cpuset.cpus
>15
>
> This was on a 16 core machine with CONFIG_NR_CPUS=16 in .config file.
>
> Note that "N" is a dynamic quantity, and can change scope if the bitmap
> is changed in size.  So at the risk of stating the obvious, don't use it
> for "burn_eFuse=128-N" or "secure_erase_firmware=32-N" type stuff.

I think it's worth moving this sentence to the Documentation. Another
caveat with
N is that users' config may surprisingly become invalid, like if user
says 32-N, and
on some machine with a smaller bitmap this config fails to boot.

It doesn't mean of course that I'm against 'N'. I think it's very
useful especially in
such common cases like "N", "0-N", "1-N".

Would it make sense to treat the mask "32-N" when N < 32 as N-N,
and bark something in dmesg?

> Paul.
> ---
>
> [v1: https://lore.kernel.org/lkml/20210106004850.GA11682@paulmck-ThinkPad-P72/
>
> [v2: push code down from cpu subsys to core bitmap code as per
>  Yury's comments.  Change "last" to simply be "N" as per PeterZ.]
>  
> https://lore.kernel.org/lkml/20210121223355.59780-1-paul.gortma...@windriver.com/
>
> [v3: Allow "N" to be used anywhere in the region spec, i.e. "N-N:N/N" vs.
>  just being allowed at end of range like "0-N".  Add new self-tests.  Drop
>  "all" and "none" aliases as redundant and not worth the extra complication. ]
>
> Cc: Li Zefan 
> Cc: Ingo Molnar 
> Cc: Yury Norov 
> Cc: Thomas Gleixner 
> Cc: Josh Triplett 
> Cc: Peter Zijlstra 
> Cc: "Paul E. McKenney" 
> Cc: Frederic Weisbecker 
> Cc: Rasmus Villemoes 
> Cc: Andy Shevchenko 
>
> ---
>
> Paul Gortmaker (8):
>   lib: test_bitmap: clearly separate ERANGE from EINVAL tests.
>   lib: test_bitmap: add more start-end:offset/len tests
>   lib: bitmap: fold nbits into region struct
>   lib: bitmap: move ERANGE check from set_region to check_region
>   lib: bitmap_getnum: separate arg into region and field
>   lib: bitmap: support "N" as an alias for size of bitmap
>   lib: test_bitmap: add tests for "N" alias
>   rcu: deprecate "all" option to rcu_nocbs=
>
>  .../admin-guide/kernel-parameters.rst |  2 +
>  .../admin-guide/kernel-parameters.txt |  4 +-
>  kernel/rcu/tree_plugin.h  |  6 +--
>  lib/bitmap.c  | 46 ++
>  lib/test_bitmap.c | 48 ---
>  5 files changed, 72 insertions(+), 34 deletions(-)
>
> --
> 2.17.1
>