On Thu, May 17, 2018 at 04:55:41PM -0400, Waiman Long wrote:
> A new cpuset.sched.domain boolean flag is added to cpuset v2. This new
> flag indicates that the CPUs in the current cpuset should be treated
> as a separate scheduling domain.

The traditional name for this is a partition.

>                                  This new flag is owned by the parent
> and will cause the CPUs in the cpuset to be removed from the effective
> CPUs of its parent.

This is a significant departure from existing behaviour, but one I can
appreciate. I don't immediately see anything terribly wrong with it.

> This is implemented internally by adding a new isolated_cpus mask that
> holds the CPUs belonging to child scheduling domain cpusets so that:
> 
>       isolated_cpus | effective_cpus = cpus_allowed
>       isolated_cpus & effective_cpus = 0
> 
> This new flag can only be turned on in a cpuset if its parent is either
> root or a scheduling domain itself with non-empty cpu list. The state
> of this flag cannot be changed if the cpuset has children.
> 
> Signed-off-by: Waiman Long <long...@redhat.com>
> ---
>  Documentation/cgroup-v2.txt |  22 ++++
>  kernel/cgroup/cpuset.c      | 237 +++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 256 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> index cf7bac6..54d9e22 100644
> --- a/Documentation/cgroup-v2.txt
> +++ b/Documentation/cgroup-v2.txt
> @@ -1514,6 +1514,28 @@ Cpuset Interface Files
>       it is a subset of "cpuset.mems".  Its value will be affected
>       by memory nodes hotplug events.
>  
> +  cpuset.sched.domain
> +     A read-write single value file which exists on non-root
> +     cpuset-enabled cgroups.  It is a binary value flag that accepts
> +     either "0" (off) or a non-zero value (on).

I would be conservative and only allow 0/1.
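
Roughly what I have in mind, as a plain userspace sketch rather than
the actual write handler; parse_sched_domain_flag() is a made-up name
and the trailing-newline handling is an assumption about how the file
gets written:

```c
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: accept exactly "0" or
 * "1", optionally followed by a single newline. Returns 0 and stores
 * the value in *flag on success, -1 on anything else.
 */
static int parse_sched_domain_flag(const char *buf, int *flag)
{
	size_t len = strlen(buf);

	/* Tolerate the newline that "echo 1 > file" appends. */
	if (len > 0 && buf[len - 1] == '\n')
		len--;

	/* Reject empty input, multi-character input, and other digits. */
	if (len != 1 || (buf[0] != '0' && buf[0] != '1'))
		return -1;

	*flag = buf[0] - '0';
	return 0;
}
```

Rejecting everything else now leaves the remaining values free should
the file ever need more than two states.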

>                                                  This flag is set
> +     by the parent and is not delegatable.
> +
> +     If set, it indicates that the CPUs in the current cgroup will
> +     be the root of a scheduling domain.  The root cgroup is always
> +     a scheduling domain.  There are constraints on where this flag
> +     can be set.  It can only be set in a cgroup if all the following
> +     conditions are true.
> +
> +     1) The parent cgroup is also a scheduling domain with a non-empty
> +        cpu list.

Ah, so initially I was confused by the requirement for root to have it
always set, but you'll allow child domains to steal _all_ CPUs, such
that root ends up with an empty effective set?

What about the (kernel) threads that cannot be moved out of the root
group?
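
To make the concern concrete, here is a toy model of the two mask
rules from the changelog. It assumes at most 64 CPUs in a plain
uint64_t; struct toy_cpuset and both helpers are invented for
illustration and are not the kernel's struct cpuset or cpumask API:

```c
#include <stdint.h>

/* Toy stand-in for a cpuset; not the kernel data structure. */
struct toy_cpuset {
	uint64_t cpus_allowed;   /* CPUs configured for this cpuset */
	uint64_t isolated_cpus;  /* CPUs handed to child scheduling domains */
};

/*
 * Deriving effective_cpus this way keeps both rules true as long as
 * isolated_cpus stays a subset of cpus_allowed:
 *   isolated_cpus | effective_cpus == cpus_allowed
 *   isolated_cpus & effective_cpus == 0
 */
static uint64_t toy_effective_cpus(const struct toy_cpuset *cs)
{
	return cs->cpus_allowed & ~cs->isolated_cpus;
}

/*
 * Hand child_cpus to a child domain. Returns 0 on success, -1 if the
 * request is empty or asks for CPUs the parent no longer has.
 */
static int toy_isolate(struct toy_cpuset *parent, uint64_t child_cpus)
{
	if (child_cpus == 0 || (child_cpus & ~toy_effective_cpus(parent)))
		return -1;

	parent->isolated_cpus |= child_cpus;
	return 0;
}
```

With cpus_allowed = 0xF, isolating 0x3 and then 0xC succeeds in this
model and leaves the parent with an effective mask of 0. Whether the
patch allows that last step for root is exactly what I'm asking.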

> +     2) The list of CPUs are exclusive, i.e. they are not shared by
> +        any of its siblings.

Right.
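
For reference, the check I would expect here amounts to a pairwise
overlap test against every sibling. A minimal sketch, again assuming
64-bit masks and a made-up helper name rather than anything from the
patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if no CPU in cpus appears in any of the
 * n sibling masks, i.e. the CPU list is exclusive among siblings.
 */
static bool toy_cpus_exclusive(uint64_t cpus, const uint64_t *siblings,
			       size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (cpus & siblings[i])
			return false;
	}
	return true;
}
```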

> +     3) There is no child cgroups with cpuset enabled.
> +
> +     Setting this flag will take the CPUs away from the effective
> +     CPUs of the parent cgroup. Once it is set, this flag cannot be
> +     cleared if there are any child cgroups with cpuset enabled.

This I'm not clear on. Why?

