Hi.

On Thu, Jan 01, 2026 at 02:15:58PM -0500, Waiman Long <[email protected]> wrote:
> Currently, when setting a cpuset's cpuset.cpus to a value that conflicts
> with the cpuset.cpus/cpuset.cpus.exclusive of a sibling partition,
> the sibling's partition state becomes invalid. This is overly harsh and
> is probably not necessary.
>
> The cpuset.cpus.exclusive control file, if set, will override the
> cpuset.cpus of the same cpuset when creating a cpuset partition.
> So cpuset.cpus has less priority than cpuset.cpus.exclusive in setting up
> a partition. However, it cannot override a conflicting cpuset.cpus file
> in a sibling cpuset and the partition creation process will fail. This
> is inconsistent. That will also make using cpuset.cpus.exclusive less
> valuable as a tool to set up cpuset partitions as the users have to
> check if such a cpuset.cpus conflict exists or not.
>
> Fix these problems by strictly adhering to the setting of the
> following control files in descending order of priority when setting
> up a partition.
>
> 1. cpuset.cpus.exclusive.effective of a valid partition
> 2. cpuset.cpus.exclusive
> 3. cpuset.cpus
>
> So once a cpuset.cpus.exclusive is set without failure, it will
> always be allowed to form a valid partition as long as at least one
> CPU can be granted from its parent irrespective of the state of the
> siblings' cpuset.cpus values. Of course, setting cpuset.cpus.exclusive
> will fail if it conflicts with the cpuset.cpus.exclusive or the
> cpuset.cpus.exclusive.effective value of a sibling.

Concept question: When
  a/b/cpuset.cpus.exclusive ⊂ a/b/cpuset.cpus (proper subset)
and
  a/b/cpuset.cpus.partition == root
  a/cpuset.cpus.partition == root
(b is valid partition) should a/b/cpuset.cpus.exclusive.effective be equal
to cpuset.cpus (as all of them happen to be exclusive) or "only"
cpuset.cpus.exclusive?

> Partition can still be created by setting only cpuset.cpus without
> setting cpuset.cpus.exclusive. However, any conflicting CPUs in sibling's
> cpuset.cpus.exclusive.effective and cpuset.cpus.exclusive values will
> be removed from its cpuset.cpus.exclusive.effective as long as there
> is still one or more CPUs left and can be granted from its parent. This
> CPU stripping is currently done in rm_siblings_excl_cpus().
>
> The new code will now try its best to enable the creation of new
> partitions with only cpuset.cpus set without invalidating existing ones.

OK. (After I re-learnt benefits of remote partitions or more precisely
cpuset.cpus.effective.)

> However it is not guaranteed that all the CPUs requested in cpuset.cpus
> will be used in the new partition even when all these CPUs can be
> granted from the parent.
>
> This is similar to the fact that cpuset.cpus.effective may not be
> able to include all the CPUs requested in cpuset.cpus. In this case,
> the parent may not able to grant all the exclusive CPUs requested in
> cpuset.cpus to cpuset.cpus.exclusive.effective if some of them have
> already been granted to other partitions earlier.
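
To check my own understanding of the stripping described in the quoted text above, here is a toy model in Python. It is only a sketch under my assumptions and not the kernel logic: `parse_cpus` and `strip_sibling_conflicts` are hypothetical helper names, CPU lists are plain Python sets, and the "can be granted from its parent" condition is ignored entirely.

```python
def parse_cpus(spec):
    """Parse a cpulist string such as "0-2,5" into a set of CPU numbers."""
    cpus = set()
    for part in filter(None, spec.split(",")):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def strip_sibling_conflicts(requested, siblings):
    """Toy stand-in for the stripping step (hypothetical, not kernel code).

    requested: CPUs taken from the new partition's cpuset.cpus
    siblings:  list of (exclusive, exclusive_effective) sets, one per sibling
    Returns the CPUs that remain, or None if nothing is left (in which case
    the partition could not become valid in this model).
    """
    left = set(requested)
    for exclusive, effective in siblings:
        # Drop every CPU a sibling has claimed through either file.
        left -= exclusive | effective
    return left or None


# A1 becomes a partition first and keeps 0-2; B1 then asks for 2-4.
a1 = parse_cpus("0-2")
b1 = strip_sibling_conflicts(parse_cpus("2-4"), [(set(), a1)])
print(sorted(b1))  # [3, 4]

# Reversed creation order: B1 keeps 2-4, so A1 loses CPU 2 instead.
b1_first = parse_cpus("2-4")
a1_second = strip_sibling_conflicts(parse_cpus("0-2"), [(set(), b1_first)])
print(sorted(a1_second))  # [0, 1]
```

In this model the result depends only on which sibling claimed its CPUs first, which is the order dependence I would expect from a first-come, first-served stripping rule.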
>
> With the creation of multiple sibling partitions by setting
> only cpuset.cpus, this does have the side effect that their exact
> cpuset.cpus.exclusive.effective settings will depend on the order of
> partition creation if there are conflicts. Due to the exclusive nature
> of the CPUs in a partition, it is not easy to make it fair other than
> the old behavior of invalidating all the conflicting partitions.
>
> For example,
> # echo "0-2" > A1/cpuset.cpus
> # echo "root" > A1/cpuset.cpus.partition
> # echo A1/cpuset.cpus.partition
> root
> # echo A1/cpuset.cpus.exclusive.effective
> 0-2
> # echo "2-4" > B1/cpuset.cpus
> # echo "root" > B1/cpuset.cpus.partition
> # echo B1/cpuset.cpus.partition
> root
> # echo B1/cpuset.cpus.exclusive.effective
> 3-4
> # echo B1/cpuset.cpus.effective
> 3-4
>
> For users who want to be sure that they can get most of the CPUs they
> want,

Slightly OT but I'd say that users want:
a) confinement (some cpuset.cpus in leaves)
b) isolation (cpuset.cpus.exclusive in leaves)
c) hierarchical organization
   - confinement generalizes OK
   - children can only claim what parent allowed

Conflicting exclusivity configs should be no user's intention or a want :-p

> cpuset.cpus.exclusive should be used instead if they can set
> it successfully without failure. Setting cpuset.cpus.exclusive will
> guarantee that sibling conflicts from then onward is no longer possible.

I think the background idea of the paragraph (shift away from local to
remote partitions, also mentioned the other day) could be somehow fitted
into the Documentation/ hunks.

> diff --git a/Documentation/admin-guide/cgroup-v2.rst
> b/Documentation/admin-guide/cgroup-v2.rst
> ...
> @@ -2632,6 +2641,9 @@ Cpuset Interface Files
>
> The root cgroup is always a partition root and its state cannot
> be changed. All other non-root cgroups start out as "member".
> + Even though the "cpuset.cpus.exclusive*" control files are not
> + present in the root cgroup, they are implicitly the same as
> + "cpuset.cpus".

Even "cpuset.cpus" has CFTYPE_NOT_ON_ROOT, so this formulation might be
confusing. Maybe it's the same as "cpuset.cpus.effective"?

Thanks,
Michal
