cpuset: Don't fail cpuset.cpus change in v2

Chen Ridong Sun, 04 Jan 2026 23:00:58 -0800

On 2026/1/5 11:59, Waiman Long wrote:
> On 1/4/26 8:35 PM, Chen Ridong wrote:
>>
>> On 2026/1/5 5:48, Waiman Long wrote:
>>> On 1/4/26 2:09 AM, Chen Ridong wrote:
>>>> On 2026/1/2 3:15, Waiman Long wrote:
>>>>> Commit fe8cd2736e75 ("cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE
>>>>> until valid partition") introduced a new check to disallow the setting
>>>>> of a new cpuset.cpus.exclusive value that is a superset of a sibling's
>>>>> cpuset.cpus value so that there will be at least one CPU left in the
>>>>> sibling in case the cpuset becomes a valid partition root. This new
>>>>> check does have the side effect of failing a cpuset.cpus change that
>>>>> makes it a subset of a sibling's cpuset.cpus.exclusive value.
>>>>>
>>>>> With v2, users are supposed to be allowed to set whatever value they
>>>>> want in cpuset.cpus without failure. To maintain this rule, the check
>>>>> is now restricted to only when cpuset.cpus.exclusive is being changed,
>>>>> not when cpuset.cpus is changed.
>>>>>
>>>> Hi, Longman,
>>>>
>>>> You've emphasized that modifying cpuset.cpus should never fail, but I
>>>> haven't found this explicitly documented. Should we add it?
>>>>
>>>> More importantly, does this mean the "never fail" rule has higher priority
>>>> than the exclusive CPU constraints? This seems to be the underlying
>>>> assumption in this patch.
>>> Before the introduction of cpuset partition, writing to cpuset.cpus would
>>> only fail if the cpu list was invalid, such as containing CPUs outside of
>>> the valid cpu range. What I mean by "never-fail" is that if the cpu list is
>>> valid, the write action should not fail. The rule is not explicitly stated
>>> in the documentation, but it is a pre-existing behavior which we should try
>>> to keep to avoid breaking existing applications.
>>>
>> There are two conditions that can cause a cpuset.cpus write operation to
>> fail: ENOSPC (No space left on device) and EBUSY.
>>
>> I just want to ensure the behavior aligns with our design intent.
>>
>> Consider this example:
>>
>> # cd /sys/fs/cgroup/
>> # mkdir test
>> # echo 1 > test/cpuset.cpus
>> # echo $$ > test/cgroup.procs
>> # echo 0 > /sys/devices/system/cpu/cpu1/online
>> # echo > test/cpuset.cpus
>> -bash: echo: write error: No space left on device
>>
>> In cgroup v2, if the test cgroup's cpuset.cpus becomes empty, it could
>> inherit the parent's effective CPUs. My question is: Should we still fail
>> to clear cpuset.cpus (returning an error) when the cgroup is populated?
> 
> Good catch. This error is for v1. It shouldn't apply to v2. Yes, I think we
> should fix that for v2.
> 

The EBUSY check (through cpuset_cpumask_can_shrink) is necessary, correct?

Since the subsequent patch modifies exclusive checking for v1, should we
consolidate all v1-related code into a separate function like
cpuset1_validate_change() (maybe with some duplicate code)? It would allow us
to isolate v1 logic and avoid having to account for v1 implementation details
in future features.

In other words:

validate_change(...)
{
    if (!is_in_v2_mode())
        return cpuset1_validate_change(cur, trial);
    ...
    // only v2 code here
}
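
To make the split concrete, here is a minimal userspace sketch of the same
dispatch pattern. It is not kernel code: struct toy_cpuset, toy_v2_mode and
both toy_* functions are hypothetical stand-ins I made up for illustration,
and the only rule modeled is the ENOSPC case from the example earlier in the
thread, treated here as v1-only as discussed above.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's struct cpuset; CPUs are bits in a mask. */
struct toy_cpuset {
    unsigned long cpus;   /* models the cpuset.cpus value */
    bool populated;       /* true if tasks are attached to the cgroup */
};

/* Hypothetical switch standing in for is_in_v2_mode(). */
static bool toy_v2_mode = true;

/*
 * All v1-only rules live here. The single rule modeled: a populated
 * cpuset must not be left with an empty CPU list.
 */
static int toy_cpuset1_validate_change(const struct toy_cpuset *cur,
                                       const struct toy_cpuset *trial)
{
    if (cur->populated && trial->cpus == 0)
        return -ENOSPC;
    return 0;
}

/* Common entry point: v1 is handed off early, the rest is v2-only. */
static int toy_validate_change(const struct toy_cpuset *cur,
                               const struct toy_cpuset *trial)
{
    if (!toy_v2_mode)
        return toy_cpuset1_validate_change(cur, trial);

    /* v2-only checks would go here; this toy model has none. */
    (void)cur;
    (void)trial;
    return 0;
}
```

With toy_v2_mode set to false, clearing the CPU list of a populated toy
cpuset returns -ENOSPC; with it set to true the same change is accepted,
which is the behavior being proposed for v2 in this thread rather than a
statement about what the current code does.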

-- 
Best regards,
Ridong
Re: [cgroup/for-6.20 PATCH v2 3/4] cgroup/cpuset: Don't fail cpuset.cpus change in v2
