On Wed, Mar 03, 2021 at 11:46:59AM +0800, Chengming Zhou wrote:
> The commit 36b238d57172 ("psi: Optimize switching tasks inside shared
> cgroups") only updates cgroups whose state actually changes during a
> task switch in the task preempt case, not in the task sleep case.
>
> We actually don't need to clear and set the TSK_ONCPU state for the
> common cgroups of the next and prev tasks in the sleep case, which can
> save many psi_group_change() calls, especially when most activity comes
> from one leaf cgroup.
>
> sleep before:
> psi_dequeue()
>   while ((group = iterate_groups(prev)))  # all ancestors
>     psi_group_change(prev, .clear=TSK_RUNNING|TSK_ONCPU)
> psi_task_switch()
>   while ((group = iterate_groups(next)))  # all ancestors
>     psi_group_change(next, .set=TSK_ONCPU)
>
> sleep after:
> psi_dequeue()
>   nop
> psi_task_switch()
>   while ((group = iterate_groups(next)))  # until (prev & next)
>     psi_group_change(next, .set=TSK_ONCPU)
>   while ((group = iterate_groups(prev)))  # all ancestors
>     psi_group_change(prev, .clear=common?TSK_RUNNING:TSK_RUNNING|TSK_ONCPU)
>
> When a voluntary sleep switches to another task, we remove one call of
> psi_group_change() for every common cgroup ancestor of the two tasks.
>
Co-developed-by: Muchun ?

> Signed-off-by: Muchun Song <songmuc...@bytedance.com>
> Signed-off-by: Chengming Zhou <zhouchengm...@bytedance.com>
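
The call-count claim in the quoted changelog can be sanity-checked with a
small user-space model. This is only a sketch and not kernel code: struct
group, depth(), common_ancestor(), calls_before() and calls_after() are
hypothetical names made up for illustration, and the model only counts
how many groups each walk visits; it does not touch any PSI state.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cgroup's psi_group; only the parent link matters. */
struct group {
	struct group *parent;	/* NULL for the root group */
};

/* Number of groups visited when walking from g up to and including the root. */
static int depth(const struct group *g)
{
	int n = 0;

	for (; g; g = g->parent)
		n++;
	return n;
}

/* Lowest group that is an ancestor-or-self of both a and b, or NULL if none. */
static const struct group *common_ancestor(const struct group *a,
					   const struct group *b)
{
	const struct group *g;

	for (; a; a = a->parent)
		for (g = b; g; g = g->parent)
			if (g == a)
				return a;
	return NULL;
}

/*
 * "sleep before": psi_dequeue() walks all ancestors of prev, then
 * psi_task_switch() walks all ancestors of next.
 */
static int calls_before(const struct group *prev, const struct group *next)
{
	return depth(prev) + depth(next);
}

/*
 * "sleep after": next is walked only until the first common group,
 * while prev is still walked all the way up to the root.
 */
static int calls_after(const struct group *prev, const struct group *next)
{
	const struct group *common = common_ancestor(prev, next);
	const struct group *g;
	int n = 0;

	for (g = next; g && g != common; g = g->parent)
		n++;
	return n + depth(prev);
}
```

With a root -> mid -> {leaf_a, leaf_b} hierarchy, prev in leaf_a and next
in leaf_b, the model gives 6 walks before and 4 after, so the difference
is the two shared groups (mid and root). When both tasks sit in the same
leaf, it gives 6 before and 3 after.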