On Thu, Apr 15, 2021 at 07:59:41PM +0530, Charan Teja Reddy wrote:
> psi_group_cpu->tasks, represented by an unsigned int, stores the number
> of tasks that could be stalled on a psi resource (io/mem/cpu).
> Decrementing these counters at zero leads to wrapping, which in turn
> leads to psi_group_cpu->state_mask being set with the respective
> pressure state. This could result in unnecessary time sampling for
> the pressure state and thus cause spurious psi events. This can further
> lead to wrong actions being taken in userland based on these psi
> events.
> Though psi_bug is set under these conditions, that is just for debug
> purposes. Fix it by decrementing the ->tasks count only when it is
> non-zero.

Makes sense, it's more graceful in the event of a bug.
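
For anyone following along, the wrap described above can be reproduced in
a few lines of userspace C. This is only a sketch: struct counter,
dec_unguarded() and looks_stalled() are hypothetical stand-ins, and the
"nonzero count means stalled" test is a simplification of how the real
state mask is derived.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for one psi_group_cpu->tasks[] slot. */
struct counter {
	unsigned int tasks;
};

/* Unguarded decrement: at zero this wraps around to UINT_MAX. */
static void dec_unguarded(struct counter *c)
{
	c->tasks--;
}

/* Simplified assumption: a nonzero count makes the state look stalled. */
static int looks_stalled(const struct counter *c)
{
	return c->tasks != 0;
}

/* No tasks, one stray decrement, and the state now looks stalled. */
static void wrap_demo(void)
{
	struct counter c = { .tasks = 0 };

	assert(!looks_stalled(&c));
	dec_unguarded(&c);
	assert(c.tasks == UINT_MAX);
	assert(looks_stalled(&c));
}
```

Calling wrap_demo() from a test program should pass all three assertions,
since unsigned arithmetic in C is defined to wrap modulo UINT_MAX + 1.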

But what motivates this change? Is it something you hit recently with
an upstream kernel and we should investigate?

> Signed-off-by: Charan Teja Reddy <chara...@codeaurora.org>
> ---
>  kernel/sched/psi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 967732c..f925468 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -718,7 +718,8 @@ static void psi_group_change(struct psi_group *group, int cpu,
>                                       groupc->tasks[3], clear, set);
>                       psi_bug = 1;
>               }
> -             groupc->tasks[t]--;
> +             if (groupc->tasks[t])
> +                     groupc->tasks[t]--;

There is already a branch on the tasks to signal the bug. How about:

                if (groupc->tasks[t]) {
                        groupc->tasks[t]--;
                } else if (!psi_bug) {
                        printk_deferred(...
                        psi_bug = 1;
                }
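
To illustrate the intended behaviour of that merged branch, here is a
userspace sketch. The names bug_reported, report_count and dec_guarded()
are made up for the example, and the counter increment is only a
placeholder for the deferred printk, whose arguments are elided above.

```c
#include <assert.h>

/*
 * Hypothetical userspace stand-ins: bug_reported plays the role of
 * psi_bug, report_count counts how often we would have printed.
 */
static int bug_reported;
static int report_count;

static void dec_guarded(unsigned int *tasks)
{
	if (*tasks) {
		(*tasks)--;
	} else if (!bug_reported) {
		report_count++;		/* placeholder for the deferred printk */
		bug_reported = 1;
	}
}

static void guarded_demo(void)
{
	unsigned int tasks = 1;

	dec_guarded(&tasks);	/* 1 -> 0, normal path */
	dec_guarded(&tasks);	/* underflow attempt: reported, stays 0 */
	dec_guarded(&tasks);	/* second attempt: silent, still 0 */

	assert(tasks == 0);
	assert(report_count == 1);
}
```

The counter never wraps, and the underflow is reported once, so a single
branch covers both the graceful handling and the bug signal.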
