On Tue, Feb 11, 2025 at 02:24:36PM +0800, Yafang Shao wrote:
>  void klp_try_complete_transition(void)
>  {
> +     unsigned long timeout, proceed_pending_processes;
>       unsigned int cpu;
>       struct task_struct *g, *task;
>       struct klp_patch *patch;
> @@ -467,9 +468,30 @@ void klp_try_complete_transition(void)
>        * unless the patch includes changes to a very common function.
>        */
>       read_lock(&tasklist_lock);
> -     for_each_process_thread(g, task)
> +     timeout = jiffies + HZ;
> +     proceed_pending_processes = 0;
> +     for_each_process_thread(g, task) {
> +             /* check if this task has already switched over */
> +             if (task->patch_state == klp_target_state)
> +                     continue;
> +
> +             proceed_pending_processes++;
> +
>               if (!klp_try_switch_task(task))
>                       complete = false;
> +
> +             /*
> +              * Prevent hardlockup by not blocking tasklist_lock for too 
> long.
> +              * But guarantee the forward progress by making sure at least
> +              * some pending processes were checked.
> +              */
> +             if (rwlock_is_contended(&tasklist_lock) &&
> +                 time_after(jiffies, timeout) &&
> +                 proceed_pending_processes > 100) {
> +                     complete = false;
> +                     break;
> +             }
> +     }
>       read_unlock(&tasklist_lock);

Instead of all this can we not just use rcu_read_lock() instead of
tasklist_lock?

Petr, I know you mentioned that would widen the race window for the
do_exit() path, but don't we need to fix that race anyway?

-- 
Josh

Reply via email to