On Tue, 12 May 2020, Oleg Nesterov wrote:
> On 05/11, Davidlohr Bueso wrote:
>> Currently the tasklist_lock is shared mainly in order to observe
>> the list atomically for the PRIO_PGRP and PRIO_USER cases, as
>> the actual lookups are already rcu-safe,
>
> not really...
>
> do_each_pid_task(PIDTYPE_PGID) can race with change_pid(PIDTYPE_PGID)
> which moves the task from one hlist to another. Yes, it is safe in
> that task_struct can't go away. But still this is not right because
> do_each_pid_task() can scan the wrong (2nd) hlist.
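
To make the race concrete, here is a rough single-threaded userspace model of it. It is only a sketch: the model_* names are made up for illustration, a plain singly linked list stands in for the hlist, and the move is done synchronously at the moment the walker stands on the node, where the real change_pid() would run concurrently on another CPU. RCU and memory ordering are not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a task linked on a per-pgrp hlist. */
struct model_node {
    int id;
    struct model_node *next;
};

/* Push n onto the front of the list at *head. */
static void model_push(struct model_node **head, struct model_node *n)
{
    n->next = *head;
    *head = n;
}

/* Unlink n from the list at *head, if it is there. */
static void model_unlink(struct model_node **head, struct model_node *n)
{
    for (struct model_node **pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == n) {
            *pp = n->next;
            return;
        }
    }
}

/*
 * Walk the old list, recording the ids seen. When the walker is
 * standing on 'mover', move that node to the new list before the
 * walker follows its next pointer. Returns the number of ids
 * recorded (at most max).
 */
static int model_walk_with_move(struct model_node **old_head,
                                struct model_node **new_head,
                                struct model_node *mover,
                                int *seen, int max)
{
    int count = 0;
    struct model_node *cur = *old_head;

    assert(max >= 0);
    while (cur && count < max) {
        seen[count++] = cur->id;
        if (cur == mover) {
            /* Models the pgrp change happening mid-walk. */
            model_unlink(old_head, cur);
            model_push(new_head, cur);
        }
        /* After the move this pointer leads into the new list. */
        cur = cur->next;
    }
    return count;
}
```

With an old list 1 -> 2 -> 3, a new list 10 -> 11, and node 2 as the mover, the walker records 1, 2, 10, 11: it never sees node 3 and it visits two nodes that were never on the list it started from.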

Hmm, I didn't think about this case. I guess this is also busted in
ioprio_get(2), then.

>> (ii) exit (deletion), this window is small but if a task is
>> deleted with the highest nice and it is not observed this would
>> cause a change in return semantics. To further reduce the window
>> we ignore any tasks that are PF_EXITING in the 'old' version of
>> the list.
>
> can't understand...
>
> could you explain in details why do you think this PF_EXITING check
> makes any sense?

My logic was that if the task with the highest prio exited while we
were iterating the list, the exit would not necessarily be seen under
rcu, and the syscall would return the prio of a task that had already
exited. Checking against PF_EXITING was a way to ignore such tasks,
as we were going to race with the exit anyway.
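
For illustration, the PF_EXITING reasoning above boils down to something like the following userspace model. This is only a sketch and not the patch itself: the model_* names and MODEL_PF_EXITING are made up, an array stands in for the rcu list walk, and -1 stands in for -ESRCH. The 20 - nice mapping is meant to mirror what nice_to_rlimit() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag for the model; stands in for PF_EXITING. */
#define MODEL_PF_EXITING 0x4u

/* Hypothetical stand-in for the bits of task_struct we care about. */
struct model_task {
    int nice;               /* -20 (highest prio) .. 19 (lowest) */
    unsigned int flags;
};

/* Maps nice 19..-20 onto 1..40, so a larger value is a higher prio. */
static int model_nice_to_rlimit(int nice)
{
    return 20 - nice;
}

/*
 * Return the largest rlimit-style value among the tasks, or -1 if
 * no task qualifies. With skip_exiting set, tasks flagged as
 * exiting are ignored, which is the check under discussion.
 */
static int model_scan(const struct model_task *t, size_t n,
                      int skip_exiting)
{
    int retval = -1;

    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int v;

        if (skip_exiting && (t[i].flags & MODEL_PF_EXITING))
            continue;
        v = model_nice_to_rlimit(t[i].nice);
        if (v > retval)
            retval = v;
    }
    return retval;
}
```

If the task with the highest prio is the one flagged as exiting, the two modes return different values, and that difference is the change in return semantics I was worried about.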
At this point it seems that we can just remove the lock for the
PRIO_PROCESS case.
Thanks,
Davidlohr