On Wed, Jul 17, 2019 at 07:55:57PM +0200, Christian Brauner wrote:
> On Wed, Jul 17, 2019 at 01:21:00PM -0400, Joel Fernandes wrote:
> > From: Suren Baghdasaryan <[email protected]>
> >
> > There is a race between reading task->exit_state in pidfd_poll and writing
> > it after do_notify_parent calls do_notify_pidfd. Expected sequence of
> > events is:
> >
> > CPU 0                           CPU 1
> > ------------------------------------------------
> > exit_notify
> >   do_notify_parent
> >     do_notify_pidfd
> >   tsk->exit_state = EXIT_DEAD
> >                                 pidfd_poll
> >                                   if (tsk->exit_state)
> >
> > However nothing prevents the following sequence:
> >
> > CPU 0                           CPU 1
> > ------------------------------------------------
> > exit_notify
> >   do_notify_parent
> >     do_notify_pidfd
> >                                 pidfd_poll
> >                                   if (tsk->exit_state)
> >   tsk->exit_state = EXIT_DEAD
> >
> > This causes a polling task to wait forever, since poll blocks because
> > exit_state is 0 and the waiting task is not notified again. A stress
> > test continuously doing pidfd poll and process exits uncovered this bug,
> > and the below patch fixes it.
> >
> > To fix this, we set tsk->exit_state before calling do_notify_pidfd.
> >
> > Cc: [email protected]
> > Signed-off-by: Suren Baghdasaryan <[email protected]>
> > Signed-off-by: Joel Fernandes (Google) <[email protected]>
>
> That means in such a situation other users will see EXIT_ZOMBIE where
> they didn't see that before until after the parent failed to get
> notified.
>
> That's a rather subtle internal change. I was worried about
> __ptrace_detach() since it explicitly checks for EXIT_ZOMBIE but it
> seems to me that this is fine since we hold write_lock_irq(&tasklist_lock);
> at the point when we do set p->exit_signal.

Right.

> Acked-by: Christian Brauner <[email protected]>

Thanks.

> Once Oleg confirms that I'm right not to worry I'll pick this up.

Ok.
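
For anyone following the thread, the ordering problem described in the quoted commit message can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: it is not kernel code and not the stress test mentioned above. The struct, the step enum, and the functions poll_once(), notify(), set_exit_state() and run_sequence() are hypothetical names made up for the illustration, and exit_state is modeled as a plain 0/1 flag rather than the kernel's EXIT_* values. It assumes a single poller that joins the wait queue before checking the state and re-checks whenever it is woken.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one exiting task and one poller. All names are hypothetical. */
struct model {
    int  exit_state;    /* 0 = still running, nonzero = exited */
    bool poller_queued; /* poller is sleeping on the wait queue */
    bool poller_ready;  /* poller has observed the exit */
};

enum step { STEP_NOTIFY, STEP_SET_STATE, STEP_POLL };

/* Models the poll path: join the wait queue first, then check the state. */
static void poll_once(struct model *m)
{
    m->poller_queued = true;
    if (m->exit_state != 0) {
        m->poller_queued = false;
        m->poller_ready = true;
    }
}

/* Models the notification: a queued poller is woken and re-checks the state. */
static void notify(struct model *m)
{
    if (m->poller_queued)
        poll_once(m);
}

/* Models the write to exit_state; note that it wakes nobody. */
static void set_exit_state(struct model *m)
{
    m->exit_state = 1;
}

/*
 * Replays one interleaving of the three events and reports whether the
 * poller ends up seeing the exit. A false result corresponds to a poller
 * that would wait forever.
 */
bool run_sequence(const enum step *steps, size_t n)
{
    struct model m = { 0, false, false };

    for (size_t i = 0; i < n; i++) {
        switch (steps[i]) {
        case STEP_NOTIFY:    notify(&m);         break;
        case STEP_SET_STATE: set_exit_state(&m); break;
        case STEP_POLL:      poll_once(&m);      break;
        }
    }
    return m.poller_ready;
}
```

In this model, replaying { STEP_NOTIFY, STEP_POLL, STEP_SET_STATE } (the problematic interleaving from the commit message) leaves the poller queued with nobody left to wake it, so run_sequence() returns false. Any sequence in which STEP_SET_STATE precedes STEP_NOTIFY, which is the ordering the patch establishes, returns true wherever STEP_POLL falls.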

