The previous patch fixed the deadlock when mt-exec waits for debugger which should reap a zombie thread, but we can hit the same problem if the killed sub-thread stops in ptrace_event(PTRACE_EVENT_EXIT). Change may_ptrace_stop() to check signal->group_exit_task.
This is a user-visible change. But hopefully it can't break anything. Note that the semantics of PTRACE_EVENT_EXIT was never really defined, it depends on /dev/random. Just for example, currently a sub-thread killed by exec will stop, but if it exits on its own and races with exec it will not stop, so nobody can rely on PTRACE_EVENT_EXIT anyway. We really need to finally define what PTRACE_EVENT_EXIT should actually do, but this needs other changes. Signed-off-by: Oleg Nesterov <[email protected]> --- kernel/signal.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/kernel/signal.c b/kernel/signal.c index b78ce63..16e44d7 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1738,6 +1738,20 @@ static inline int may_ptrace_stop(void) { if (!likely(current->ptrace)) return 0; + + /* + * Other checks can only affect PTRACE_EVENT_EXIT, a task with the + * pending SIGKILL can't block in TASK_TRACED. But PTRACE_EVENT_EXIT + * can be reported after SIGKILL was already dequeued. + */ + + /* + * Coredump or exec. We must not stop in the latter case, debuger can + * wait for cred_guard_mutex held by execing thread. + */ + if (current->signal->group_exit_task) + return 0; + /* * Are we in the middle of do_coredump? * If so and our tracer is also part of the coredump stopping @@ -1746,10 +1760,6 @@ static inline int may_ptrace_stop(void) * If SIGKILL was already sent before the caller unlocked * ->siglock we must see ->core_state != NULL. Otherwise it * is safe to enter schedule(). - * - * This is almost outdated, a task with the pending SIGKILL can't - * block in TASK_TRACED. But PTRACE_EVENT_EXIT can be reported - * after SIGKILL was already dequeued. */ if (unlikely(current->mm->core_state) && unlikely(current->mm == current->parent->mm)) -- 2.5.0

