On 09/17, Oleg Nesterov wrote:
>
> I seem to find the problem, and I do not see a simple solution.
> Will try to think more tomorrow with a fresh head, but perhaps
> you can help. looks like we need some changes in utrace/signal
> layer.
>
> Suppose that a tracee is going to report, say, PTRACE_EVENT_FORK.
> The callback returns UTRACE_STOP. Of course, utrace_report_xxx()
> does not do finish_resume_report() or utrace_stop(), we rely on
> utrace_resume() or utrace_get_signal() which should call utrace_stop()
> eventually.
>
> Now suppose that the tracee's sub-thread initiates the group-stop.
> In this case the tracee will call do_signal_stop(), without doing
> utrace_resume/utrace_get_signal. And this means the tracee will
> stop in TASK_STOPPED with the wrong ->exit_code.
>
> ptrace can hook ->report_jctl(), but this can't help. We can do
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -1569,6 +1569,10 @@ static int do_signal_stop(int signr)
>  				signal_wake_up(t, 0);
>  			}
>  	}
> +
> +	// tracehook_notify_jctl() can change ->exit_code.
> +	// If we won't stop ->exit_code will be cleared anyway
> +	current->exit_code = sig->group_exit_code;
>  	/*
>  	 * If there are no other threads in the group, or if there is
>  	 * a group stop in progress and we are the last to stop, report
> @@ -1584,7 +1588,6 @@ static int do_signal_stop(int signr)
>  	if (sig->group_stop_count) {
>  		if (!--sig->group_stop_count)
>  			sig->flags = SIGNAL_STOP_STOPPED;
> -		current->exit_code = sig->group_exit_code;
>  		__set_current_state(TASK_STOPPED);
>  	}
>  	spin_unlock_irq(&current->sighand->siglock);
>
> But this doesn't really help too. tracehook_notify_jctl() unlocks
> ->siglock, if SIGCONT comes in between we can't notice it. In this
> case the stop reports will be wrong again.
>
> What do you think?

I am starting to think the best option is to call utrace_get_signal()
before the "if (unlikely(signal->group_stop_count > 0))" check.
utrace_get_signal() should check ->group_stop_count before
dequeue_signal(); if it is nonzero, we pass a special event/result
to ->report_signal().

Oleg.
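
To make the ordering concrete, here is a userspace toy model of the
idea above. It is not kernel code and not a patch: every toy_* name,
including TOY_EVENT_GROUP_STOP, is made up for illustration, and
locking (the ->siglock / SIGCONT race) is left out entirely. It only
shows the tracer callback running before the group-stop path sets
->exit_code, with the pending signal left queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names exist in the kernel. */
enum toy_event { TOY_EVENT_NONE, TOY_EVENT_SIGNAL, TOY_EVENT_GROUP_STOP };

struct toy_task {
	int group_stop_count;		/* models signal->group_stop_count */
	int pending_signr;		/* models one queued signal, 0 if none */
	int group_exit_code;		/* models sig->group_exit_code */
	int exit_code;			/* models current->exit_code */
	bool stopped;			/* models TASK_STOPPED */
	enum toy_event last_report;	/* last event the tracer was told about */
	int reports;			/* number of tracer callbacks so far */
};

/* Stand-in for ->report_signal(): just records what was reported. */
static void toy_report_signal(struct toy_task *t, enum toy_event ev)
{
	t->last_report = ev;
	t->reports++;
}

/*
 * Stand-in for utrace_get_signal(): check the group stop first and
 * report it as a special (hypothetical) event without dequeueing.
 */
static int toy_utrace_get_signal(struct toy_task *t)
{
	int signr;

	if (t->group_stop_count > 0) {
		toy_report_signal(t, TOY_EVENT_GROUP_STOP);
		return 0;
	}

	signr = t->pending_signr;	/* models dequeue_signal() */
	t->pending_signr = 0;
	if (signr)
		toy_report_signal(t, TOY_EVENT_SIGNAL);
	return signr;
}

/* Stand-in for do_signal_stop(): participate in the group stop. */
static void toy_do_signal_stop(struct toy_task *t)
{
	assert(t->group_stop_count > 0);
	t->group_stop_count--;
	t->exit_code = t->group_exit_code;
	t->stopped = true;
}

/* The tracer hook now runs before the group_stop_count check. */
int toy_get_signal_to_deliver(struct toy_task *t)
{
	int signr = toy_utrace_get_signal(t);

	if (t->group_stop_count > 0) {
		toy_do_signal_stop(t);
		return 0;
	}
	return signr;
}
```

In this model the tracer always gets a callback before the task stops
for a group stop. Whether that is enough in the real code depends on
the ->siglock handling, which the toy does not try to capture.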