On 09/17, Oleg Nesterov wrote:
>
> I think I have found the problem, and I do not see a simple solution.
> Will try to think more tomorrow with a fresh head, but perhaps
> you can help. It looks like we need some changes in the utrace/signal
> layer.
>
> Suppose that a tracee is going to report, say, PTRACE_EVENT_FORK.
> The callback returns UTRACE_STOP. Of course, utrace_report_xxx()
> does not do finish_resume_report() or utrace_stop(), we rely on
> utrace_resume() or utrace_get_signal() which should call utrace_stop()
> eventually.
>
> Now suppose that the tracee's sub-thread initiates the group-stop.
> In this case the tracee will call do_signal_stop(), without doing
> utrace_resume/utrace_get_signal. And this means the tracee will
> stop in TASK_STOPPED with the wrong ->exit_code.
>
> ptrace can hook ->report_jctl(), but this can't help. We can do
>
>       --- a/kernel/signal.c
>       +++ b/kernel/signal.c
>       @@ -1569,6 +1569,10 @@ static int do_signal_stop(int signr)
>                                       signal_wake_up(t, 0);
>                               }
>               }
>       +
>       +       // tracehook_notify_jctl() can change ->exit_code.
>       +       // If we won't stop ->exit_code will be cleared anyway
>       +       current->exit_code = sig->group_exit_code;
>               /*
>                * If there are no other threads in the group, or if there is
>                * a group stop in progress and we are the last to stop, report
>       @@ -1584,7 +1588,6 @@ static int do_signal_stop(int signr)
>               if (sig->group_stop_count) {
>                       if (!--sig->group_stop_count)
>                               sig->flags = SIGNAL_STOP_STOPPED;
>       -               current->exit_code = sig->group_exit_code;
>                       __set_current_state(TASK_STOPPED);
>               }
>               spin_unlock_irq(&current->sighand->siglock);
>
> But this doesn't really help either. tracehook_notify_jctl() unlocks
> ->siglock; if SIGCONT comes in between, we can't notice it. In this
> case the stop reports will be wrong again.
>
> What do you think?

I am starting to think the best option is to call utrace_get_signal()
before the "unlikely(signal->group_stop_count > 0)" check.

utrace_get_signal() should check ->group_stop_count before dequeue_signal();
if it is nonzero, we pass a special event/result to ->report_signal().
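
Roughly, as a userspace toy model of the two orderings. This is not kernel
code: every type, field, and function name below is made up for
illustration, and locking (->siglock, SIGCONT races) is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names here are hypothetical. */

enum stop_kind { STOP_NONE, STOP_GROUP, STOP_TRACE };

struct model_task {
    bool report_pending;      /* an earlier callback asked for a stop */
    int  group_stop_count;    /* > 0: a sub-thread initiated a group stop */
    int  group_exit_code;     /* code the group stop would report */
    int  trace_exit_code;     /* code the pending trace event would report */
    int  exit_code;           /* what the tracer ends up seeing */
    bool hook_saw_group_stop; /* did the hook notice the group stop? */
};

/* Stand-in for the tracing hook: completes a pending stop request. */
static bool model_trace_hook(struct model_task *t)
{
    /* Proposed behaviour: look at group_stop_count before anything else. */
    t->hook_saw_group_stop = t->group_stop_count > 0;
    if (!t->report_pending)
        return false;
    t->report_pending = false;
    t->exit_code = t->trace_exit_code;
    return true;
}

/* Stand-in for the group-stop path; only valid while a stop is in progress. */
static enum stop_kind model_group_stop(struct model_task *t)
{
    assert(t->group_stop_count > 0);
    t->group_stop_count--;
    t->exit_code = t->group_exit_code;
    return STOP_GROUP;
}

/* Ordering today: the group-stop check runs first, the hook is skipped. */
static enum stop_kind signal_path_current(struct model_task *t)
{
    if (t->group_stop_count > 0)
        return model_group_stop(t);
    return model_trace_hook(t) ? STOP_TRACE : STOP_NONE;
}

/* Proposed ordering: the hook runs before the group-stop check. */
static enum stop_kind signal_path_proposed(struct model_task *t)
{
    if (model_trace_hook(t))
        return STOP_TRACE;
    if (t->group_stop_count > 0)
        return model_group_stop(t);
    return STOP_NONE;
}
```

With report_pending set and group_stop_count == 1, signal_path_current()
stops with the group exit code and leaves the report pending, while
signal_path_proposed() completes the pending report first and the hook gets
to see that a group stop is in progress.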

Oleg.
