Re: resume from another engine results in loss of singlestep req.

Oleg Nesterov Wed, 12 Aug 2009 03:40:15 -0700

On 08/11, Roland McGrath wrote:
>
> I think this might be the right fix.  First, we change the order so that
> UTRACE_INTERRUPT prevails over UTRACE_REPORT.  (I'm really not sure why I
> ever had it the other way around.)


and this is consistent with the behaviour utrace_resume() which does nothing
if utrace->interrupt == T, it just returns and relies on utrace_get_signal().

But I wonder if this always correct, please see below.

> Next, we keep track of not only whether
> one engine wanted a report when another wanted stop, but of the prevailing
> resume action (i.e. least value still > UTRACE_STOP).  After stopping and
> resuming, we apply UTRACE_INTERRUPT or UTRACE_REPORT as needed to make sure
> we get the kind of resume report we need.

Agreed, this looks like a right change to me. In any case I think this makes
the code more understandable and logical, and ->resume_action is much better
than ->reports imho.

But, you seem to forget to fix other callers of utrace_stop(),

        utrace_report_syscall_entry
        utrace_finish_vfork
        utrace_report_exit

they all do utrace_stop(false), and with this change this "false" means
UTRACE_STOP.

> @@ -1298,14 +1305,14 @@ static void finish_report(struct utrace_
>  {
>       bool clean = (report->takers && !report->detaches);
>
> -     if (report->action <= UTRACE_REPORT && !utrace->report) {
> -             spin_lock(&utrace->lock);
> -             utrace->report = 1;
> -             set_tsk_thread_flag(task, TIF_NOTIFY_RESUME);
> -     } else if (report->action == UTRACE_INTERRUPT && !utrace->interrupt) {
> +     if (report->action == UTRACE_INTERRUPT && !utrace->interrupt) {
>               spin_lock(&utrace->lock);
>               utrace->interrupt = 1;
>               set_tsk_thread_flag(task, TIF_SIGPENDING);
> +     } else if (report->action <= UTRACE_REPORT && !utrace->report) {
> +             spin_lock(&utrace->lock);
> +             utrace->report = 1;
> +             set_tsk_thread_flag(task, TIF_NOTIFY_RESUME);

Can't we (well, partly) miss UTRACE_STOP request?

Two engines A and B. Say, utrace_report_exec() calls ->report_exec().

A returns UTRACE_STOP, B returns UTRACE_INTERRUPT.

->resume_action = UTRACE_INTERRUPT, finish_report() sets ->interrupt = 1,
utrace_report_exec() returns.

Now. We can't rely on utrace_resume()->finish_resume_report() which can
notice the stop request and do utrace_stop() to actually stop. This is
delayed until utrace_get_signal()->finish_resume_report(), and I am not
sure this is what we really want:

        - minor, but it is possible that tracehook_notify_jctl() will
          be called before tracehook_get_signal(). Not a real problem,
          but shouldn't UTRACE_STOP mean "stop as soon as possible" ?

        - utrace_get_signal() can return without finish_resume_report()
          if B->report_signal() (say) returns UTRACE_SIGNAL_DELIVER.

          Yes, in this case A->report_signal() can return UTRACE_STOP
          again, this means utrace_get_signal() returns with ->report = T.
          But another UTRACE_INTERRUPT can set ->interrupt, this can
          "disable" ->report again, and so on.

In short: unless I missed something, it is hardly possible to predict when
the tracee will actually stop, perhaps UTRACE_STOP needs more respect?

Oleg.

Re: resume from another engine results in loss of singlestep req.

Reply via email to