On 10/15, Roland McGrath wrote:
>
> > Tracee, finish_callback() path:
> >
> >     if (action == UTRACE_DETACH)
> >             engine->ops = utrace_detached_ops;
> >
> >     utrace->reporting = NULL;
> >
> > no barries, no utrace->lock() in between.
> >
> > Tracer, utrace_barrier() under utrace->lock:
> >
> >     if (engine->ops == utrace_detached_ops)
> >             return -EXXX;
> >
> >     if (utrace->reporting != engine)
> >             return 0;
> >
> > Afaics, this is racy and should be fixed in utrace level.
>
> Let me see if I'm following you.  The scenario you mean is that
> utrace_barrier() returns zero when there had been a callback in
> progress but it returned UTRACE_DETACH, and so now the engine is
> detached.  The only guarantee utrace_barrier() claims is ordering
> with respect to the effects of prior utrace_* calls, and a callback
> not being in progress right now.  This scenario meets that interface
> contract as stated.  It's as if utrace_barrier() had returned zero
> before that callback ran.
>
> But you can tell by external means that your callback has started
> before you called utrace_barrier(), so you want it to return -ESRCH
> instead of 0.  Is that right?

I think this is right, but I'd like to be absolutely sure we understand
each other, please see below. As I said, this is very similar to the
problems we discussed in another thread. But this particular race can
be solved (I think) if we change finish_callback_report() to set
->ops = utrace_detached_ops under utrace->lock. In fact I feel this
makes sense anyway, but I need to reread this code with a fresh head.

So. Let's suppose again we have something like

        bool please_detach;

        my_report_callback(...)
        {
                if (please_detach)
                        return UTRACE_DETACH;

                ...
                return UTRACE_WHATEVER_ELSE;
        }

The tracer does:

        please_detach = false;

        err = utrace_barrier();
        if (err)
                // too late!
                return -ESRCH;

        // Great! we cancelled the possible detach.
        // the next invocation of my_report_callback()
        // must see please_detach == false.

        // if this callback was already called and returned
        // UTRACE_DETACH, utrace_barrier() shouldn't succeed.

        // Seriously, if utrace_barrier() was called when the
        // tracee has already started to handle UTRACE_DETACH
        // request inside the reporting loop, a naive user like
        // me would like to expect utrace_barrier() should catch
        // this detach-in-progress somehow.

If it is not supposed utrace_barrier() can be used this way, then I don't
understand any longer what did you mean in
https://www.redhat.com/archives/utrace-devel/2009-October/msg00135.html

        > No, the return calue from utrace_barrier() does not matter. And, if 
the
        > tracee is killed we don't care. The race is different.

        The utrace_barrier return value should cover exactly the case you 
describe,
        ...
        A success return tells you positively that it was not detached.

That is why initially I did

        please_detach = false;

        utrace_barrier(); // we can't trust the returned value

        if (engine->ops != ptrace_utrace_ops)
                return -EXXX;

        // Great! we cancelled the possible detach.

        // the next invocation of my_report_callback()
        // must see please_detach == false.

But. even the "must see" above needs some barriers, afaics. Either in utrace,
or the user should take care.

So far I think it is better to modify utrace.

Oleg.

Reply via email to