> Now. we should make sure that every user of UTRACE_DETACH frees
> engine->data. And, we have to introduce ptrace_report_reap() which
> should free it too.

Right, or you can have a report_death callback (or even just report_quiesce
when passed UTRACE_EVENT(DEATH)) that always detaches.  In that event you
don't really need report_reap.  (In theory, this is how ptrace would be.
It doesn't have any use for utrace once a task has died.  But in practice
it depends entirely on how the wait interference is done--that is outside
of utrace proper, but you could conceivably want to use report_reap as the
mechanism for synchronizing with de_thread(), which is the one place that
ordinarily preempts the ptrace wait interference.)

> Note that this itself adds some complications. But the real problem
> is: the tracer can never safely dereference engine->data, the tracee
> can be killed/released at any moment.

I see two basic models for handling the engine lifetime issues in a utrace
engine.

One way is where you do not normally hold a utrace_engine reference.
You get an initial one from UTRACE_ATTACH_CREATE when you do setup.
Once you have done utrace_set_events() and maybe utrace_control(), you
do utrace_engine_put() immediately.  Then if you have callbacks that
use UTRACE_DETACH (or else have report_reap, where it's implicit) they
are responsible for engine->data.  And indeed you must have some such
if you have an engine->data pointer that needs cleanup.  This model is
probably a good fit for "passive" tracing, where the only thing you
really come back and do later is disable it (or for something simple
enough it doesn't need any cleanup for engine->data).  Then you can do:

        engine = utrace_attach_task(task, UTRACE_ATTACH_MATCH_OPS,
                                   &my_ops, NULL);
        if (IS_ERR(engine)) ...
        ret = utrace_control(task, engine, UTRACE_DETACH);
        if (ret) ...
        clean_up(engine->data);

where -ESRCH/-EALREADY tell you that your report_reap/report_death
(also counts for report_quiesce in the report_death pass) either will
run, has run, or is running, so in that case you rely on them to do:

        clean_up(engine->data);
        return UTRACE_DETACH;

The logic is the same for "disable" rather than "detach", using
utrace_set_events(task, engine, 0) rather than utrace_control().  
The same return values tell you if there is a death race.  In either
case, you'd follow up with utrace_barrier() for -EINPROGRESS unless
the only kind of race that's of concern for your case is a death race
(such as when you're sure it was stopped in a place where you could
get no other callbacks, or you don't have other callbacks anyway).

The second model is where you normally hold a utrace_engine reference
to represent the life of your data structure, separate from the
implicit utrace_engine reference for being attached.  The engine uses
the rule that the code that drops the last engine reference cleans up
the data structure, rather than the code that detaches doing that.
You might have whatever callbacks that decide to use UTRACE_DETACH if
that's appropriate.  But these would not necessarily free your data
structures.  Instead, the control path does that well after detach,
when it does the final utrace_engine_put().

This model seems like a better fit for ptrace to me.  The anchoring data
structure is the tracer's tracees list, which links together the
ptrace_context structs.  A live ptrace_context has an engine pointer and
a ref for it.  When a tracee dies, it uses UTRACE_DETACH and so drops
the implicit engine ref, leaving only the ptrace_context's ref.  The
tracer cleans this all up when it fake-reaps the zombie in wait.  

There is whatever non-utrace magic to turn off the wait interference,
and also the de_thread() race vs wait interference.  But those really
are special to ptrace.  For de_thread() you could do something hairy
with report_reap, or you could just handle it lazily with a task ref and
EXIT_DEAD checks to clean up and ignore it in the next wait.  (So all
the MT exec zombies will be released but not __put_task_struct until the
debugger waits after the PTRACE_EVENT_EXEC/SIGTRAP stop.  Or you could
even make the debugger see all the "zombies" that MT-exec now robs it
of, since their ->exit_code is all it was really going to get anyway and
you can still see it just as well in EXIT_DEAD.  I guess that's not
really any good as a feature if the pid could be reused.)

In the general case, I would expect asynchronous controllers to be going
from their own data structures to utrace_engine + task_struct/struct pid
rather than the other way around.  But you can still use a utrace_attach
lookup if you like.  (It seems handy because it's behind an API, but
you're really just choosing O(n) in the number of unrelated engines on
that task vs O(n) in the number of tracees walking your own data
structure.  That's probably the right choice for ptrace, but you always
have lots of choices.)  Just don't use attachedness to govern the life
of your data structure--that's what gives you trouble.  It's true that
utrace_attach lookups won't work when the engine is no longer attached,
but it also so happens that ptrace() wants to fail with ESRCH whenever
called for a dead or otherwise detached task anyway.

In general, and perhaps for ptrace, I'd expect a utrace module to have
some locking regime governing a list of tracees, and use that to control
removing and deallocating elements from that list.  If it wants to have
some of its utrace callbacks decide to both detach and remove an element
from a list, then that code would obey that locking regime to touch the
list before deallocating and returning UTRACE_DETACH.

> Sure. We can protect ->data by RCU, or add refcounting, etc. But
> this is nightmire. This is complex and (I bet) will lead to numerous
> bugs.

There are lots of ways to go about it.  Part of the point of utrace is
not to tie you down to a particular plan for that.  Fancy users can do
RCU and so forth to be really well-optimized, and we don't want to add
anything to the reporting hot paths where savvy utrace modules can't
optimize it out.  But I'm always open to adding things to make it easier
to write modules (and with fewer bugs), when it doesn't constrain the
best that sophosticated uses can do and doesn't overcomplicate the API
in general.

> I strongly believe that ->data should be freed by utrace_engine_put().
> If you have the reference to engine, engine->data must be stable.

Well, that's a rule you can decide you want for your engine's handling
of its ->data, sure.  You can in fact do that already, since you really
are in control of all the gets and puts.  There is one put that's
implicit in detach, but you always either choose to detach or at least
could be notified.  All other gets and puts are what your engine's code
chose to do.

> So, this patch adds engine->dtor and utrace_engine_set_dtor() helper.

The proper interface for this is just to add .release to utrace_engine_ops.
Because of how we fiddle the fields in detach, we'd still need to put the
function pointer into struct utrace_engine at some point in the
implementation.  But the API doesn't really need to know about that.

I would prefer to find a way to implement it without adding another word
to struct utrace_engine.  But it's probably not worth the effort.
Perhaps one day we'll rethink the reporting races and find a way to
handle detach without changing ->ops (rely on list_empty() or something).
But we can worry about that much later, if at all.

> Afaics, unless a module intentionally leaks utrace_engine, it can safely
> use ->dtor() which lives in ->module_core. But at least it can always
> use ->dtor = kfree.

I've never really figured out all the issues about module unloading for
utrace.  I assume that's what you're referring to here.  The two obvious
safe paths would seem to be either holding module refs whenever an
engine is attached (or now, until your release hook is called), or
synchronously detaching all engines (plus utrace_barrier et al) in the
module_exit function.


Thanks,
Roland

Reply via email to