Re: [RFC, PATCH] teach utrace to destroy engine-data

2009-09-04 Thread Roland McGrath
 Simple answer. Because I do not know how to implement this. At least now.
 I tried to think of this more, but I don't see how to make the first steps.
 
 (Yes, to be honest, this looks like unnecessary complication to me, I have
  to admit. But this is not the reason.)

Well, I am befuddled.  I explained the model and it seems obvious to me how
you'd go about it, the only nontrivial corner being coping with the MT exec
reap case.  But I won't let our mutual befuddlement stand in the way of
making immediate progress the way you like best.

 Agreed, but nobody else cares ;)

Ananth, Srikar, Jim, Frank?  (Beuller?)  We do have some people around
here with some experience dealing with the API.  They made me add the
engine ref counts in the first place, for Pete's sake.  They've got to
have some opinions!

 So, I am going to use the simple ops-release method for now. Once we
 have the working code we can reconsider the lifetime rules for engine/
 engine-data.

Agreed.


Thanks,
Roland



Re: [RFC, PATCH] teach utrace to destroy engine-data

2009-09-04 Thread Ananth N Mavinakayanahalli
On Fri, Sep 04, 2009 at 02:55:29AM -0700, Roland McGrath wrote:
  Simple answer. Because I do not know how to implement this. At least now.
  I tried to think of this more, but I don't see how to make the first steps.
  
  (Yes, to be honest, this looks like unnecessary complication to me, I have
   to admit. But this is not the reason.)
 
 Well, I am befuddled.  I explained the model and it seems obvious to me how
 you'd go about it, the only nontrivial corner being coping with the MT exec
 reap case.  But I won't let our mutual befuddlement stand in the way of
 making immediate progress the way you like best.
 
  Agreed, but nobody else cares ;)
 
 Ananth, Srikar, Jim, Frank?  (Beuller?)  We do have some people around
 here with some experience dealing with the API.  They made me add the
 engine ref counts in the first place, for Pete's sake.  They've got to
 have some opinions!

I am not sure if all of what I say here makes any sense as I am still in
the midst of understanding the discussion threads between the two of you
(admittedly, its one hairy piece of the kernel). However, here goes:

I prefer Roland's ops-release approach. However, is there any chance
that -ops is no longer valid at the time __utrace_release_engine() is
called? For the ptrace case at least, it'll always be valid, so there
are no issues.

The concept of how utrace engines use refcnts initially befuddled me.
The entity that does a utrace_attach_*, without it having to explicitly
ask for a utrace_engine_get(), already owns a reference. That was
counter intuitive. (That's the reason for a series of Srikar's
utrace_engine_put() fixes too, I guess).

That apart, the problem of freeing -data is similar to what Prasad is
trying to grapple in hardware watchpoints and its ptrace interaction. I
guess its a problem unique to ptrace(?). The issue there, is that when
ptrace did a register_user_hardware_watchpoint(), and that thread later
died/exec'd, ptrace may (will?) not actually unregister the watchpoint,
and the hwbkpt layer is left to fend for itself in cleaning up the
struct hw_breakpoint in flush_thread_hardware_watchpoint(). Now, you
could have non-ptrace users of the interface (who may have passed just
a reference to a statically allocated struct hw_breakpoint) in which case,
freeing the structure in the hwbkpt layer is just wrong. Right now, the
way Prasad has fixed it is to have a check if the
bp-triggered == ptrace_triggered, in which case, free up the structure.

Yes, there is a kind of layering violation here -- hwbkpt now knows
about ptrace_triggered. But I am not sure if there is a better way of
fixing it.

Ananth



Re: [RFC, PATCH] teach utrace to destroy engine-data

2009-09-04 Thread Oleg Nesterov
On 09/04, Ananth N Mavinakayanahalli wrote:

 I prefer Roland's ops-release approach.

Yes! me too.

 However, is there any chance
 that -ops is no longer valid

Sure, -ops can be changed at any time. For example, just because of
UTRACE_DETACH.

That is why utrace_attach_task() saves -release pointer in engine.
But this is transparent for the user, and engine-release is private.
We can change the implementation so that -ops is never changed.

 The concept of how utrace engines use refcnts initially befuddled me.
 The entity that does a utrace_attach_*, without it having to explicitly
 ask for a utrace_engine_get(), already owns a reference. That was
 counter intuitive.

Otherwise, how can we use the returned engine? The task can be killed
right after utrace_attach_ and disappear. In this case release_task()
path must do utrace_engine_put().

This looks very natural to me.

Oleg.



Re: [RFC, PATCH] teach utrace to destroy engine-data

2009-08-18 Thread Oleg Nesterov
On 08/17, Roland McGrath wrote:

 If we do add the destructor hook, I think it looks like this.
 ...
 @@ -271,6 +273,7 @@ struct utrace_engine *utrace_attach_task(
   engine-flags = 0;
   engine-ops = ops;
   engine-data = data;
 + engine-release = ops-release;

Agreed! this is much better.

Oleg.



Re: [RFC, PATCH] teach utrace to destroy engine-data

2009-08-18 Thread Roland McGrath
  (or even just report_quiesce
  when passed UTRACE_EVENT(DEATH)) that always detaches.
 
 Hmm. Not sure I understand how can we do without hooking death or reap...

UTRACE_EVENT(QUIESCE) constitutes hooking death.
A report_quiesce (UTRACE_EVENT(DEATH)) call precedes the report_death call.

  synchronizing with de_thread(), which is the one place that
  ordinarily preempts the ptrace wait interference.)
 
 Not only de_thread(). If the tracee is killed, another thread can
 do wait()-release_task(). This means even _reap (instead of _death)
 can't save us.

That really has nothing to do with utrace.  That's another you path that
you control and can make do what you need it to.  But, it is immaterial to
the point I made: you can use report_reap as the means to interlock with
that too if you so chose.  It seems like a goofy choice for that, since it
would be your code-release_task()-utrace_release_task()-your report_reap()
and you might as well just do your code-your code.

  I see two basic models for handling the engine lifetime issues in a utrace
  engine.
 
  One way is where you do not normally hold a utrace_engine reference.
[...]
 I must have missed something.
 
 Sure, it is more or less trivial to free engine-data safely. I mean,
 it is easy to avoid the double-free/etc problems.
 
 But I can't understand how the tracer can use -data safely. Whatever
 the tracer does, the tracee can be killed at any moment. It can
 pass -report_death/reap events under us.

What you missed is where I said this model is appropriate for situations
where you don't need to do that:

   This model is
  probably a good fit for passive tracing, where the only thing you
  really come back and do later is disable it (or for something simple
  enough it doesn't need any cleanup for engine-data).

  The second model is where you normally hold a utrace_engine reference
[...]
 And this means we need some hairy tricks.

That is an utterly vague blanket statement to make.

 Yes sure. If you don't apply your patch which uses ops-release(), then I
 am going to add engine-data-in_use for reports which return UTRACE_DETACH
 or -report_death. If the tracer needs engine-data it should use
 ptrace_get_context/ptrace_put_context. I am not sure we can avoid taking
 utrace-lock for that...

I don't really follow this at all.  
Your ptrace code has nothing to do with utrace-lock, and never will.


Thanks,
Roland



Re: [RFC, PATCH] teach utrace to destroy engine-data

2009-08-17 Thread Roland McGrath
 Now. we should make sure that every user of UTRACE_DETACH frees
 engine-data. And, we have to introduce ptrace_report_reap() which
 should free it too.

Right, or you can have a report_death callback (or even just report_quiesce
when passed UTRACE_EVENT(DEATH)) that always detaches.  In that event you
don't really need report_reap.  (In theory, this is how ptrace would be.
It doesn't have any use for utrace once a task has died.  But in practice
it depends entirely on how the wait interference is done--that is outside
of utrace proper, but you could conceivably want to use report_reap as the
mechanism for synchronizing with de_thread(), which is the one place that
ordinarily preempts the ptrace wait interference.)

 Note that this itself adds some complications. But the real problem
 is: the tracer can never safely dereference engine-data, the tracee
 can be killed/released at any moment.

I see two basic models for handling the engine lifetime issues in a utrace
engine.

One way is where you do not normally hold a utrace_engine reference.
You get an initial one from UTRACE_ATTACH_CREATE when you do setup.
Once you have done utrace_set_events() and maybe utrace_control(), you
do utrace_engine_put() immediately.  Then if you have callbacks that
use UTRACE_DETACH (or else have report_reap, where it's implicit) they
are responsible for engine-data.  And indeed you must have some such
if you have an engine-data pointer that needs cleanup.  This model is
probably a good fit for passive tracing, where the only thing you
really come back and do later is disable it (or for something simple
enough it doesn't need any cleanup for engine-data).  Then you can do:

engine = utrace_attach_task(task, UTRACE_ATTACH_MATCH_OPS,
   my_ops, NULL);
if (IS_ERR(engine)) ...
ret = utrace_control(task, engine, UTRACE_DETACH);
if (ret) ...
clean_up(engine-data);

where -ESRCH/-EALREADY tell you that your report_reap/report_death
(also counts for report_quiesce in the report_death pass) either will
run, has run, or is running, so in that case you rely on them to do:

clean_up(engine-data);
return UTRACE_DETACH;

The logic is the same for disable rather than detach, using
utrace_set_events(task, engine, 0) rather than utrace_control().  
The same return values tell you if there is a death race.  In either
case, you'd follow up with utrace_barrier() for -EINPROGRESS unless
the only kind of race that's of concern for your case is a death race
(such as when you're sure it was stopped in a place where you could
get no other callbacks, or you don't have other callbacks anyway).

The second model is where you normally hold a utrace_engine reference
to represent the life of your data structure, separate from the
implicit utrace_engine reference for being attached.  The engine uses
the rule that the code that drops the last engine reference cleans up
the data structure, rather than the code that detaches doing that.
You might have whatever callbacks that decide to use UTRACE_DETACH if
that's appropriate.  But these would not necessarily free your data
structures.  Instead, the control path does that well after detach,
when it does the final utrace_engine_put().

This model seems like a better fit for ptrace to me.  The anchoring data
structure is the tracer's tracees list, which links together the
ptrace_context structs.  A live ptrace_context has an engine pointer and
a ref for it.  When a tracee dies, it uses UTRACE_DETACH and so drops
the implicit engine ref, leaving only the ptrace_context's ref.  The
tracer cleans this all up when it fake-reaps the zombie in wait.  

There is whatever non-utrace magic to turn off the wait interference,
and also the de_thread() race vs wait interference.  But those really
are special to ptrace.  For de_thread() you could do something hairy
with report_reap, or you could just handle it lazily with a task ref and
EXIT_DEAD checks to clean up and ignore it in the next wait.  (So all
the MT exec zombies will be released but not __put_task_struct until the
debugger waits after the PTRACE_EVENT_EXEC/SIGTRAP stop.  Or you could
even make the debugger see all the zombies that MT-exec now robs it
of, since their -exit_code is all it was really going to get anyway and
you can still see it just as well in EXIT_DEAD.  I guess that's not
really any good as a feature if the pid could be reused.)

In the general case, I would expect asynchronous controllers to be going
from their own data structures to utrace_engine + task_struct/struct pid
rather than the other way around.  But you can still use a utrace_attach
lookup if you like.  (It seems handy because it's behind an API, but
you're really just choosing O(n) in the number of unrelated engines on
that task vs O(n) in the number of tracees walking your own data
structure.  That's probably the right choice for ptrace, but you always
have lots of choices.)  Just don't use