Re: [PATCH 3/4] utrace_set_events: fix UTRACE_EVENT(REAP) case

2010-08-17 Thread Oleg Nesterov
On 08/16, Roland McGrath wrote:

  - It is possible that both -death and -reap are true. In this
case it is OK to clear UTRACE_EVENT(REAP), but set_events fails.

 No, it's not OK to clear it.  Once -reap is set, then the engine's
 ops-report_reap might or might not have been called already.

Afaics - no.

If utrace-death is set (and we check it under utrace-lock) we can
ignore utrace-reap.

In short, if ops-report_reap can be called before -death is cleared,
then 2 possible callers of utrace_maybe_reap() can race with each other,
but this can't happen.

utrace-death == T means:

- (utrace_flags  _UTRACE_DEATH_EVENTS) == T

- utrace-death was set by utrace_report_death() which will take
  utrace-lock later and clear -death, only then it may call
  ops-report_reap().

- until utrace_report_death() clears -death, _UTRACE_DEATH_EVENTS
  must be set in -utrace_flags, otherwise utrace_maybe_reap(true)
  is buggy.

  Note that both utrace-death and _UTRACE_DEATH_EVENTS are cleared
  atomically from utrace-lock pov.

IOW. utrace-death is true, then

IF (utrace-reap)

tracehook_prepare_releas()-utrace_maybe_reap(true) was
already called, this is how utrace-reap was set.

But utrace_maybe_reap() did nothing and returned.
We rely on the subsequent utrace_maybe_reap(false) from
utrace_report_death() - but this can't happen until
we drop utrace-lock

ELSE
utrace-reap can't be set until we drop utrace-lock.   


Now that you merged c93fecc925ea7567168f0c94414b9021de2708c5
get_utrace_lock() must not succeed if utrace-reap == T, this becomes
a bit off-topic. However, I thought about relaxing the dead check in
get_utrace_lock(), instead of utrace-reap we could check
utrace-reap  !utrace-death. In fact, initially I was going to
do this, but then decided to make the simpler patch for now.

Oleg.



Re: [PATCH 2/4] utrace_set_events: consolidate setting _UTRACE_DEATH_EVENTS checks

2010-08-17 Thread Oleg Nesterov
On 08/16, Roland McGrath wrote:

  In short: this double check allows to avoid tasklist when utrace_flags
  already has these bits while engine-flags doesn't.
 
  But multitracing is unlikely case, in the likely case if we add
  _UTRACE_DEATH_EVENTS to engine-flags we set these bits in utrace_flags
  too. I think it makes sense to consolidate these checks to make the
  code a bit more understandable.

 I don't really agree about unlikely case.  In many uses, the systemtap
 task-finder will have a utrace engine on every task in the system, for
 example.  Moreover, this is an importantly distinct particular kind of
 micro-optimization to be doing here: avoiding a system-wide lock.  Any
 place that we take tasklist_lock at all, we are introducing a system-wide
 slowdown or limitation on scaling, just because of our tracing of one task.
 So optimizing the minimize that is qualitatively different and really much
 more important than avoiding taking the utrace lock, for example.

 But with that note in your mind, I am happy to take this patch now as a
 simplification and let reoptimization come back later.

OK, agreed.

Oleg.



Re: [PATCH 0/3] UTRACE_DETACH fixes

2010-08-17 Thread Oleg Nesterov
On 08/17, Oleg Nesterov wrote:

 Suppose that, say, engine-flags == UTRACE_EVENT(EXEC) (no QUIESCE).

   1. start_callback() reads want = engine-flags (== EXEC)

   2. mark_engine_detached() sets engine-ops = utrace_detached_ops

   3. start_callback() gets ops = utrace_detached_ops

 After that start_callback() skips if (want  UTRACE_EVENT(QUIESCE)) block
 and returns utrace_detached_ops, utrace_detached_ops-report_exec == NULL
 will be called.

OK, instead of filling utrace_detached_ops{} we can change start_callback()

- if (want  UTRACE_EVENT(QUIESCE)) {
+ if ((want  UTRACE_EVENT(QUIESCE)) || ops == detached_ops) {


 But this is minor. Roland, I can't explain this, but I have the strong
 gut feeling there is something else we should worry about. I am still
 reading this code...

Or not... I'll recheck once again tomorrow and try to make the patch.

In particular, I'd like to think if we can simplify this somehow.
Say, avoid the barriers somewhere, or avoid changing engine-flags in
mark_engine_detached().

I am worried that mark_engine_detached() sets engine-flags, but it
doesn't add QUIESCE to utrace_flags. I can't convince myself that
utrace_do_stop() closes all possible races.


And, in particular, I don't understand this code in utrace_get_signal:

} else if (!(task-utrace_flags  UTRACE_EVENT(QUIESCE))) {
/*
 * We only got here to clear utrace-signal_handler.
 */
return -1;
}

How so? What if we were called because of utrace_control(INTERRUPT)
and QUIESCE is not set?

Oleg.