Re: [PATCH 2/3] fix utrace_control(DETACH) utrace-death interaction

2010-08-18 Thread Oleg Nesterov
On 08/17, Oleg Nesterov wrote:

 On 08/16, Roland McGrath wrote:
 
   The problem is, utrace_control(DETACH) does nothing and returns
   -EALREADY if utrace-death is set, this is not right. We can and
   should detach in this case, we only should skip utrace_reset() to
   avoid the race with utrace_report_death()-REPORT_CALLBACKS().
 
  This behavior is the original (minimal) synchronization scheme from before
  we had utrace_barrier.  See Interlock with final callbacks in
  Documentation/DocBook/utrace.tmpl.

 OK, I agree my patch breaks this part

   Normally
   functionutrace_control/function called with
   constantUTRACE_DETACH/constant returns zero, and this means that no
   more callbacks will be made.

 of documentation (which I never read ;)

Wait. It doesn't break this. It only breaks -EALREADY contract. And
I don't understand why this is bad.

 But in any case, personally I dislike the current behaviour anyway,
 I think this certainly complicates the life for module writers.
 Instead of simple detach + barrier, you always need the nontrivial
 code if report_quiesce/death ever touches engine-data! This can't
 be good.

Yes.

Roland, could you explain once again why do you dislike this patch?


Once again. Currently utrace_control(DETACH) refuses to even try to
detach the engine if utrace-death is set.

Why? What is the point? What makes UTRACE_EVENT(DEATH) so special?
I do not see the logic at all.

If -report_death() does something which the caller of utrace_control()
should know, then they should take care of synchronization anyway.
With or without this patch, utrace_control(DEATH) can return 0 after
-report_death() was already called.


Currently, utrace_control(DETACH) returns -EALREADY when this callback
was alredy called, or it can be called later, or it may be running. And
utrace_barrier() can't help.

With this patch utrace_control(DETACH) returns either 0 with the same
meaning (will not be called later, but probably was already called
before utrace_control), or -EINPROGRESS which suggests to use
utrace_barrier().


Please correct me, but I think this certainly makes things much simpler.
Otherwise UTRACE_DETACH is never trivial (and iiuc it is better to
avoid utrace_engine_ops-release() hook).

What do you think?

Oleg.



[PATCH] fix mark_engine_detached() vs start_callback() race

2010-08-18 Thread Oleg Nesterov
Suppose that engine-flags == UTRACE_EVENT(EXEC), QUIESCE bit is not set.

1. start_callback() reads want = engine-flags (== EXEC)

2. mark_engine_detached() sets engine-ops = utrace_detached_ops

3. start_callback() gets ops = utrace_detached_ops

After that start_callback() skips if (want  UTRACE_EVENT(QUIESCE))
block and returns utrace_detached_ops, then -report_exec == NULL
will be called.

This is the minimal temporary ugly fix for now, we should certainly
cleanup and simplify this logic. The barriers in mark_engine_detached()
and in start_callback() can't help and should be removed. If we ignore
utrace_get_signal() we do not even need utrace_detached_quiesce(),
start_callback() could just do

ops = engine-ops;

if (ops == utrace_detached_ops) {
report-detaches = true;
return NULL;
}

I think in the longer term mark_engine_detached() should not change
engine-flags at all but add QUIESCE to -utrace_flags. However, this
breaks utrace_maybe_reap(reap = true) and we should avoid the race
with finish_callback() which clears -reporting after report_quiesce().

A bit off-topic, but I don't think finish_callback() should check
engine-ops == utrace_detached_ops before return. Instead we should
change finish_callback_report() to return the boolean. We shouldn't
return true without setting report-detaches.

Signed-off-by: Oleg Nesterov o...@redhat.com
---

 kernel/utrace.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- kstub/kernel/utrace.c~8_fix_mark_detached_without_quiesce   2010-08-18 
16:46:08.0 +0200
+++ kstub/kernel/utrace.c   2010-08-18 17:47:53.0 +0200
@@ -1522,7 +1522,7 @@ static const struct utrace_engine_ops *s
smp_rmb();
ops = engine-ops;
 
-   if (want  UTRACE_EVENT(QUIESCE)) {
+   if ((want  UTRACE_EVENT(QUIESCE)) || ops == utrace_detached_ops) {
if (finish_callback(task, utrace, report, engine,
(*ops-report_quiesce)(report-action,
   engine, event)))



[PATCH?] avoid the unnecessary utrace_resume()-utrace_reset()

2010-08-18 Thread Oleg Nesterov
utrace_resume(UTRACE_REPORT) always calls utrace_reset() because
start_callback() obviously can't clear report-spurious when
event == 0.

Change start_callback() to correctly clear -spurious in this case.
We could probably clear it in utrace_resume() unconditionally after
reporting loop, but this is not exactly right if there are no engines
which have QUIESCE in engine-flags.

utrace_get_signal() probably has the same problem.

Note: utrace_control(DETACH) does utrace_do_stop() and sets UTRACE_REPORT
if the tracee is not stopped. It also does mark_engine_detached() which
does not set QUIESCE in target-utrace_flags. This means we rely on
report.spurious which should provoke utrace_reset() from utrace_resume()
if target-utrace_flags doesn't have QUIESCE. A bit too subtle, imho.
Also, UTRACE_REPORT can be lost because of UTRACE_INTERRUPT or normal
signal: utrace_get_signal() checks utrace_flags  UTRACE_EVENT(QUIESCE)
and returns otherwise. This should be fixed somehow. This check is wrong
anyway, but it is not clear how we can fix the race with DETACH.

Signed-off-by: Oleg Nesterov o...@redhat.com
---

 kernel/utrace.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- kstub/kernel/utrace.c~10_utrace_resume_and_spurious 2010-08-18 
19:00:50.0 +0200
+++ kstub/kernel/utrace.c   2010-08-18 19:41:05.0 +0200
@@ -1540,7 +1540,7 @@ static const struct utrace_engine_ops *s
if (want  ENGINE_STOP)
report-action = UTRACE_STOP;
 
-   if (want  event) {
+   if (want  (event ?: UTRACE_EVENT(QUIESCE))) {
report-spurious = false;
return ops;
}