Re: [PATCH 2/3] fix utrace_control(DETACH) utrace-death interaction
On 08/17, Oleg Nesterov wrote: On 08/16, Roland McGrath wrote: The problem is, utrace_control(DETACH) does nothing and returns -EALREADY if utrace-death is set, this is not right. We can and should detach in this case, we only should skip utrace_reset() to avoid the race with utrace_report_death()-REPORT_CALLBACKS(). This behavior is the original (minimal) synchronization scheme from before we had utrace_barrier. See Interlock with final callbacks in Documentation/DocBook/utrace.tmpl. OK, I agree my patch breaks this part Normally functionutrace_control/function called with constantUTRACE_DETACH/constant returns zero, and this means that no more callbacks will be made. of documentation (which I never read ;) Wait. It doesn't break this. It only breaks -EALREADY contract. And I don't understand why this is bad. But in any case, personally I dislike the current behaviour anyway, I think this certainly complicates the life for module writers. Instead of simple detach + barrier, you always need the nontrivial code if report_quiesce/death ever touches engine-data! This can't be good. Yes. Roland, could you explain once again why do you dislike this patch? Once again. Currently utrace_control(DETACH) refuses to even try to detach the engine if utrace-death is set. Why? What is the point? What makes UTRACE_EVENT(DEATH) so special? I do not see the logic at all. If -report_death() does something which the caller of utrace_control() should know, then they should take care of synchronization anyway. With or without this patch, utrace_control(DEATH) can return 0 after -report_death() was already called. Currently, utrace_control(DETACH) returns -EALREADY when this callback was alredy called, or it can be called later, or it may be running. And utrace_barrier() can't help. With this patch utrace_control(DETACH) returns either 0 with the same meaning (will not be called later, but probably was already called before utrace_control), or -EINPROGRESS which suggests to use utrace_barrier(). Please correct me, but I think this certainly makes things much simpler. Otherwise UTRACE_DETACH is never trivial (and iiuc it is better to avoid utrace_engine_ops-release() hook). What do you think? Oleg.
[PATCH] fix mark_engine_detached() vs start_callback() race
Suppose that engine-flags == UTRACE_EVENT(EXEC), QUIESCE bit is not set. 1. start_callback() reads want = engine-flags (== EXEC) 2. mark_engine_detached() sets engine-ops = utrace_detached_ops 3. start_callback() gets ops = utrace_detached_ops After that start_callback() skips if (want UTRACE_EVENT(QUIESCE)) block and returns utrace_detached_ops, then -report_exec == NULL will be called. This is the minimal temporary ugly fix for now, we should certainly cleanup and simplify this logic. The barriers in mark_engine_detached() and in start_callback() can't help and should be removed. If we ignore utrace_get_signal() we do not even need utrace_detached_quiesce(), start_callback() could just do ops = engine-ops; if (ops == utrace_detached_ops) { report-detaches = true; return NULL; } I think in the longer term mark_engine_detached() should not change engine-flags at all but add QUIESCE to -utrace_flags. However, this breaks utrace_maybe_reap(reap = true) and we should avoid the race with finish_callback() which clears -reporting after report_quiesce(). A bit off-topic, but I don't think finish_callback() should check engine-ops == utrace_detached_ops before return. Instead we should change finish_callback_report() to return the boolean. We shouldn't return true without setting report-detaches. Signed-off-by: Oleg Nesterov o...@redhat.com --- kernel/utrace.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- kstub/kernel/utrace.c~8_fix_mark_detached_without_quiesce 2010-08-18 16:46:08.0 +0200 +++ kstub/kernel/utrace.c 2010-08-18 17:47:53.0 +0200 @@ -1522,7 +1522,7 @@ static const struct utrace_engine_ops *s smp_rmb(); ops = engine-ops; - if (want UTRACE_EVENT(QUIESCE)) { + if ((want UTRACE_EVENT(QUIESCE)) || ops == utrace_detached_ops) { if (finish_callback(task, utrace, report, engine, (*ops-report_quiesce)(report-action, engine, event)))
[PATCH?] avoid the unnecessary utrace_resume()-utrace_reset()
utrace_resume(UTRACE_REPORT) always calls utrace_reset() because start_callback() obviously can't clear report-spurious when event == 0. Change start_callback() to correctly clear -spurious in this case. We could probably clear it in utrace_resume() unconditionally after reporting loop, but this is not exactly right if there are no engines which have QUIESCE in engine-flags. utrace_get_signal() probably has the same problem. Note: utrace_control(DETACH) does utrace_do_stop() and sets UTRACE_REPORT if the tracee is not stopped. It also does mark_engine_detached() which does not set QUIESCE in target-utrace_flags. This means we rely on report.spurious which should provoke utrace_reset() from utrace_resume() if target-utrace_flags doesn't have QUIESCE. A bit too subtle, imho. Also, UTRACE_REPORT can be lost because of UTRACE_INTERRUPT or normal signal: utrace_get_signal() checks utrace_flags UTRACE_EVENT(QUIESCE) and returns otherwise. This should be fixed somehow. This check is wrong anyway, but it is not clear how we can fix the race with DETACH. Signed-off-by: Oleg Nesterov o...@redhat.com --- kernel/utrace.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- kstub/kernel/utrace.c~10_utrace_resume_and_spurious 2010-08-18 19:00:50.0 +0200 +++ kstub/kernel/utrace.c 2010-08-18 19:41:05.0 +0200 @@ -1540,7 +1540,7 @@ static const struct utrace_engine_ops *s if (want ENGINE_STOP) report-action = UTRACE_STOP; - if (want event) { + if (want (event ?: UTRACE_EVENT(QUIESCE))) { report-spurious = false; return ops; }