Re: [PATCH 3/4] utrace_set_events: fix UTRACE_EVENT(REAP) case
On 08/19, Roland McGrath wrote: On 08/16, Roland McGrath wrote: Now that you merged c93fecc925ea7567168f0c94414b9021de2708c5 get_utrace_lock() must not succeed if utrace-reap == T, this becomes a bit off-topic. However, I thought about relaxing the dead check in get_utrace_lock(), instead of utrace-reap we could check utrace-reap !utrace-death. In fact, initially I was going to do this, but then decided to make the simpler patch for now. When utrace-reap is set, it means release_task() has been called. The API caller has no way to know reaping hasn't already begun--except if its report_death callback has not returned yet. No! the task can be already reaped even before -report_death() was called. The task_struct itself can't go away, but that is all (perhaps you meant this). And this is the problem for ugdb. Damn, I knew this, that is why ugdb_process_exit() doesn't try to read -group_exit_code. Still I managed to forget that this has other implications. But I think that the API definition of once release_task() has been called (i.e. entered, maybe not returned yet), any utrace call will get -ESRCH, is a clear and comprehensible constraint. We should not open any holes in that. I am not sure. I had another reason in mind to check reap !death in get_utrace_lock(), synchronization with -report_death() callback. But OK. Let's forget about more utrace changes for now. Oleg.
Re: [PATCH 3/4] utrace_set_events: fix UTRACE_EVENT(REAP) case
On 08/16, Roland McGrath wrote: - It is possible that both -death and -reap are true. In this case it is OK to clear UTRACE_EVENT(REAP), but set_events fails. No, it's not OK to clear it. Once -reap is set, then the engine's ops-report_reap might or might not have been called already. Afaics - no. If utrace-death is set (and we check it under utrace-lock) we can ignore utrace-reap. In short, if ops-report_reap can be called before -death is cleared, then 2 possible callers of utrace_maybe_reap() can race with each other, but this can't happen. Right. If -death is still set, it means utrace_report_death is still running. So the utrace internals know that the report_reap callbacks haven't started yet. But from the utrace API perspective, all you can know is that release_task() has already been called. So I think it's right for the API not to let you clear UTRACE_EVENT(REAP) at that point. utrace-death == T means: - (utrace_flags _UTRACE_DEATH_EVENTS) == T Yes. - utrace-death was set by utrace_report_death() which will take utrace-lock later and clear -death, only then it may call ops-report_reap(). Yes. - until utrace_report_death() clears -death, _UTRACE_DEATH_EVENTS must be set in -utrace_flags, otherwise utrace_maybe_reap(true) is buggy. Yes. Note that both utrace-death and _UTRACE_DEATH_EVENTS are cleared atomically from utrace-lock pov. Yes. IOW. utrace-death is true, then IF (utrace-reap) tracehook_prepare_releas()-utrace_maybe_reap(true) was already called, this is how utrace-reap was set. Meaning release_task() was called--semantically reaping may have begun. But utrace_maybe_reap() did nothing and returned. We rely on the subsequent utrace_maybe_reap(false) from utrace_report_death() - but this can't happen until we drop utrace-lock Correct. ELSE utrace-reap can't be set until we drop utrace-lock. Correct. Now that you merged c93fecc925ea7567168f0c94414b9021de2708c5 get_utrace_lock() must not succeed if utrace-reap == T, this becomes a bit off-topic. However, I thought about relaxing the dead check in get_utrace_lock(), instead of utrace-reap we could check utrace-reap !utrace-death. In fact, initially I was going to do this, but then decided to make the simpler patch for now. When utrace-reap is set, it means release_task() has been called. The API caller has no way to know reaping hasn't already begun--except if its report_death callback has not returned yet. But I think that the API definition of once release_task() has been called (i.e. entered, maybe not returned yet), any utrace call will get -ESRCH, is a clear and comprehensible constraint. We should not open any holes in that. Thanks, Roland
Re: [PATCH 3/4] utrace_set_events: fix UTRACE_EVENT(REAP) case
On 08/16, Roland McGrath wrote: - It is possible that both -death and -reap are true. In this case it is OK to clear UTRACE_EVENT(REAP), but set_events fails. No, it's not OK to clear it. Once -reap is set, then the engine's ops-report_reap might or might not have been called already. Afaics - no. If utrace-death is set (and we check it under utrace-lock) we can ignore utrace-reap. In short, if ops-report_reap can be called before -death is cleared, then 2 possible callers of utrace_maybe_reap() can race with each other, but this can't happen. utrace-death == T means: - (utrace_flags _UTRACE_DEATH_EVENTS) == T - utrace-death was set by utrace_report_death() which will take utrace-lock later and clear -death, only then it may call ops-report_reap(). - until utrace_report_death() clears -death, _UTRACE_DEATH_EVENTS must be set in -utrace_flags, otherwise utrace_maybe_reap(true) is buggy. Note that both utrace-death and _UTRACE_DEATH_EVENTS are cleared atomically from utrace-lock pov. IOW. utrace-death is true, then IF (utrace-reap) tracehook_prepare_releas()-utrace_maybe_reap(true) was already called, this is how utrace-reap was set. But utrace_maybe_reap() did nothing and returned. We rely on the subsequent utrace_maybe_reap(false) from utrace_report_death() - but this can't happen until we drop utrace-lock ELSE utrace-reap can't be set until we drop utrace-lock. Now that you merged c93fecc925ea7567168f0c94414b9021de2708c5 get_utrace_lock() must not succeed if utrace-reap == T, this becomes a bit off-topic. However, I thought about relaxing the dead check in get_utrace_lock(), instead of utrace-reap we could check utrace-reap !utrace-death. In fact, initially I was going to do this, but then decided to make the simpler patch for now. Oleg.
[PATCH 3/4] utrace_set_events: fix UTRACE_EVENT(REAP) case
I am not sure this fix is really needed, up to you. But please note that this code if ((utrace-death (cleared _UTRACE_DEATH_EVENTS)) || (utrace-reap (cleared UTRACE_EVENT(REAP { spin_unlock(utrace-lock); return -EALREADY; } is not exactly right, it has 2 minor problems: - It is possible that both -death and -reap are true. In this case it is OK to clear UTRACE_EVENT(REAP), but set_events fails. - OTOH, if -reap is true, set_events disallows clear but allows set, the latter case should be forbidden too. If utrace_set_events(UTRACE_EVENT(REAP)) succeeds, the caller has all rights to expect -report_reap() will be caller later. If you take this patch, then please consider the next one. Signed-off-by: Oleg Nesterov o...@redhat.com --- kernel/utrace.c | 16 ++-- 1 file changed, 10 insertions(+), 6 deletions(-) --- kstub/kernel/utrace.c~3_minor_death_reap_fix2010-08-14 04:30:28.0 +0200 +++ kstub/kernel/utrace.c 2010-08-14 04:30:29.0 +0200 @@ -541,12 +541,16 @@ int utrace_set_events(struct task_struct /* If -death or -reap is true we must see exit_state != 0. */ if (target-exit_state) { - unsigned long cleared = (old_flags ~events); - - if ((utrace-death (cleared _UTRACE_DEATH_EVENTS)) || - (utrace-reap (cleared UTRACE_EVENT(REAP { - spin_unlock(utrace-lock); - return -EALREADY; + if (utrace-death) { + if ((old_flags ~events) _UTRACE_DEATH_EVENTS) { + spin_unlock(utrace-lock); + return -EALREADY; + } + } else if (utrace-reap) { + if ((old_flags ^ events) UTRACE_EVENT(REAP)) { + spin_unlock(utrace-lock); + return -EALREADY; + } } }