> On 08/16, Roland McGrath wrote: > > > > > - It is possible that both ->death and ->reap are true. In this > > > case it is OK to clear UTRACE_EVENT(REAP), but set_events fails. > > > > No, it's not OK to clear it. Once ->reap is set, then the engine's > > ops->report_reap might or might not have been called already. > > Afaics - no. > > If utrace->death is set (and we check it under utrace->lock) we can > ignore utrace->reap. > > In short, if ops->report_reap can be called before ->death is cleared, > then 2 possible callers of utrace_maybe_reap() can race with each other, > but this can't happen.
Right. If ->death is still set, it means utrace_report_death is still running. So the utrace internals know that the report_reap callbacks haven't started yet. But from the utrace API perspective, all you can know is that release_task() has already been called. So I think it's right for the API not to let you clear UTRACE_EVENT(REAP) at that point. > utrace->death == T means: > > - (utrace_flags & _UTRACE_DEATH_EVENTS) == T Yes. > - utrace->death was set by utrace_report_death() which will take > utrace->lock later and clear ->death, only then it may call > ops->report_reap(). Yes. > - until utrace_report_death() clears ->death, _UTRACE_DEATH_EVENTS > must be set in ->utrace_flags, otherwise utrace_maybe_reap(true) > is buggy. Yes. > Note that both utrace->death and _UTRACE_DEATH_EVENTS are cleared > "atomically" from utrace->lock pov. Yes. > IOW. utrace->death is true, then > > IF (utrace->reap) > > tracehook_prepare_releas()->utrace_maybe_reap(true) was > already called, this is how utrace->reap was set. Meaning release_task() was called--semantically "reaping may have begun". > But utrace_maybe_reap() did nothing and returned. > We rely on the subsequent utrace_maybe_reap(false) from > utrace_report_death() - but this can't happen until > we drop utrace->lock Correct. > ELSE > utrace->reap can't be set until we drop utrace->lock. Correct. > Now that you merged c93fecc925ea7567168f0c94414b9021de2708c5 > "get_utrace_lock() must not succeed if utrace->reap == T", this becomes > a bit off-topic. However, I thought about relaxing the "dead" check in > get_utrace_lock(), instead of utrace->reap we could check > "utrace->reap && !utrace->death". In fact, initially I was going to > do this, but then decided to make the simpler patch for now. When utrace->reap is set, it means release_task() has been called. The API caller has no way to know reaping hasn't already begun--except if its report_death callback has not returned yet. But I think that the API definition of "once release_task() has been called (i.e. entered, maybe not returned yet), any utrace call will get -ESRCH", is a clear and comprehensible constraint. We should not open any holes in that. Thanks, Roland