Date:        Mon, 11 May 2020 11:03:15 +0000
    From:        "Kamil Rytarowski" <ka...@netbsd.org>
    Message-ID:  <20200511110315.54b13f...@cvs.netbsd.org>

  | Do not fail when trying to kill a dying process
  |
  | A dying process can disappear for a while. Rather than aborting, retry
  | sending SIGKILL to it.

I don't understand this ... a process should never be able to
disappear and then reappear (not in any way).   If a SIGKILL (or
ptrace(PT_KILL) fails with a "no such process" error, then repeating
it won't (or shouldn't) help - if it does, there's a kernel bug that
needs fixing (and it is OK for the test to fail until that happens.)

Further, if the reason for this failure is that the process is
dying, you probably never needed the kill in the first place (and
no, I don't mean it should be deleted - the parent is unlikely to
know the state of the child, so killing it, if that is what is needed
is the right thing to do ... just that if the kill fails because you
were too late issuing it, it isn't an error, just a race that you lost,
and certainly shouldn't be repeated).

But more than that, adding an infinite loop to the test, where you keep
doing the kill forever until it succeeds, or errno somehow stops being
ESRCH looks like a recipe for disaster.

Just do the kill once, ignore the error if it is ESRCH (and probably
also ECHILD) report other errors as failures.

kre

Reply via email to