Jack O'Connor <[email protected]> added the comment:
I'm late to the party, but I want to explain what's going on here in case it's
helpful to folks. The issue you're seeing here has to do with whether a child
process has been "reaped". (Windows is different from Unix here, because the
parent keeps an open handle to the child, so this is mostly a Unix thing.) In
short, when a child exits, it leaves a "zombie" process whose only job is to
hold some metadata and keep the child's PID reserved. When the parent calls
wait/waitpid/waitid or similar, that zombie process is cleaned up. That means
that waiting has important correctness properties apart from just blocking the
parent -- signaling after wait returns is unsafe, and forgetting to wait also
leaks kernel resources.
Here's a short example demonstrating this:
```
import os
import signal
import subprocess
import time
# Start a child process and sleep a little bit so that we know it's exited.
child = subprocess.Popen(["true"])
time.sleep(1)
# Signal it. Even though it's definitely exited, this is not an error.
os.kill(child.pid, signal.SIGKILL)
print("signaling before waiting works fine")
# Now wait on it. We could also use os.waitpid or os.waitid here. This reaps
# the zombie child.
child.wait()
# Try to signal it again. This raises ProcessLookupError, because the child's
# PID has been freed. But note that Popen.kill() would be a no-op here,
# because it knows the child has already been waited on.
os.kill(child.pid, signal.SIGKILL)
```
With that in mind, the original behavior with communicate() that started this
bug is expected. The docs say that communicate() "waits for process to
terminate and sets the returncode attribute." That means internally it calls
waitpid, so your terminate() thread is racing against process exit. Catching
the exception thrown by terminate() will hide the problem, but the underlying
race condition means your program might end up killing an unrelated process
that just happens to reuse the same PID at the wrong time. Doing this properly
requires using waitid(WNOWAIT), which is...tricky.
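To give a rough idea of what the waitid(WNOWAIT) approach might look like, here
is a minimal sketch. The SharedChild class and its methods are hypothetical
names made up for illustration, not part of subprocess. It assumes a Unix
platform where os.waitid is available. The idea is to block until the child
exits without reaping it, and to do the real reaping and any signaling under
the same lock, so that a signal is only ever sent while the PID is still
reserved. It is not hardened: for example, it does not handle a second child of
the same parent reusing the PID between the first check and the waitid call.

```python
import os
import signal
import subprocess
import threading


class SharedChild:
    """Hypothetical wrapper: lets one thread wait while another signals."""

    def __init__(self, args):
        self._child = subprocess.Popen(args)
        self._lock = threading.Lock()
        self._reaped = False

    def wait(self):
        with self._lock:
            if self._reaped:
                return self._child.returncode
        try:
            # Block until the child exits, but leave the zombie in place so
            # the PID stays reserved. No lock is held while blocking.
            os.waitid(os.P_PID, self._child.pid, os.WEXITED | os.WNOWAIT)
        except ChildProcessError:
            # Another waiter reaped the child in the meantime.
            pass
        with self._lock:
            if not self._reaped:
                # Reap for real. This returns immediately, because the child
                # has already exited.
                self._child.wait()
                self._reaped = True
            return self._child.returncode

    def terminate(self):
        with self._lock:
            # While we hold the lock and the child is unreaped, its PID
            # cannot be reused, so the signal cannot hit an unrelated process.
            if not self._reaped:
                os.kill(self._child.pid, signal.SIGTERM)


# One thread waits while the main thread terminates the child.
child = SharedChild(["sleep", "30"])
waiter = threading.Thread(target=child.wait)
waiter.start()
child.terminate()
waiter.join()
returncode = child.wait()
print(returncode)
```

Here terminate() either signals a child that is still running or a zombie that
has not been reaped yet (which is harmless), or it does nothing at all. The
return code follows the usual Popen convention of a negated signal number when
the child was killed by a signal.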
----------
nosy: +oconnor663
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue40550>
_______________________________________
