Oh, that's definitely part of the problem, but that is *far* beyond my ability to fix. Right now I'm still working on getting its owners to do things like "could you please log somewhere when you kill jobs, and maybe even indicate why the job was killed?".
The main time that signals show up in life outside of writing device drivers and the like is when implementing or interacting with runtime environments, basically, as they're the mechanism of interruptive interprocess communication, noncooperative scheduling control, and so on.

Typical horrifying example: One of the systems I do control is the shell that runs cron jobs (not their scheduling, but their actual execution). It needs to provide an outer harness that manages getting commands from the scheduler, integration with all sorts of logging and monitoring systems, etc., and needs to actually execute the Python code of the real jobs inside it. It needs to do various kinds of noncooperative scheduling to those subtasks (timeouts, killing and replacing workers under various circumstances, etc.) and so runs them in a subprocess. So I get several layers of signals: incoming ones from the SIGTERM-happy outer runtime environment (GCP), ones from the outer runner harness to the inner jobs, and the logic in the inner jobs.

And alas, the logic in some of the inner jobs has to make fundamentally non-idempotent, state-changing requests over APIs to 3P systems that I don't control, and which if terminated leave the 3P system in an indeterminate and undeterminable state. Which means that if the cron job gets terminated in the middle of that API request, the system ends up in an unknown state, and whatever you do to get it into a known state will be wrong (leading to user-visible bad behavior) half the time. And because the 3P system's final state can't be determined from its own API, and it can't be invoked idempotently, you can't even use a 2-phase commit approach to protect that.

But it turns out that signal suppression does actually make this problem go away enough to be manageable in prod.
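To make "signal suppression" concrete, here is a minimal sketch of the handler-swapping approach, not the production code. The name `defer_signals` and the choice of signals are my own assumptions for illustration. It only works when called from the main thread of a single-threaded program, because `signal.signal()` can only be called there, and it is not hardened against a signal arriving while the handlers are being swapped.

```python
import contextlib
import os
import signal


@contextlib.contextmanager
def defer_signals(signums=(signal.SIGTERM, signal.SIGINT)):
    """Hold back the given signals during the block, then re-send them.

    Hypothetical helper, for illustration only. Assumes POSIX, a
    single-threaded program, and that every previous handler was
    installed from Python (signal.signal() returns None otherwise,
    and None cannot be passed back in to restore it).
    """
    pending = []

    def remember(signum, frame):
        # Record the signal instead of acting on it.
        pending.append(signum)

    # Swap in the recording handler, keeping the old handlers for later.
    previous = {s: signal.signal(s, remember) for s in signums}
    try:
        yield pending
    finally:
        # Put the original handlers back first...
        for s, handler in previous.items():
            signal.signal(s, handler)
        # ...then re-send whatever arrived so those handlers see it.
        for s in pending:
            os.kill(os.getpid(), s)
```

The critical section then becomes `with defer_signals(): ...` wrapped around the non-idempotent 3P request, so a SIGTERM that arrives mid-request is acted on only after the request has finished.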
Except that the code now has to be changed from single-threaded to multi-threaded for various other reasons, and so signal suppression by changing the signal handlers and then changing them back no longer works.

So that's an example of why you might find yourself in such a situation in userland.

And overall, Python's signal handling mechanism is pretty good; it's *way* nicer than having to deal with it in C, since signal handlers run in the main thread as more-or-less ordinary Python code, and you don't have to deal with the equivalent of signal-safety and the like. The downside of that flexibility, though, is that some tasks like deferring signals end up being *really hard* in the Python layer, because even appending the signum to an array isn't atomic enough to guarantee that it won't be interrupted by another signal.

On Thu, Jun 25, 2020 at 5:43 PM Bernardo Sulzbach <berna...@bernardosulzbach.com> wrote:

> On Thu, Jun 25, 2020 at 5:09 PM Yonatan Zunger via Python-ideas <python-ideas@python.org> wrote:
>
>> Hey everyone,
>>
>> I've been developing code which (alas) needs to operate in a runtime
>> environment which is quite *enthusiastic* about sending SIGTERMs and the
>> like, and where there are critical short sections of code that, if
>> interrupted, are very hard to resume without some user-visible anomaly
>> happening.
>>
>
> I find, for reasons you have already mentioned, having a "suppress all
> signals" something _really_ strange in userland code. But maybe I just have
> never seen a case in which it makes sense. Are you sure that the problem
> isn't "a runtime environment which is quite enthusiastic about sending
> SIGTERMs"?
>

--
Yonatan Zunger
Distinguished Engineer and Chief Ethics Officer
He / Him
zun...@humu.com
100 View St, Suite 101
Mountain View, CA 94041
Humu.com <https://www.humu.com> · LinkedIn <https://www.linkedin.com/company/humuhq> · Twitter <https://twitter.com/humuinc>