Oh, that's definitely part of the problem, but that is *far* beyond my
ability to fix. Right now I'm still working on getting its owners to do
things like "could you please log somewhere when you kill jobs, and maybe
even indicate why the job was killed?".

The main time that signals show up in life outside of writing device
drivers and the like is when implementing or interacting with runtime
environments, basically, as they're the mechanism of interruptive
interprocess communication, noncooperative scheduling control, and so on.

Typical horrifying example: One of the systems I do control is the shell
that runs cron jobs (not their scheduling, but their actual execution),
which needs to provide an outer harness that manages getting commands from
the scheduler, integration with all sorts of logging and monitoring
systems, etc., and needs to actually execute the Python code of the real
jobs inside it. It needs to do various kinds of noncooperative scheduling
to those subtasks (timeouts, killing and replacing workers under various
circumstances, etc.) and so runs them in a subprocess. So I get several
layers of signals: incoming ones from the SIGTERM-happy outer runtime
environment (GCP), ones from the outer runner harness to the inner jobs,
and the logic in the inner jobs.

And alas, the logic in some of the inner jobs has to make fundamentally
non-idempotent, state-changing requests over APIs to 3P systems that I
don't control, and which if terminated leave the 3P system in an
indeterminate and undeterminable state. Which means that if the cron job
gets terminated in the middle of that API request, the system ends up in an
unknown state, and whatever you do to get it into a known state will be
wrong (leading to user-visible bad behavior) half the time. And because its
final state can't be determined from its own API, and it can't be invoked
idempotently, you can't even use a 2-phase commit approach to protect that.

But it turns out that signal suppression does actually make this problem go
away enough to be manageable in prod. Except that the code now has to be
changed from single-threaded to multi-threaded for various other reasons,
and so signal suppression by changing the signal handlers and then changing
them back no longer works.
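
For concreteness, here is a minimal sketch of the "change the handlers, then
change them back" approach described above. It is not the code running in
prod; `suppress_signals` is a hypothetical name, and ignoring the signal
outright (rather than recording it for later) is a simplifying assumption.

```python
import contextlib
import signal


@contextlib.contextmanager
def suppress_signals(signums=(signal.SIGTERM,)):
    """Hypothetical helper: ignore the given signals while the block runs."""
    # signal.signal() returns the previously installed handler, so we can
    # put it back afterwards. (If a handler was installed from C rather
    # than from Python, this returns None; this sketch does not handle
    # that case.)
    previous = {s: signal.signal(s, signal.SIG_IGN) for s in signums}
    try:
        yield
    finally:
        for signum, handler in previous.items():
            signal.signal(signum, handler)
```

Two things make this break down once threads are involved: signal.signal()
may only be called from the main thread (it raises ValueError elsewhere), and
handlers are process-wide, so two threads saving and restoring them
concurrently can step on each other.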


So that's an example of why you might find yourself in such a situation in
userland. And overall, Python's signal handling mechanism is pretty good;
it's *way* nicer than having to deal with it in C, since signal handlers
run in the main thread as more-or-less ordinary Python code, and you don't
have to deal with the equivalent of signal-safety and the like. The
downside of that flexibility, though, is that some tasks like deferring
signals end up being *really hard* in the Python layer, because even
appending the signum to an array isn't atomic enough to guarantee that it
won't be interrupted by another signal.
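
One alternative that sidesteps the Python layer, offered only as a sketch and
not as what I actually run: on Unix, signal.pthread_sigmask() asks the OS to
hold signals as pending for the calling thread, so no Python-level
bookkeeping is needed. `defer_signals` is a hypothetical name.

```python
import contextlib
import signal


@contextlib.contextmanager
def defer_signals(signums=(signal.SIGTERM,)):
    """Hypothetical helper: block the given signals in the calling thread."""
    # SIG_BLOCK adds signums to this thread's mask and returns the old mask.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signums)
    try:
        yield
    finally:
        # Restoring the old mask lets any pending signal be delivered.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

The caveat is that the mask is per-thread: a signal sent to the process can
still be delivered via any other thread that has not blocked it, so in a
multi-threaded program every thread would need the mask set (new threads
inherit the mask of the thread that created them).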


On Thu, Jun 25, 2020 at 5:43 PM Bernardo Sulzbach <
berna...@bernardosulzbach.com> wrote:

> On Thu, Jun 25, 2020 at 5:09 PM Yonatan Zunger via Python-ideas <
> python-ideas@python.org> wrote:
>
>> Hey everyone,
>>
>> I've been developing code which (alas) needs to operate in a runtime
>> environment which is quite *enthusiastic* about sending SIGTERMs and the
>> like, and where there are critical short sections of code that, if
>> interrupted, are very hard to resume without some user-visible anomaly
>> happening.
>>
>
> I find, for reasons you have already mentioned, having a "suppress all
> signals" something _really_ strange in userland code. But maybe I just have
> never seen a case in which it makes sense. Are you sure that the problem
> isn't "a runtime environment which is quite enthusiastic about sending
> SIGTERMs"?
>


-- 

Yonatan Zunger

Distinguished Engineer and Chief Ethics Officer

He / Him

zun...@humu.com

100 View St, Suite 101

Mountain View, CA 94041

Humu.com <https://www.humu.com>  · LinkedIn
<https://www.linkedin.com/company/humuhq>  · Twitter
<https://twitter.com/humuinc>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AZY5OCDWJNRDCGKRURLMRATHTWQPOBQY/
Code of Conduct: http://python.org/psf/codeofconduct/