Re: [Python-ideas] Importance of noticing new signals

2017-11-03 Thread Antoine Pitrou
On Fri, 3 Nov 2017 17:22:07 +0200
Koos Zevenhoven  wrote:
> 
> Maybe you are concerned about whether some nuances and recent changes to
> signal handling could lead to harmful change in behavior in some meaningful
> edge cases? I can at least say that my PyErr_PROBE_SIGNALS() proposal does
> not introduce such issues, if the difference is documented properly:
> 
> """PyErr_PROBE_SIGNALS() is meant for performance-critical code and is not
> 100% guaranteed to always see the most recent signals. If a signal being
> deferred is a concern, use PyErr_CheckSignals() instead."""

Agreed.

> But more generally, if we could assume that trip_signal() and
> PyErr_CheckSignals() always happen in the same "CPU thread", then we
> wouldn't need pyatomic.h here at all. The fact that the code currently
> assumes that all Python signal handlers should run in the same Python
> thread takes care of some of these concerns without needing locks etc.

Hmm... *Python* signal handlers (registered with signal.signal()) all
run in the same thread, but we have no way to ensure to ensure that
trip_signal() (which is a C-level signal handler called by the OS) will
run in the same thread.  Even if we were to mask all signals in
non-main threads started by Python (e.g. with threading.Thread),
there can still be threads created by third-party C libraries that we
don't have any control over.

> Some other concerns I can imagine by looking at some of the code in
> Modules/signalmodule.c:
> 
> (1) If trip_signal() and PyErr_CheckSignals() are executed concurrently,
> trip_signal() might set a new signal flag (using relaxed memory order)
> while PyErr_CheckSignals is still running. Then if PyErr_CheckSignals()
> sets is_tripped to zero *after* trip_signal() sets it to 1, then the new
> signal might be deferred until the next time *some* new signal arrives,
> which could take an arbitrarily long amount of time, I suppose.
> 
> However, it looks like this problem has been solved by always setting
> is_tripped to zero (with strict SEQ_CST memory order) *before* handling the
> individual signals. So if trip_signal() has already set is_tripped to 1
> (with SEQ_CST), that prevents PyErr_CheckSignals from setting is_tripped to
> zero (with SEQ_CST) *and not* handling the signal. If trip_signal() has not
> yet finished, and therefore not set is_tripped to 1 yet, it will cause the
> next call to PyErr_CheckSignals to catch the signal.
> 
> (2) Again, if trip_signal() and PyErr_CheckSignals() execute concurrently,
> it might happen that PyErr_CheckSignals() handles the signal *before*
> trip_signal sets is_tripped to 1. That would cause the next call to
> PyErr_CheckSignals() to think there's an unhandled signal, but will most
> likely not find one, because it was already handled on the previous call.
> But that just effectively means that nothing is done. In fact, there's a
> comment in the code that mentions this.
> 
> (3, 4, ...) Of course there's more to take care of there, but that's
> unrelated to my PyErr_PROBE_SIGNALS() proposal. Anyway, at least (1) and
> (2) seem to already have been taken care of, and I assume you are aware of
> that.

Yes, that was my analysis too (and why I implemented it this way in
the first place). But I'm glad you confirm it, since on such topics
it's always easy to miss a potential problem.

Regards

Antoine.


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Importance of noticing new signals

2017-11-03 Thread Koos Zevenhoven
On Thu, Nov 2, 2017 at 2:57 PM, Koos Zevenhoven  wrote:

> On Thu, Nov 2, 2017 at 2:22 PM, Antoine Pitrou 
> wrote:
>
>> On Wed, 1 Nov 2017 20:29:56 +0200
>> Koos Zevenhoven  wrote:
>> >
>> > ​From a correctness point of view, that is absolutely great: if
>> > PyErr_CheckSignals() is called, it is guaranteed to notice a new signal
>> > regardles of how small the number of picoseconds after the `is_tripped`
>> > flag has been set.  But is that really important?
>>
>> I was going to answer "no"... but actually yes.  The important case is
>> the event loop type case:
>>
>> while (1) {
>> select([some file descriptors]);
>> if (errno == EINTR) {
>> PyErr_CheckSignals();
>> if (PyErr_Occurred()) break;
>> }
>> /* continue select()ing... */
>> }
>>
>> Now say at a given point in time, no fds are actually active (or even
>> waited for), but some signal arrives (SIGINT perhaps).
>> select() is woken up and returns with errno EINTR.  Two things then can
>> happen:
>>
>> - if PyErr_CheckSignals() notices the signal, it will run the relevant
>>   signal handler, which may raise an exception and trigger the select()
>>   loop to exit (e.g. SIGINT would raise KeyboardInterrupt)
>>
>> - if PyErr_CheckSignals() misses the signal, the loop will enter again,
>>   and select() may sleep for an infinite amount of time
>>
>>
> Oh! So that would provide a proper reason for my just-in-case decision to
> name the faster near-equivalent functionality PyErr_PROBE_SIGNALS instead
> of PyErr_CHECK_SIGNALS.
>
> Cross-referencing to that (thread about making Ctrl-C "always" work):
>
> https://mail.python.org/pipermail/python-ideas/2017-November/047631.html
>
>
> Of course, what we're doing with select() above can already apply for
>> read() or other interruptible syscalls waiting for outside data... and
>> that pattern is present a lot internally, especially since
>> ​​
>> PEP 475 ("Retry system calls failing with EINTR").
>>
>
>> Now, is the "sequentially consistent" ordering on is_tripped sufficient
>> to guarantee that signals won't be missed on a weak-ordering platform?
>> I *think* so, but an expert would need to check that code (or we
>> cross our fingers and wait for a hypothetical bug report).
>>
>>
> I think the question is: Do we know for sure that is_trippe​d has been
> stored using sequentially consistent ordering prior to the call to
> PyErr_CheckSignals(), even if an interruptible syscall is involved? I
> suppose so?
>
> (But this is a separate question from the problem I was solving, of
> course. I'm not proposing to remove PyErr_CheckSignals())
>
>
To continue on this: If I understand your question correctly, I'm hesitant
to make strong statements about it. It would be interesting to know what we
can assume about signals that happen at the same time with system calls,
given the various platforms supported. Unfortunately, I don't know that.

Maybe you are concerned about whether some nuances and recent changes to
signal handling could lead to harmful change in behavior in some meaningful
edge cases? I can at least say that my PyErr_PROBE_SIGNALS() proposal does
not introduce such issues, if the difference is documented properly:

"""PyErr_PROBE_SIGNALS() is meant for performance-critical code and is not
100% guaranteed to always see the most recent signals. If a signal being
deferred is a concern, use PyErr_CheckSignals() instead."""


But more generally, if we could assume that trip_signal() and
PyErr_CheckSignals() always happen in the same "CPU thread", then we
wouldn't need pyatomic.h here at all. The fact that the code currently
assumes that all Python signal handlers should run in the same Python
thread takes care of some of these concerns without needing locks etc.

Some other concerns I can imagine by looking at some of the code in
Modules/signalmodule.c:

(1) If trip_signal() and PyErr_CheckSignals() are executed concurrently,
trip_signal() might set a new signal flag (using relaxed memory order)
while PyErr_CheckSignals is still running. Then if PyErr_CheckSignals()
sets is_tripped to zero *after* trip_signal() sets it to 1, then the new
signal might be deferred until the next time *some* new signal arrives,
which could take an arbitrarily long amount of time, I suppose.

However, it looks like this problem has been solved by always setting
is_tripped to zero (with strict SEQ_CST memory order) *before* handling the
individual signals. So if trip_signal() has already set is_tripped to 1
(with SEQ_CST), that prevents PyErr_CheckSignals from setting is_tripped to
zero (with SEQ_CST) *and not* handling the signal. If trip_signal() has not
yet finished, and therefore not set is_tripped to 1 yet, it will cause the
next call to PyErr_CheckSignals to catch the signal.

(2) Again, if trip_signal() and PyErr_CheckSignals() execute concurrently,
it might happen that PyErr_CheckSignals() handles the signal *before*
trip_signal sets is_tripped to 1. That would

Re: [Python-ideas] Importance of noticing new signals

2017-11-02 Thread Antoine Pitrou
On Thu, 2 Nov 2017 14:57:12 +0200
Koos Zevenhoven  wrote:
> > Now, is the "sequentially consistent" ordering on is_tripped sufficient
> > to guarantee that signals won't be missed on a weak-ordering platform?
> > I *think* so, but an expert would need to check that code (or we
> > cross our fingers and wait for a hypothetical bug report).
> >
> >  
> I think the question is: Do we know for sure that is_trippe​d has been
> stored using sequentially consistent ordering prior to the call to
> PyErr_CheckSignals(), even if an interruptible syscall is involved?

Yes, it is.  See trip_signal() in Modules/signalmodule.c

Regards

Antoine.


___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Importance of noticing new signals

2017-11-02 Thread Koos Zevenhoven
On Thu, Nov 2, 2017 at 2:22 PM, Antoine Pitrou  wrote:

> On Wed, 1 Nov 2017 20:29:56 +0200
> Koos Zevenhoven  wrote:
> >
> > ​From a correctness point of view, that is absolutely great: if
> > PyErr_CheckSignals() is called, it is guaranteed to notice a new signal
> > regardles of how small the number of picoseconds after the `is_tripped`
> > flag has been set.  But is that really important?
>
> I was going to answer "no"... but actually yes.  The important case is
> the event loop type case:
>
> while (1) {
> select([some file descriptors]);
> if (errno == EINTR) {
> PyErr_CheckSignals();
> if (PyErr_Occurred()) break;
> }
> /* continue select()ing... */
> }
>
> Now say at a given point in time, no fds are actually active (or even
> waited for), but some signal arrives (SIGINT perhaps).
> select() is woken up and returns with errno EINTR.  Two things then can
> happen:
>
> - if PyErr_CheckSignals() notices the signal, it will run the relevant
>   signal handler, which may raise an exception and trigger the select()
>   loop to exit (e.g. SIGINT would raise KeyboardInterrupt)
>
> - if PyErr_CheckSignals() misses the signal, the loop will enter again,
>   and select() may sleep for an infinite amount of time
>
>
Oh! So that would provide a proper reason for my just-in-case decision to
name the faster near-equivalent functionality PyErr_PROBE_SIGNALS instead
of PyErr_CHECK_SIGNALS.

Cross-referencing to that (thread about making Ctrl-C "always" work):

https://mail.python.org/pipermail/python-ideas/2017-November/047631.html


Of course, what we're doing with select() above can already apply for
> read() or other interruptible syscalls waiting for outside data... and
> that pattern is present a lot internally, especially since
> PEP 475 ("Retry system calls failing with EINTR").
>

> Now, is the "sequentially consistent" ordering on is_tripped sufficient
> to guarantee that signals won't be missed on a weak-ordering platform?
> I *think* so, but an expert would need to check that code (or we
> cross our fingers and wait for a hypothetical bug report).
>
>
I think the question is: Do we know for sure that is_trippe​d has been
stored using sequentially consistent ordering prior to the call to
PyErr_CheckSignals(), even if an interruptible syscall is involved? I
suppose so?

(But this is a separate question from the problem I was solving, of course.
I'm not proposing to remove PyErr_CheckSignals())

––Koos



> Regards
>
> Antoine.
>
>
> ___
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
+ Koos Zevenhoven + http://twitter.com/k7hoven +
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/