[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2021-01-12 Thread Guido van Rossum


Guido van Rossum  added the comment:

When I removed myself from the nosy list I did mean to stop working on this 
bug, just that I am unlikely to be of any help. :-)

--
nosy: +gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-23 Thread Guido van Rossum

Changes by Guido van Rossum :


--
nosy:  -gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-23 Thread Antoine Pitrou

Antoine Pitrou added the comment:

That said, the pthread_kill() solution deserves testing as well.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-23 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Using set_wakeup_fd would fix them... except that it generates spurious 
> warning messages when the wakeup fd buffer is full, and there's no way to 
> stop it

Are you using a pipe or a socket to set_wakeup_fd?  Pipes have rather small 
buffers.  In any case, since, as you mention, Tornado and Twisted use it, and 
they never complained, perhaps your example is too contrived or artificial.

--
nosy: +gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread Nathaniel Smith

Nathaniel Smith added the comment:

Letting Python-level signal handlers run in arbitrary threads is an interesting 
idea, but it's a non-trivial change to Python semantics that may well break 
some programs (previously it was guaranteed that signal handlers couldn't race 
with main thread code), and it doesn't fully fix the bug here (because it's 
common for Python programs to contain threads which don't run Python code, e.g. 
if they use zmq). I can imagine some potential benefits, but I feel like this 
needs some substantial rationale?

OTOH the pthread_kill-based fix I suggested is like 3 lines and AFAICT fixes 
the problem exactly.

> I dislike the idea of transfering the signal to the main thread from another 
> thread in the C signal handler using pthread_kill(). Most code behave very 
> badly when getting a signal, so getting a signal twice is likely to double 
> the pain.

This doesn't make much sense to me... CPython's pretty good at handling EINTR 
these days, and, well, the only case that's different is when you have a signal 
handler that needs to be run and the main thread is blocked in a syscall, which 
is exactly the case that's currently broken? 

> Moreover, pthread_sigmask() can block signals in the main thread for 
> deliberate reasons.

Sure, and with pthread_kill() this would have the effect that the signal would 
be queued by the kernel and delivered after signal is unmasked. That seems like 
pretty sensible semantics to me; in fact it's exactly what you get in 
single-threaded code.

Compare that to right now where AFAICT pthread_sigmask() is pretty useless in 
the presence of threads, because of the thing where if one thread has a signal 
blocked then the kernel will pick another thread to deliver it to.

> From the point of view of the Python signal handler, the current "if 
> (PyThread_get_thread_ident() != main_thread) return 0;" code in the C signal 
> handler is somehow an implicit pthread_sigmask(signal.SIG_BLOCK, range(1, 
> signal.NSIG)) on all threads except of the main thread, whereas Unix gives a 
> fine control on these masks with the pthread_sigmask() function.

That code isn't in the C signal handler, it's in the code that the interpreter 
runs occasionally to check if the C signal handler has been called, right?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread STINNER Victor

STINNER Victor added the comment:

I created a WIP patch https://github.com/python/cpython/pull/768 to experiment 
allowing signal handlers from any threads. I expect many race conditions since 
I only removed sanity checks without adding new code to handle parallelism.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread STINNER Victor

Changes by STINNER Victor :


--
pull_requests: +675

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread STINNER Victor

STINNER Victor added the comment:

thread_signal.py: Oops, I forgot to remove the "raise Exception" line from 
sighandler() (I used it for a quick test.)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread STINNER Victor

STINNER Victor added the comment:

Hum, maybe I found the root issue: the C signal handler calls 
Py_AddPendingCall() which uses a lock to protect a global static pendingcalls 
array (of 32 items). The function tries 100 times in a row to acquire to lock, 
or does nothing (returns -1) if it fails to acquire the lock.

If we start to allow signals from any thread, this shared pendingcalls array 
can quickly become a source of race conditions like deadlocks or ignored 
callbacks.

To avoid deadlocks, IMHO the best is to have a per-thread array which consumes 
512 bytes (on 64-bit, 32 items made of 2 pointers).

--

The _thread module has a strange _thread.interrupt_main() function.

--

>From the point of view of the Python signal handler, the current "if 
>(PyThread_get_thread_ident() != main_thread) return 0;" code in the C signal 
>handler is somehow an implicit pthread_sigmask(signal.SIG_BLOCK, range(1, 
>signal.NSIG)) on all threads except of the main thread, whereas Unix gives a 
>fine control on these masks with the pthread_sigmask() function.

--

The Windows part is more tricky. A Windows Event object (created by 
CreateEvent() and retrieved by _PyOS_SigintEvent()) is used to interrupt a few 
blocking functions:

* my_fgets() used by PyOS_StdioReadline() to implemented "readline" (especially 
for the REPL)
* _PyOS_WindowsConsoleReadline()
* read_console_w() of Modules/_io/winconsole.c
* time.sleep() -- only if it's the main thread and delay != 0 seconds
* _multiprocessing.SemLock.acquire() -- only if called from the main thread
* _winapi.WaitForMultipleObjects()

The event is set by the SIGINT signal handler set by Python.

Extract of pysleep() comment:

/* Allow sleep(0) to maintain win32 semantics, and as decreed
 * by Guido, only the main thread can be interrupted.
 */

--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread STINNER Victor

STINNER Victor added the comment:

Attached signal_pause_doesnt_wake_up.py is a little bit complex and not sure 
that it's reliable.

I wrote thread_signal.py which should have a more deterministic behaviour. 
Output on Linux:
---
main: main thread 140392674538560
thread: wait
main: spawn thread 140392542746368
main: sleep
main: send signal to thread
main: sleep again
Python signal handler, thread 140392674538560
main: wait thread
thread: done
main: done
---

As expected, the Python signal handler is called in the main thread.

time.sleep() in the thread *is* interrupted by SIGUSR1 (select() fails with 
EINTR), but PyErr_CheckSignals() does nothing since it's not the main thread. 
Then the sleep is automatically restarted (PEP 475).

The code works as expected, but I understand that the behaviour is surprising.

To be honest, I never understood the rationale behind "only execute signal 
handlers in the main thread". I was also surprised to no see a pthread_kill() 
in the C signal handler.

I dislike the idea of transfering the signal to the main thread from another 
thread in the C signal handler using pthread_kill(). Most code behave very 
badly when getting a signal, so getting a signal twice is likely to double the 
pain.

Moreover, pthread_sigmask() can block signals in the main thread for deliberate 
reasons.

If we should change something, I simply suggest to remove the arbitrary 
limitation from the C signal handler. I don't know the consequences yet.

--
Added file: http://bugs.python.org/file46752/thread_signal.py

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-22 Thread STINNER Victor

STINNER Victor added the comment:

Nathaniel Smith added the comment:
> In any case, the issue here remains that one shouldn't have to use 
> set_wakeup_fd for a signal to interrupt time.sleep etc.

CPython implements a C signal handler which sets a flag and then exits
immediately. The thread getting the signal will probably interrupts
the current blocking syscall with EINTR. CPython detects that a signal
flag was set, calls the *Python* signal handler. If the Python signal
handler raises an exception, the syscall is abandonned. Otherwise, the
syscall is restarted.

So "interrupting sleep" is only reliable if:

* there is only one thread (the "main" thread)
* the syscall is interrupted with EINTR
* the signal handler raises an exception

When you implement an event loop, raising an exception may be the best
design. In asyncio, the Python signal handler calls loop.call_soon()
to execute the final callback (3rd signal handler of the same
signal!). The rule in asynico is not call blocking syscalls, or at
least ensure that they don't block too long (ex: use aiofiles to
access the filesystem).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@haypo: okay, looked over things over for a third time and this time I found my 
very silly error :-). So I'm now able to use set_wakeup_fd on Windows 
(https://github.com/python-trio/trio/pull/108), but not on Unix 
(https://github.com/python-trio/trio/issues/109).

In any case, the issue here remains that one shouldn't have to use 
set_wakeup_fd for a signal to interrupt time.sleep etc.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread Nathaniel Smith

Nathaniel Smith added the comment:

@haypo: It's a socketpair. It works fine when I set up a toy test case using 
set_wakeup_fd + select, and it works fine in my real code when I use CFFI 
cleverness to register a signal handler that manually writes a byte to my 
wakeup socket, but when I pass that exact same socket to set_wakeup_fd in my 
real code, it doesn't work. It's pretty mysterious, and I have no particular 
reason to think that the problem is in CPython as opposed to some stupid 
mistake I'm making.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread STINNER Victor

STINNER Victor added the comment:

2017-03-22 1:55 GMT+01:00 Nathaniel Smith :
> + for some reason set_wakeup_fd doesn't work for me on Windows (no idea why, 
> can't reproduce in simplified tests, might be my fault, need to debug),

I modified signal.set_wakeup_fd() to accept a socket handle on
Windows. Are you using socket.socketpair() on Windows? Or do you pass
a pipe?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-21 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I don't really have a specific use case personally -- for trio, I haven't found 
a way to make use of set_wakeup_fd because of various issues[1], but I'm also 
not planning to use SIGCHLD, so this isn't very urgent.

In general set_wakeup_fd can be a workaround, but I wanted to note this 
upstream because it's a bug in Python's signal handler logic. Note that Python 
already goes to great lengths to make e.g. signal handlers run during 
time.sleep on Windows; they ought to just work on Unix too.

--

[1] (Off-topic but in case you're curious: I register actual signal handlers 
anyway because I follow the philosophy that the wakeup pipe should only be used 
for wakeups rather than transmitting information, so as long as signal handlers 
work I can do my own wakeups, + for some reason set_wakeup_fd doesn't work for 
me on Windows (no idea why, can't reproduce in simplified tests, might be my 
fault, need to debug), + set_wakeup_fd's handling of buffer overflow is broken 
for my purposes.)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-20 Thread STINNER Victor

STINNER Victor added the comment:

If the main thread waits on select() and uses a pipe with
signal.set_wakeup_fd(), it should be fine. Is that your case?

Asyncio requires to have an event loop running in the main thread to spawn
child processes, to get SIGCHLD signals to get notified of child
completions. Or you can ignore these signals and use the "safe" child
waiter using waitall() and then iterate on all children.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue21895] signal.pause() and signal handlers don't react to SIGCHLD in non-main thread

2017-03-20 Thread Nathaniel Smith

Nathaniel Smith added the comment:

It turns out that this bug is more general than signal.pause, and has caused 
problems for a few different people:
  https://github.com/dabeaz/curio/issues/118#issuecomment-287735781
  https://github.com/dabeaz/curio/issues/118#issuecomment-287798241
  https://github.com/dabeaz/curio/issues/118#issuecomment-287887843

The basic problem is that CPython's signal handling strategy on Unix-likes 
assumes that if a signal is delivered to the process at a time when the main 
thread is blocked in a syscall, then that syscall will promptly exit with 
EINTR. So for example, time.sleep has some clever code on Windows to make it 
interruptible with control-C, but on Unix we assume that the kernel will break 
the sleep for us.

The problem with this is that POSIX only guarantees that *some* thread will 
receive the signal -- not necessarily the main thread. In practice it seems 
like most implementations do deliver most signals to the main thread or CPython 
wouldn't have gotten away with this for as long as it has, but in particular it 
seems like Linux is happy to deliver SIGCHLD to random other threads. So the 
C-level signal handler runs in whatever thread, it sets the flag for the main 
thread to run the Python-level signal handler... and then the main thread sits 
there blocked in sleep() or select() or pause() or whatever, and never checks 
the flag.

A simple solution would be to make sure signals are always delivered to the 
main thread, by adjusting the C-level signal handler to do something like:

if (current_thread != main_thread) {
pthread_kill(main_thread, sig_num);
return;
}

--
nosy: +njs
title: signal.pause() doesn't wake up on SIGCHLD in non-main thread -> 
signal.pause() and signal handlers don't react to SIGCHLD in non-main thread
versions: +Python 3.6, Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com