[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-11-15 Thread Sam Gross


Sam Gross  added the comment:

The `pthread_exit` behavior has been a problem for PyTorch and related 
libraries since Python 3.9. The PyTorch team has tried working around the 
problems without success (i.e. they keep getting bug reports involving crashes 
in PyEval_SaveThread/RestoreThread).

The hang/paused the thread behavior suggested by jbms and gps seems like the 
only reliable option. This is also what the Java VM does when returning from 
native code and the JVM has exited.

I believe it's not difficult to hang a thread in a cross-platform way: create a 
mutex, acquire it in the main thread (before setting PyRuntime._finalizing), 
never release it. Other threads can acquire that same mutex to block until the 
application exits.

The crashes can occur even without daemon threads if the user presses ctrl-c 
while _thread_shutdown is running.

--
nosy: +colesbury

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-26 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

Yes, I would agree that the new APIs are a useful addition regardless of the 
PyThread_exit_thread change.

As far as the proposed `Py_SetThreadExitCallback` that seems like a fine thing 
for applications to use, as long as it doesn't impact how extensions need to be 
written to be safe from crashes/memory corruption.  So for example if the 
default is to hang, then changing it to log and then hang, or optionally log 
and terminate the program, would be fine, since extensions aren't affected 
either way.

Conversely, if one of the possible behaviors may be `_endthreadex` or 
`pthread_exit`, then libraries must be written to be safe under that behavior 
anyway, which is unfortunate.  Furthermore, say for a library that only 
supports POSIX, maybe it is written to be safe under `pthread_exit` because it 
uses destructors to do cleanup, but then it will cause deadlock if the callback 
chooses to hang the thread instead.  Thus, I think allowing the callback to 
change the behavior in a way that could impact extensions is not a great idea.

The callback also doesn't seem like a very good mechanism for an extension that 
is incompatible with `pthread_exit` or `_endthreadex`, such as one using 
pybind11, to try to mitigate that problem, since an individual library 
shouldn't be changing application-wide behavior unless the library is 
specifically being used by the application for that purpose.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-25 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

**If** we add a callback API for this, I agree with vstinner's 
https://bugs.python.org/issue42969#msg402558 commentary on the other callback 
APIs.  We can do this one similarly and keep it simple. - Why? It's the initial 
simplicity that provides those running into this problem a way to solve their 
problem today.  Without getting hung up on the details of the new default 
behavior.

**That said** the only conclusion I'm coming to for what the safest behavior of 
Python is in this situation is to hang the threads, effectively as PR 28525 is 
proposing.

So even if we added the callback API, I'd expect code using Python with C++ 
where these issues might ever crop up to always register a thread hanging 
callback.  Which really suggests it is the good default.  So do we even need 
the callback?

...But lets put the default behavior choice aside for a moment, there's 
something valuable regardless...

There are **New C APIs to enhance PyGILState_Ensure** proposed in PR 28525 to 
do with GIL acquisition from threads.  These appear useful as is.  They provide 
a way for threads to discover that they will never be able to get the GIL and 
re-enter the Python interpreter.  Rather than today's unexpected behavior of 
PyGILState_Ensure mercilessly terminating their thread, or given a Callback API 
whatever effect such a Callback API would have.  That allows code to be written 
with pre-problem-detection that avoids entering this point of no return state.  
That is good for everyone.  **We should add those "GIL-curious" APIs no matter 
what.**  This bit could become its own PR separate from the other parts.

If we cannot agree that blocking a non-main daemon-or-C/C++ thread forever that 
is guaranteed to not acquire the GIL because the Python interpreter is going 
away is the right default behavior instead of blindly killing it under the 
unsupportable assumption that it has nothing to clean up: We can proceed with 
such a callback API. I'm having a hard time imagining any other behavior that 
makes sense, so I'd expect lots of Python extension interface code to start 
carrying a copy of an implementation of a hang callback and sporting a 
Py_SetThreadExitCallback(xxx) call in their module Init function. (especially 
if pybind11 finds needs to generate this as boilerplate)

I think the pthread_exit() call added in issue1856's 
https://hg.python.org/cpython/rev/c892b0321d23 were well intentioned, but not 
aware of the full unsupportable ramifications of that API call. That the 
attempt at releasing a backport of the pthread_exit call to 2.7.8 broke an 
application (that was admittedly doing something unwise) and had to be reverted 
as 2.7.9 was a sign for us to revisit this in 3.x releases as well.  Which 
we're finally doing here.

--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-24 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

To be clear, the problem I'm trying to address here is not specific to 
embedding Python in a C++ application.  In fact the issue came to my attention 
while using Python directly, but loading an extension module that was written 
in C++ using the popular pybind11 library.

If we continue having Python call `pthread_exit` and `_endthreadex`, we are 
imposing strong constraints on call stacks that call the Python API.  Granted, 
hanging a thread is also not something a well-behaved library should do, but it 
is at least slightly better than killing the thread.  In a sense hanging is 
also logical, since the thread has requested to block until the GIL can be 
acquired, and the GIL cannot be acquired.

I have described a number of problems caused by `pthread_exit`/`_endthreadex` 
that are fixed by hanging.  Can you help me understand what problems caused by 
hanging are fixed by `pthread_exit`/`_endthreadex`, that leads you to think it 
is a better default?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-24 Thread STINNER Victor


STINNER Victor  added the comment:

Jeremy Maitin-Shepard: "In general, I view hanging threads as the least bad 
thing to do when faced with either acquiring the GIL or not returning at all.  
There is a lot of existing usage of Python that currently poses a risk of 
random crashes and memory corruption while Python is exiting, and I would like 
to fix that."

Showing warnings by default or not was discussed many times in Python. It was 
decided to *hide* DeprecationWarning by default. The PEP 565 is a minor 
trade-off to show them in the __main__ module.

For me, more generally, Python default behavior is designed for *users* who 
don't want to be annoyed by warnings or anything which would make sense for 
*developers*. That's why I designed a new "Python Development Mode" (-X dev):
https://docs.python.org/dev/library/devmode.html

Maybe in development mode, the behavior could be changed to call abort(). But 
honestly, I'm not really excited by that. I'm not embedding Python in a C++ 
application. I'm almot only use Python directly: the Unix command "python3". 
For this use case, IMO it's fine to call pthread_exit() by default.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-24 Thread STINNER Victor


STINNER Victor  added the comment:

Another example where a developer asks to call abort() to notice bugs, whereas 
Python previously silently ignored it: bpo-36829. Calling abort() is a legit 
use case, but not really the best default behavior. Again, the problem was 
solved by letting developers setting their own callback: sys.unraisablehook.

If I understood correctly, pytest doesn't override it but "took" into the 
default implementation: it chains its own code with the default implementation. 
It's possible because there is a way to "get" the current hook: just read 
sys.unraisablehook ;-)

Another argument in favor of also adding Py_GetThreadExitCallback() ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-24 Thread STINNER Victor


STINNER Victor  added the comment:

Gregory P. Smith: Python has many API using callbacks:

PEP 445 added PyMem_SetAllocator() to set memory allocator. Adding 
PyMem_GetAllocator() also made possible to chain allocators and to "hook" into 
an existing allocator to execute code before and after it's called (the PEP 
contains an example).

I'm not sure if it's important or useless to chain callbacks with 
Py_SetThreadExitCallback(). I suggest to always override the previously set 
callback.

It would matter if library A sets a callback to emit log and library B sets a 
callback to hang threads. It maybe be nice to first emit a log and then hang 
the thread. But then the order in which callbacks are set starts to matter a 
lot :-)

I'm fine with adding Py_GetThreadExitCallback() if you consider that it matters.


> If someone passes nullptr does that undo it (please no!).

I don't think that we should bother with adding a special case. I prefer to 
consider developers as adults and let them make their own mistakes if they 
consider that they understand the code well enough ;-)

_PyEval_SetTrace() allows to remove the current trace function. It's a legit 
use case.

If library C is annoyed by library A and library B installed annoying 
callbacks, IMO it's also ok to let it "remove" the previously set callback, no?

IMO Py_SetThreadExitCallback(NULL) should simply set the callback to NULL, so 
restore the default behavior: call pthread_exit().

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-24 Thread STINNER Victor


STINNER Victor  added the comment:

> In this case that unexpected thread simply disappearing can lead to a 
> deadlock in our process.

This problem also remains me the very complex case of bpo-6721: "Locks in the 
standard library should be sanitized on fork". The issue title looks simple, 
but 12 years after the issue was created, it's still open.

This issue is being solved by adding atfork callbacks to modules which must do 
something at fork in the child process (sometimes also in the parent process).

I added threading.Lock._at_fork_reinit() private method to simplify the 
implementation of these callbacks.

Such problem has no silver bullet solution, so it's better to let developers 
design their own solution with their specific requirements.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-23 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

In general, I view hanging threads as the least bad thing to do when faced with 
either acquiring the GIL or not returning at all.  There is a lot of existing 
usage of Python that currently poses a risk of random crashes and memory 
corruption while Python is exiting, and I would like to fix that.

However, I would certainly recommend that code using the Python C API attempt 
to avoid threads getting to that state in the first place.  I added a "finalize 
block" mechanism to that PR which is intended to provide a way to attempt to 
acquire the GIL in a way that ensures the GIL won't get hung.  I would welcome 
feedback on that.  A common use case for that API might be a non-Python created 
thread that wants to invoke some sort of asynchronous callback handler via 
Python APIs.

For Python daemon threads that you control, you can avoid them hanging by 
registering an atexit function that signals them to exit and then waits until 
they do.

vsteinner: Regarding processing the Windows messages, I updated the PR to 
include a link to the MSDN documentation that led me to think it was a good 
idea.

vstinner: As for random code outside of Python itself that is using 
`PyThread_exit_thread`: although I suppose there are legitimate use cases for 
`pthread_exit` and `_endthreadex`, these functions are only safe with the 
cooperation of the entire call stack of the thread.  Additionally, while 
`pthread_exit` and `_endthreadex` have similar behavior for C code, they don't 
have the same behavior for C++ code, and that difference seems like a likely 
source of problems.  Finally, I would say Python itself does not guarantee that 
its call stacks safely cooperate with `pthread_exit` (at the very least, there 
are sure to be memory leaks).  Therefore, I think Python should not encourage 
its use by leaving it as a non-deprecated public API.  If a user wishes to 
terminate a thread, they can invoke `pthread_exit` or `_endthreadex` directly, 
ideally without any Python functions in the call stack, and can refer to the 
documentation of those functions to understand the implications.

gps: The reasons I believe hanging the thread is better than `pthread_exit`:
- `pthread_exit` essentially throws an exception.  In theory that means you 
could do proper cleanup via C++ destructors and/or re-throwing catch blocks.  
But in practice existing extension code is not designed to do that, and it 
would be quite a complex task to modify it to do proper cleanup, and on Windows 
the cleanup wouldn't run anyway.
- Additionally, throwing an exception means if there is a `noexcept` function 
in the call stack, the program terminates.  We would need to document that you 
aren't allowed to call Python APIs if there is a `noexcept` function in the 
call stack.  If you have a `catch(...)` in the call stack, then you may 
inadvertently catch the exception and return control back to Python at a point 
that assumes it owns the GIL, which will cause all sorts of havoc.  We would 
likewise need to document that you can't have a non-rethrowing `catch(...)` 
block in the call stack (I believe pybind11 has some of those).
- Throwing an exception also means C++ destructors run.  pybind11 has a smart 
pointer type that holds a `PyObject` and whose destructor calls `Py_DECREF`.  
That causes a crash when `pthread_exit` unwinds the stack, since the thread 
doesn't own the GIL.

Those are the additional problems specific to `pthread_exit`.  As gps noted, 
there is the additional problem of memory corruption common to both 
`pthread_exit` and `_endthreadex`:
- Data structures that are accessible from other threads may contain pointers 
to data on the thread's stack.  For example, with certain types of 
locks/signalling mechanisms it is common to store a linked list node on the 
stack that as then added to a list of waiting threads.  If we destroy the 
thread stack without proper cleanup (and that proper cleanup definitely won't 
happen with `_endthreadex`, and probably in most cases still won't happen with 
`pthread_exit`), the data structure has now become corrupted.

I don't think hanging the thread really increases the risk of deadlock over the 
status quo.  In theory someone could have a C++ destructor that does some 
cleanup that safely pevents deadlock, but that is not portable to Windows, and 
I expect that properly implemented `pthread_exit`-safe code is extremely rare.

I think we would want to ensure that Python itself is implemented in such a way 
as to not deadlock if Python-created threads with only Python functions in the 
call stack hang.  Mostly that would amount to not holding mutexes while calling 
functions that may transitively attempt to acquire the GIL (or release and then 
re-acquire the GIL).  That is probably a good practice for avoiding deadlock 
even when not finalizing.

We would also want to document that external code using the Python API should 
be safe from deadlock if a 

[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-23 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

A PR adding a `Py_SetThreadExitCallback(func)` API guaranteeing the callback is 
called before `pthread_exit()` would allow anyone wanting to deal with their 
deadlocks to register `abort()` or `while(1) pause();` as their exit callback.  
I'd be okay with that.  Anyone care to make a PR for that?

What should it do when SetThreadExitCallback has already been called?  Is that 
an error?  Are the callbacks chained?  In which order?  If someone passes 
nullptr does that undo it (please no!).  It is process global state that many 
things could wind up having an opinion on each with their own reason to require 
theirs to be the only one.  I vote for returning an error if a callback has 
already been set.  And not allowing unsetting a callback.

What we'd do internally at work is always guarantee our codebase's early 
application startup code (because we have such a thing) calls that to setup 
whichever exit callback we deem appropriate for everyone instead of today's 
default deadlock potential.

On pausing... agreed, it doesn't feel _comfortable_.  To me when faced with a 
known potential deadlock situation the only comfortable thing is to abort() as 
a process dying is always more useful than process hanging (or worse, partially 
hanging).

Sleeping on the problem, I don't really understand how `while(1) pause();` is 
significantly different than `pthread_exit()` when it comes to deadlocks, as it 
seems like an instantly terminated thread having state that needs cleanup 
should lead to a similar outcome as a thread with stuff needing cleanup that is 
now stuck in an infinite pause loop.  Neither of them is going to cleanup 
whatever (presumably a lock they hold) that leads something else to deadlock?  
I guess the difference is that the thread stack  memory is at least not 
released back for potential reuse while other threads and pointers could still 
be referring to it when pausing as opposed to a pthread_exit?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-23 Thread STINNER Victor


STINNER Victor  added the comment:

I'm not comfortable with PR 28525 which always hang threads which attempt to 
acquire the GIL after Python exited.

I would prefer to keep the current behavior by default, but give the ability to 
applications embedding Python to decide what to do.

With my Py_SetThreadExitCallback(func) idea, PyThread_exit_thread() would call 
func() and then pthread_exit(0). Applications can hang threads, log a message, 
call abort(), or whatever else.

I'm not comfortable with writing a portable function to "hang a thread". For 
example, I don't understand why PR 28525 processes Windows messages on hang 
threads.

Well, it's a complex problem :-(

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-22 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

I suppose calling `Py_Initialize`, `Py_FinalizeEx`, then `Py_Initialize` again, 
then `Py_FinalizeEx` again in an embedding application, was already not 
particularly well supported, since it would leak memory.

However, with this change it also leaks threads.  That is a bit unfortunate, 
but I suppose it is just another form of memory leak, and the user can avoid it 
by ensuring there are no daemon threads (of course even previously, the 
presence of any daemon threads meant additional memory leaking).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-22 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

I believe jbms is right that pausing the threads is the only right thing to do 
when they see tstate_must_exit.  The PR is likely correct.

--
versions: +Python 3.11, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-22 Thread Jeremy Maitin-Shepard


Change by Jeremy Maitin-Shepard :


--
keywords: +patch
pull_requests: +26916
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/28525

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-20 Thread STINNER Victor


STINNER Victor  added the comment:

> It looks like the `_thread` module does not actually expose 
> `PyThread_exit_thread` --- the similarly named `thread_PyThread_exit_thread` 
> just raises SystemExit.

Oh right, I was confused by the function name: "thread_PyThread_exit_thread()". 
It's a good thing that it's not exposed in Python :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-20 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

It looks like the `_thread` module does not actually expose 
`PyThread_exit_thread` --- the similarly named `thread_PyThread_exit_thread` 
just raises SystemExit.

>From a search in the codebase, it appears `PyThread_exit_thread` is currently 
>used only to kill threads when they attempt to acquire the GIL during 
>finalization.

Also, if it is changed to no longer kill the thread, it would probably make 
sense to rename it, e.g. to `PyThread_stop_thread_during_finalization`.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-20 Thread STINNER Victor


STINNER Victor  added the comment:

See also bpo-13077 "Windows: Unclear behavior of daemon threads on main thread 
exit".

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-20 Thread STINNER Victor


STINNER Victor  added the comment:

I don't think that there is a "good default behavior" where Python currently 
calls PyThread_exit_thread().

IMO we should take the problem from the other side and tries to reduce cases 
when Python can reach this case. Or even make it impossible if possible. For 
example, *removing* daemon threads would remove the most common case when 
Python has to call PyThread_exit_thread().

I'm not sure how to make this case less likely when threading._shutdown() is 
interrupted by CTRL+C. This function can likely hang if a thread is stuck for 
whatever reason. It's important than an user is able to "interrupt" or kill a 
stuck process with CTRL+C (SIGINT). It's a common expectation on Unix, at least 
for me.

Maybe threading._shutdown() should be less nice and call os._exit() in this 
case: exit *immediately* the process in this case. Or Python should restore the 
default SIGINT handler: on Unix, the default SIGINT handler immediately 
terminate the process (like os._exit() does).

I don't think that abort() should be called here (raise SIGABRT signal), since 
the intent of an user pressing CTRL+C is to silently terminate the process. 
It's not an application bug, but an user action.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-20 Thread STINNER Victor


STINNER Victor  added the comment:

PyThread_exit_thread() is exposed as _thread.exit() and _thread.exit_thread().

PyThread_exit_thread() is only called in take_gil() (at 3 places in the 
function) if tstate_must_exit(tstate) is true. It happens in two cases:

* (by design) at Python exit if a daemon thread tries to "take the GIL": 
PyThread_exit_thread() is called.

* (under an user action) at Python exit if threading._shutdown() is interrupted 
by CTRL+C: Python (regular) threads will continue to run while Py_Finalize() is 
running. In this case, when a (regular) thread tries to "take the GIL", 
PyThread_exit_thread() is called.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-16 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

Regarding your suggestion of banning daemon threads: I happened to come across 
this bug not because of daemon threads but because of threads started by C++ 
code directly that call into Python APIs.  The solution I am planning to 
implement is to add an `atexit` handler to prevent this problem.

I do think it is reasonable to suggest that users should ensure daemon threads 
are exited cleanly via an atexit handler.  However, in some cases that may be 
challenging to implement, and there is also the issue of backward compatibility.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-16 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

Regarding your suggestion of adding a hook like `Py_SetThreadExitCallback`, it 
seems like there are 4 plausible behaviors that such a callback may implement:

1. Abort the process immediately with an error.

2. Exit immediately with the original exit code specified by the user.

3. Hang the thread.

4. Attempt to unwind the thread, like `pthread_exit`, calling pthread thread 
cleanup functions and C++ destructors.

5. Terminate the thread immediately without any cleanup or C++ destructor calls.

The current behavior is (4) on POSIX platforms (`pthread_exit`), and (5) on 
Windows (`_endthreadex`).

In general, achieving a clean shutdown will require the cooperation of all 
relevant code in the program, particularly code using the Python C API.  
Commonly the Python C API is used more by library code rather than application 
code, while it would presumably be the application that is responsible for 
setting this callback.  Writing a library that supports multiple different 
thread shutdown behaviors would be particularly challenging.

I think the callback is useful, but we would still need to discuss what the 
default behavior should be (hopefully different from the current behavior), and 
what guidance would be provided as far as what the callback is allowed to do.

Option (1) is highly likely to result in a user-visible error --- a lot of 
Python programs that previously exited successfully will now, possibly only 
some of the time, exit with an error.  The advantage is the user is alerted to 
the fact that some threads were not cleanly exited, but a lot of previously 
working code is now broken.  This seems like a reasonable policy for a given 
application to impose (effectively requiring the use of an atexit handler to 
terminate all daemon threads), but does not seem like a reasonable default 
given the existing use of daemon threads.

Option (2) would likely do the right thing in many cases, but main thread 
cleanup that was previously run would now be silently skipped.  This again 
seems like a reasonable policy for a given application to impose, but does not 
seem like a reasonable default.

Option (3) avoids the possibility of crashes and memory corruption.  Since the 
thread stack remains allocated, any pointers to the thread stack held in global 
data structures or by other threads remain valid.  There is a risk that the 
thread may be holding a lock, or otherwise block progress of the main thread, 
resulting in silent deadlock.  That can be mitigated by registering an atexit 
handler.

Option (4) in theory would allow cleanup handlers to be registered in order to 
avoid deadlock due to locks held.  In practice, though, it causes a lot of 
problems:
 - The CPython codebase itself contains no such cleanup handlers, and I expect 
the vast majority of existing C extensionss are also not designed to properly 
handle the stack unwind triggered by `pthread_exit`.  Without proper cleanup 
handlers, this option reverts to option (5), where there is a risk of memory 
corruption due to other threads accessing pointers to the freed thread stack.  
There is also the same risk of deadlock as in option (3).
 - Stack unwinding interacts particularly badly with common C++ usage because 
the very first thing most people want to do when using the Python C API from 
C++ is create a "smart pointer" type for holding a `PyObject` pointer that 
handles the reference counting automatically (calls `Py_INCREF` when copied, 
`Py_DECREF` in the destructor).  When the stack unwinds due to `pthread_exit`, 
the current thread will NOT hold the GIL, and these `Py_DECREF` calls result in 
a crash / memory corruption.  We would need to either create a new 
finalizing-safe version of Py_DECREF, that is a noop when called from a 
non-main thread if `_Py_IsFinalizing()` is true (and then existing C++ 
libraries like pybind11 would need to be changed to use it), or modify the 
existing `Py_DECREF` to always have that additional check.  Other calls to 
Python C APIs in destructors are also common.
 - When writing code that attempts to be safe in the presence of stack 
unwinding due to `pthread_exit`, it is not merely explicitly GIL-related calls 
that are a concern.  Virtually any Python C API function can transitively 
release and acquire the GIL and therefore you must defend against unwind from 
virtually all Python C API functions.
 - Some C++ functions in the call stack may unintentionally catch the exception 
thrown by `pthread_exit` and then return normally.  If they return back to a 
CPython stack frame, memory corruption/crashing is likely.
 - Alternatively, some C++ functions in the call stack may be marked 
`noexcept`.  If the unwinding reaches such a function, then we end up with 
option (1).
  - In general this option seems to require auditing and fixing a very large 
amount of existing code, and introduces a lot of complexity.  For that reasons, 
I think this option should be 

[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-16 Thread STINNER Victor


STINNER Victor  added the comment:

> Change CPython to call abort() instead of pthread_exit() as that situation is 
> unresolvable and the process dying is better than hanging, partially alive.  
> That solution isn't friendly, but is better than being silent and allowing 
> deadlock.  A failing process is always better than a hung process, especially 
> a partially hung process.

The last time someone proposed to always call abort(), I proposed to add a hook 
instead: I added sys.unraisablehook. See bpo-36829.

If we adopt this option, it can be a callback in C, something like: 
Py_SetThreadExitCallback(func) which would call func() rather than 
pthread_exit() in ceval.c.

--

Another option would be to add an option to disable daemon thread.

concurrent.futures has been modified to no longer use daemon threads: bpo-39812.

It is really hard to write a reliable implementation of daemon threads with 
Python subintepreters. See bpo-40234 "[subinterpreters] Disallow daemon threads 
in subinterpreters optionally".

There is already a private flag for that in subinterpreters to disallow 
spawning processes or threads: an "isolated" subintepreter. Example with 
_thread.start_new_thread():

PyInterpreterState *interp = _PyInterpreterState_GET();
if (interp->config._isolated_interpreter) {
PyErr_SetString(PyExc_RuntimeError,
"thread is not supported for isolated subinterpreters");
return NULL;
}

Or os.fork():

if (interp->config._isolated_interpreter) {
PyErr_SetString(PyExc_RuntimeError,
"fork not supported for isolated subinterpreters");
return NULL;
}

See also my article on fixing crashes with daemon threads:

* https://vstinner.github.io/gil-bugfixes-daemon-threads-python39.html
* https://vstinner.github.io/daemon-threads-python-finalization-python32.html

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-09-15 Thread Jeremy Maitin-Shepard


Jeremy Maitin-Shepard  added the comment:

Another possible resolution would to simply make threads that attempt to 
acquire the GIL after Python starts to finalize hang (i.e. sleep until the 
process exits).  Since the GIL can never be acquired again, this is in some 
sense the simplest way to fulfill the contract.  This also ensures that any 
data stored on the thread call stack and referenced from another thread remains 
valid.  As long as nothing on the main thread blocks waiting for one of these 
hung threads, there won't be deadlock.


I have a case right now where a background thread (created from C++, which is 
similar to a daemon Python thread) acquires the GIL, and calls 
"call_soon_threadsafe" on an asycnio event loop.  I think that causes some 
Python code internally to release the GIL at some point, after triggering some 
code to run on the main thread which happens to cause the program to exit.  
While `Py_FinalizeEx` is running, the call to "call_soon_threadsafe" completes 
on the background thread, attempts to re-acquire the GIL, which triggers a call 
to pthread_exit.  That unwinds the C++ stack, which results in a call to 
Py_DECREF without the GIL held, leading to a crash.

--
nosy: +jbms

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-06-28 Thread STINNER Victor


STINNER Victor  added the comment:

See also a discussion about the usefulness of daemon threads:
https://github.com/python-trio/trio/issues/2046

I'm more in favor of deprecating daemon threads (in any interpreter, not only 
in subinterpreters). The current implementation is too fragile. There are still 
corner cases like the one described in this issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-06-28 Thread STINNER Victor


STINNER Victor  added the comment:

See also bpo-44434: "_thread module: Remove redundant PyThread_exit_thread() 
call to avoid glibc fatal error: libgcc_s.so.1 must be installed for 
pthread_cancel to work".

New changeset 45a78f906d2d5fe5381d78466b11763fc56d57ba by Victor Stinner in 
branch 'main':
bpo-44434: Don't call PyThread_exit_thread() explicitly (GH-26758)
https://github.com/python/cpython/commit/45a78f906d2d5fe5381d78466b11763fc56d57ba

--
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-01-21 Thread Alexey Izbyshev


Change by Alexey Izbyshev :


--
nosy: +izbyshev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

2021-01-19 Thread Gregory P. Smith


Change by Gregory P. Smith :


--
title: pthread_exit & PyThread_exit_thread are harmful -> pthread_exit & 
PyThread_exit_thread from PyEval_RestoreThread etc. are harmful

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com