[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: Yes, I would agree that the new APIs are a useful addition regardless of the PyThread_exit_thread change. As far as the proposed `Py_SetThreadExitCallback` that seems like a fine thing for applications to use, as long as it doesn't impact how extensions need to be written to be safe from crashes/memory corruption. So for example if the default is to hang, then changing it to log and then hang, or optionally log and terminate the program, would be fine, since extensions aren't affected either way. Conversely, if one of the possible behaviors may be `_endthreadex` or `pthread_exit`, then libraries must be written to be safe under that behavior anyway, which is unfortunate. Furthermore, say for a library that only supports POSIX, maybe it is written to be safe under `pthread_exit` because it uses destructors to do cleanup, but then it will cause deadlock if the callback chooses to hang the thread instead. Thus, I think allowing the callback to change the behavior in a way that could impact extensions is not a great idea. The callback also doesn't seem like a very good mechanism for an extension that is incompatible with `pthread_exit` or `_endthreadex`, such as one using pybind11, to try to mitigate that problem, since an individual library shouldn't be changing application-wide behavior unless the library is specifically being used by the application for that purpose. -- ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: To be clear, the problem I'm trying to address here is not specific to embedding Python in a C++ application. In fact the issue came to my attention while using Python directly, but loading an extension module that was written in C++ using the popular pybind11 library. If we continue having Python call `pthread_exit` and `_endthreadex`, we are imposing strong constraints on call stacks that call the Python API. Granted, hanging a thread is also not something a well-behaved library should do, but it is at least slightly better than killing the thread. In a sense hanging is also logical, since the thread has requested to block until the GIL can be acquired, and the GIL cannot be acquired. I have described a number of problems caused by `pthread_exit`/`_endthreadex` that are fixed by hanging. Can you help me understand what problems caused by hanging are fixed by `pthread_exit`/`_endthreadex`, that leads you to think it is a better default? -- ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: In general, I view hanging threads as the least bad thing to do when faced with either acquiring the GIL or not returning at all. There is a lot of existing usage of Python that currently poses a risk of random crashes and memory corruption while Python is exiting, and I would like to fix that. However, I would certainly recommend that code using the Python C API attempt to avoid threads getting to that state in the first place. I added a "finalize block" mechanism to that PR which is intended to provide a way to attempt to acquire the GIL in a way that ensures the GIL won't get hung. I would welcome feedback on that. A common use case for that API might be a non-Python created thread that wants to invoke some sort of asynchronous callback handler via Python APIs. For Python daemon threads that you control, you can avoid them hanging by registering an atexit function that signals them to exit and then waits until they do. vsteinner: Regarding processing the Windows messages, I updated the PR to include a link to the MSDN documentation that led me to think it was a good idea. vstinner: As for random code outside of Python itself that is using `PyThread_exit_thread`: although I suppose there are legitimate use cases for `pthread_exit` and `_endthreadex`, these functions are only safe with the cooperation of the entire call stack of the thread. Additionally, while `pthread_exit` and `_endthreadex` have similar behavior for C code, they don't have the same behavior for C++ code, and that difference seems like a likely source of problems. Finally, I would say Python itself does not guarantee that its call stacks safely cooperate with `pthread_exit` (at the very least, there are sure to be memory leaks). Therefore, I think Python should not encourage its use by leaving it as a non-deprecated public API. If a user wishes to terminate a thread, they can invoke `pthread_exit` or `_endthreadex` directly, ideally without any Python functions in the call stack, and can refer to the documentation of those functions to understand the implications. gps: The reasons I believe hanging the thread is better than `pthread_exit`: - `pthread_exit` essentially throws an exception. In theory that means you could do proper cleanup via C++ destructors and/or re-throwing catch blocks. But in practice existing extension code is not designed to do that, and it would be quite a complex task to modify it to do proper cleanup, and on Windows the cleanup wouldn't run anyway. - Additionally, throwing an exception means if there is a `noexcept` function in the call stack, the program terminates. We would need to document that you aren't allowed to call Python APIs if there is a `noexcept` function in the call stack. If you have a `catch(...)` in the call stack, then you may inadvertently catch the exception and return control back to Python at a point that assumes it owns the GIL, which will cause all sorts of havoc. We would likewise need to document that you can't have a non-rethrowing `catch(...)` block in the call stack (I believe pybind11 has some of those). - Throwing an exception also means C++ destructors run. pybind11 has a smart pointer type that holds a `PyObject` and whose destructor calls `Py_DECREF`. That causes a crash when `pthread_exit` unwinds the stack, since the thread doesn't own the GIL. Those are the additional problems specific to `pthread_exit`. As gps noted, there is the additional problem of memory corruption common to both `pthread_exit` and `_endthreadex`: - Data structures that are accessible from other threads may contain pointers to data on the thread's stack. For example, with certain types of locks/signalling mechanisms it is common to store a linked list node on the stack that as then added to a list of waiting threads. If we destroy the thread stack without proper cleanup (and that proper cleanup definitely won't happen with `_endthreadex`, and probably in most cases still won't happen with `pthread_exit`), the data structure has now become corrupted. I don't think hanging the thread really increases the risk of deadlock over the status quo. In theory someone could have a C++ destructor that does some cleanup that safely pevents deadlock, but that is not portable to Windows, and I expect that properly implemented `pthread_exit`-safe code is extremely rare. I think we would want to ensure that Python itself is implemented in such a way as to not deadlock if Python-created threads with only Python functions in the call stack hang. Mostly that would amount to not holding mutexes while calling functions that may transitively attempt to acquire the GIL (or release and then re-acquire the GIL). That is probably a good practice for avoiding deadlock even when not finalizing. We would also want to document that external
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: I suppose calling `Py_Initialize`, `Py_FinalizeEx`, then `Py_Initialize` again, then `Py_FinalizeEx` again in an embedding application, was already not particularly well supported, since it would leak memory. However, with this change it also leaks threads. That is a bit unfortunate, but I suppose it is just another form of memory leak, and the user can avoid it by ensuring there are no daemon threads (of course even previously, the presence of any daemon threads meant additional memory leaking). -- ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Change by Jeremy Maitin-Shepard : -- keywords: +patch pull_requests: +26916 stage: -> patch review pull_request: https://github.com/python/cpython/pull/28525 ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: It looks like the `_thread` module does not actually expose `PyThread_exit_thread` --- the similarly named `thread_PyThread_exit_thread` just raises SystemExit. >From a search in the codebase, it appears `PyThread_exit_thread` is currently >used only to kill threads when they attempt to acquire the GIL during >finalization. Also, if it is changed to no longer kill the thread, it would probably make sense to rename it, e.g. to `PyThread_stop_thread_during_finalization`. -- ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: Regarding your suggestion of banning daemon threads: I happened to come across this bug not because of daemon threads but because of threads started by C++ code directly that call into Python APIs. The solution I am planning to implement is to add an `atexit` handler to prevent this problem. I do think it is reasonable to suggest that users should ensure daemon threads are exited cleanly via an atexit handler. However, in some cases that may be challenging to implement, and there is also the issue of backward compatibility. -- ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: Regarding your suggestion of adding a hook like `Py_SetThreadExitCallback`, it seems like there are 4 plausible behaviors that such a callback may implement: 1. Abort the process immediately with an error. 2. Exit immediately with the original exit code specified by the user. 3. Hang the thread. 4. Attempt to unwind the thread, like `pthread_exit`, calling pthread thread cleanup functions and C++ destructors. 5. Terminate the thread immediately without any cleanup or C++ destructor calls. The current behavior is (4) on POSIX platforms (`pthread_exit`), and (5) on Windows (`_endthreadex`). In general, achieving a clean shutdown will require the cooperation of all relevant code in the program, particularly code using the Python C API. Commonly the Python C API is used more by library code rather than application code, while it would presumably be the application that is responsible for setting this callback. Writing a library that supports multiple different thread shutdown behaviors would be particularly challenging. I think the callback is useful, but we would still need to discuss what the default behavior should be (hopefully different from the current behavior), and what guidance would be provided as far as what the callback is allowed to do. Option (1) is highly likely to result in a user-visible error --- a lot of Python programs that previously exited successfully will now, possibly only some of the time, exit with an error. The advantage is the user is alerted to the fact that some threads were not cleanly exited, but a lot of previously working code is now broken. This seems like a reasonable policy for a given application to impose (effectively requiring the use of an atexit handler to terminate all daemon threads), but does not seem like a reasonable default given the existing use of daemon threads. Option (2) would likely do the right thing in many cases, but main thread cleanup that was previously run would now be silently skipped. This again seems like a reasonable policy for a given application to impose, but does not seem like a reasonable default. Option (3) avoids the possibility of crashes and memory corruption. Since the thread stack remains allocated, any pointers to the thread stack held in global data structures or by other threads remain valid. There is a risk that the thread may be holding a lock, or otherwise block progress of the main thread, resulting in silent deadlock. That can be mitigated by registering an atexit handler. Option (4) in theory would allow cleanup handlers to be registered in order to avoid deadlock due to locks held. In practice, though, it causes a lot of problems: - The CPython codebase itself contains no such cleanup handlers, and I expect the vast majority of existing C extensionss are also not designed to properly handle the stack unwind triggered by `pthread_exit`. Without proper cleanup handlers, this option reverts to option (5), where there is a risk of memory corruption due to other threads accessing pointers to the freed thread stack. There is also the same risk of deadlock as in option (3). - Stack unwinding interacts particularly badly with common C++ usage because the very first thing most people want to do when using the Python C API from C++ is create a "smart pointer" type for holding a `PyObject` pointer that handles the reference counting automatically (calls `Py_INCREF` when copied, `Py_DECREF` in the destructor). When the stack unwinds due to `pthread_exit`, the current thread will NOT hold the GIL, and these `Py_DECREF` calls result in a crash / memory corruption. We would need to either create a new finalizing-safe version of Py_DECREF, that is a noop when called from a non-main thread if `_Py_IsFinalizing()` is true (and then existing C++ libraries like pybind11 would need to be changed to use it), or modify the existing `Py_DECREF` to always have that additional check. Other calls to Python C APIs in destructors are also common. - When writing code that attempts to be safe in the presence of stack unwinding due to `pthread_exit`, it is not merely explicitly GIL-related calls that are a concern. Virtually any Python C API function can transitively release and acquire the GIL and therefore you must defend against unwind from virtually all Python C API functions. - Some C++ functions in the call stack may unintentionally catch the exception thrown by `pthread_exit` and then return normally. If they return back to a CPython stack frame, memory corruption/crashing is likely. - Alternatively, some C++ functions in the call stack may be marked `noexcept`. If the unwinding reaches such a function, then we end up with option (1). - In general this option seems to require auditing and fixing a very large amount of existing code, and introduces a lot of complexity. For that reasons, I think this option
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Jeremy Maitin-Shepard added the comment: Another possible resolution would to simply make threads that attempt to acquire the GIL after Python starts to finalize hang (i.e. sleep until the process exits). Since the GIL can never be acquired again, this is in some sense the simplest way to fulfill the contract. This also ensures that any data stored on the thread call stack and referenced from another thread remains valid. As long as nothing on the main thread blocks waiting for one of these hung threads, there won't be deadlock. I have a case right now where a background thread (created from C++, which is similar to a daemon Python thread) acquires the GIL, and calls "call_soon_threadsafe" on an asycnio event loop. I think that causes some Python code internally to release the GIL at some point, after triggering some code to run on the main thread which happens to cause the program to exit. While `Py_FinalizeEx` is running, the call to "call_soon_threadsafe" completes on the background thread, attempts to re-acquire the GIL, which triggers a call to pthread_exit. That unwinds the C++ stack, which results in a call to Py_DECREF without the GIL held, leading to a crash. -- nosy: +jbms ___ Python tracker <https://bugs.python.org/issue42969> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com