[issue42969] pthread_exit & PyThread_exit_thread are harmful

2021-01-19 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

C-APIs such as `PyEval_RestoreThreads()` are insufficient for the task they are 
asked to do.  They return void, yet have a failure mode.

They call pthread_exit() on failure today.

Instead, they need to return an error to the calling application to indicate 
that "The Python runtime is no longer available."

Callers need to act on that in whatever way is most appropriate to them.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42969] pthread_exit & PyThread_exit_thread are harmful

2021-01-19 Thread Gregory P. Smith


New submission from Gregory P. Smith :

## BACKGROUND

`PyThread_exit_thread()` calls `pthread_exit(`) and is in turn called from a 
variety of APIs as documented in the C-API doc update from 
https://bugs.python.org/issue36427.

The `pthread_exit()` call was originally introduced as a way "resolve" a 
crashes from daemon threads during shutdown https://bugs.python.org/issue1856.  
It did that.  That fix to that even landed in 2.7.8 but was rolled back before 
2.7.9 due to a bug in an existing application it exposed at the time (did we 
miss the implications of that? maybe).  It remained in 3.x.

## PROBLEM

`pthread_exit()` cannot be used blindly by any application.  All code in the 
threaded application needs to be on board with it and always prepared for any 
API call they make to potentially lead to thread termination.  Quoting a 
colleague: "pthread_exit() does not result in stack unwind or local variable 
destruction".  This means that any code up the stack from the ultimate 
`pthread_exit()` call that has process-wide state implications that did not go 
out of its way to register cleanup with `pthread_cleanup_push()` could lead to 
deadlocks or lost resources. Something implausible to assume that code does.

We're seeing this happen with C/C++ code.  Our C++ builds with 
`-fno-exceptions` so uncatchable exception based stack unwinding as some 
pthread_exit implementations may trigger does not happen (and cannot be 
guaranteed anyways, see bpo-42888).  Said C/C++ code is calling back into 
Python from a thread and thus must use `PyEval_RestoreThread()` or similar APIs 
before performing Python C API calls.  If the interpreter is being finalized 
from another thread... these enter a codepath that ultimately calls 
`pthread_exit()` leaving corrupt state in the process.  In this case that 
unexpected thread simply disappearing can lead to a deadlock in our process.

Fundamentally I do not believe the CPython VM should ever call `pthread_exit()` 
when non-CPython frames are anywhere in the C stack.  This may mean we should 
never call `pthread_exit()` at all (unsure; but it'd be ideal).

The documentation suggests that all callers in user code of the four C-APIs 
with the documented `pthread_exit()` caveats need auditing and pre-call 
`_Py_IsFinalizing()` API checks.  But... I do not believe that would fix 
anything even if it were done.  `_Py_IsFinalizing()` called without the GIL 
held means that it could change by the time the `PyEval_RestoreThreads()` API 
calls it internally do determine if it should exit the thread.  Thus the race 
condition window would merely be narrowed, not eliminated.  Not good enough.

## CURRENT WORKAROUND (Big Hammer)

Change CPython to call abort() instead of pthread_exit() as that situation is 
unresolvable and the process dying is better than hanging, partially alive.  
That solution isn't friendly, but is better than being silent and allowing 
deadlock.  A failing process is always better than a hung process, especially a 
partially hung process.

## SEMI RELATED WORK

https://bugs.python.org/issue42888 - appears to be avoiding some 
`PyThread_exit_thread()` calls to stop some crashes due to `libgcc_s` being 
loaded on demand upon thread exit.

--
components: Interpreter Core
keywords: 3.2regression
messages: 385288
nosy: gregory.p.smith
priority: normal
severity: normal
status: open
title: pthread_exit & PyThread_exit_thread are harmful
type: behavior
versions: Python 3.10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com