David Li created ARROW-16597:
--------------------------------

             Summary: [Python][FlightRPC] Active server may segfault if Python interpreter shuts down
                 Key: ARROW-16597
                 URL: https://issues.apache.org/jira/browse/ARROW-16597
             Project: Apache Arrow
          Issue Type: Bug
          Components: FlightRPC, Python
    Affects Versions: 8.0.0
            Reporter: David Li
            Assignee: David Li

On Linux, the script below reliably segfaults for me with {{FATAL: exception not rethrown}}. Adding a {{server.shutdown()}} call to the end fixes it.

The reason is that the Python interpreter exits after running the script, and other Python threads [call PyThread_exit_thread|https://github.com/python/cpython/blob/v3.10.4/Python/ceval_gil.h#L221]. But one of the Python threads is currently in the middle of executing the RPC handler. {{PyThread_exit_thread}} boils down to {{pthread_exit}}, which works by throwing an exception that it expects will not be caught. But gRPC places a {{catch(...)}} around RPC handlers and catches this exception, and pthreads then aborts because the exception was caught and not rethrown.

We should force servers to shut down at exit to avoid this.

{code:python}
import traceback

import pyarrow as pa
import pyarrow.flight as flight


class Server(flight.FlightServerBase):
    def do_put(self, context, descriptor, reader, writer):
        # Fail every DoPut call so the client sees a FlightError.
        raise flight.FlightCancelledError("foo", extra_info=b"bar")


print("PyArrow version:", pa.__version__)
server = Server("grpc://localhost:0")
client = flight.connect(f"grpc://localhost:{server.port}")
schema = pa.schema([])

writer, reader = client.do_put(flight.FlightDescriptor.for_command(b""), schema)
try:
    writer.done_writing()
except flight.FlightError as e:
    traceback.print_exc()
    print(e.extra_info)
except Exception:
    traceback.print_exc()
# The script ends here without server.shutdown(), so the interpreter
# exits while the server is still active.
{code}