thisisnic commented on issue #50009: URL: https://github.com/apache/arrow/issues/50009#issuecomment-4556701731
I had Claude take a look at this, and here's its summary/suggestions. This is a bit beyond my C++ knowledge; @jonkeane - does this look about right to you?

---

This is a known issue with the AWS C++ SDK. During `Aws::ShutdownAPI()`, the SDK calls `curl_easy_cleanup` on pooled HTTP handles. If those connections have gone stale (S3 closes idle connections after [a few seconds](https://repost.aws/questions/QU9abzqZn6R7K3KNqJ7EiKww/s3-idle-connection-timeout)), curl attempts an SSL shutdown handshake on dead sockets, which triggers SIGPIPE ([aws/aws-sdk-cpp#1685](https://github.com/aws/aws-sdk-cpp/issues/1685), [aws/aws-sdk-cpp#1220](https://github.com/aws/aws-sdk-cpp/issues/1220)).

The SDK has an `installSigPipeHandler` option for exactly this, but it defaults to `false` because signal handlers are process-global. There's an [open issue arguing it should default to true](https://github.com/aws/aws-sdk-cpp/issues/2323); the Velox team hit the same problem.

Arrow initialises S3 via `EnsureS3Initialized()` with defaults, so `installSigPipeHandler` is off. On top of that, the R binding wraps the `fs::FinalizeS3()` call in `StopIfNotOk()` ([`r/src/filesystem.cpp:355`](https://github.com/apache/arrow/blob/main/r/src/filesystem.cpp#L355)), which does a `longjmp` via `cpp11::stop()` if the shutdown returns non-OK. That interrupts the teardown mid-way and leaves the SDK half-destroyed, which causes the segfault.

**Suggested fix (two parts):**

1. Replace `StopIfNotOk(fs::FinalizeS3())` with a warning so shutdown completes even if there are errors (longjmping out of `Aws::ShutdownAPI()` is never safe)
2. Enable `installSigPipeHandler = true` when R initialises S3, so the SDK handles SIGPIPE gracefully during cleanup
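
To illustrate the SIGPIPE mechanism discussed above, here is a minimal POSIX-only sketch. It is not the SDK's or Arrow's code: the function name `WriteToClosedPipeErrno` is hypothetical, and a local pipe with its read end closed stands in for a stale S3 connection. It only shows the general idea behind a SIGPIPE handler: once the signal is ignored, a write to a dead peer fails with `EPIPE` instead of terminating the process.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper, for illustration only.
// Writes one byte to a pipe whose read end is already closed and returns the
// resulting errno (0 if the write unexpectedly succeeds, -1 if pipe() fails).
int WriteToClosedPipeErrno() {
  // Ignore SIGPIPE process-wide, remembering the previous disposition.
  struct sigaction ignore_action {};
  struct sigaction previous_action {};
  ignore_action.sa_handler = SIG_IGN;
  sigemptyset(&ignore_action.sa_mask);
  sigaction(SIGPIPE, &ignore_action, &previous_action);

  int result = -1;
  int fds[2];
  if (pipe(fds) == 0) {
    close(fds[0]);  // no reader left: stands in for a peer that hung up

    const char byte = 'x';
    // Without the SIG_IGN above, this write would raise SIGPIPE, whose
    // default action is to terminate the process.
    result = (write(fds[1], &byte, 1) < 0) ? errno : 0;
    close(fds[1]);
  }

  // Restore the previous disposition so the caller's setup is untouched.
  sigaction(SIGPIPE, &previous_action, nullptr);
  return result;
}
```

Calling `WriteToClosedPipeErrno()` should return `EPIPE`. Because the signal disposition is process-global, the sketch restores the previous one before returning; that same global scope is the reason given above for the SDK option defaulting to `false`.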
