thisisnic commented on issue #50009:
URL: https://github.com/apache/arrow/issues/50009#issuecomment-4556701731

   I had Claude take a look at this, and here's its summary/suggestions. This 
is a bit beyond my C++ knowledge; @jonkeane - does this look about right to you?
   
   ---
   
   This is a known issue with the AWS C++ SDK. During `Aws::ShutdownAPI()`, the 
SDK calls `curl_easy_cleanup` on pooled HTTP handles. If those connections have 
gone stale (S3 closes idle connections after [a few 
seconds](https://repost.aws/questions/QU9abzqZn6R7K3KNqJ7EiKww/s3-idle-connection-timeout)),
 curl attempts an SSL shutdown handshake on dead sockets, which triggers 
SIGPIPE 
([aws/aws-sdk-cpp#1685](https://github.com/aws/aws-sdk-cpp/issues/1685), 
[aws/aws-sdk-cpp#1220](https://github.com/aws/aws-sdk-cpp/issues/1220)).
   
   The SDK has an `installSigPipeHandler` option for exactly this, but it 
defaults to `false` because signal handlers are process-global. There's an 
[open issue arguing it should default to 
true](https://github.com/aws/aws-sdk-cpp/issues/2323) — the Velox team hit the 
same problem.
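
To illustrate why having SIGPIPE handled matters, here is a minimal POSIX sketch. It is not SDK code: the function name is hypothetical and it simply ignores SIGPIPE rather than installing a logging handler. With the signal's default action the process would be terminated, whereas with it handled the write to a dead peer fails with `EPIPE` and the caller can carry on.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: writes one byte to a pipe whose read end has already
// been closed, with SIGPIPE temporarily ignored.
// Returns the errno of the failed write, 0 if the write unexpectedly
// succeeded, or -1 if the setup itself failed.
int WriteToClosedPipeWithSigpipeIgnored() {
  struct sigaction ignore_action {};
  ignore_action.sa_handler = SIG_IGN;
  sigemptyset(&ignore_action.sa_mask);

  struct sigaction previous_action {};
  if (sigaction(SIGPIPE, &ignore_action, &previous_action) != 0) {
    return -1;
  }

  int fds[2];
  if (pipe(fds) != 0) {
    sigaction(SIGPIPE, &previous_action, nullptr);
    return -1;
  }

  // Closing the read end plays the role of the peer that went away.
  close(fds[0]);

  const char byte = 'x';
  errno = 0;
  const ssize_t written = write(fds[1], &byte, 1);
  const int saved_errno = (written == -1) ? errno : 0;

  close(fds[1]);
  // Restore whatever disposition was in place before, since signal
  // dispositions are process-global.
  sigaction(SIGPIPE, &previous_action, nullptr);
  return saved_errno;
}
```

Calling `WriteToClosedPipeWithSigpipeIgnored()` should return `EPIPE` on a POSIX system. The restore step at the end reflects the same concern the SDK has about process-global signal state.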
   
   Arrow initialises S3 via `EnsureS3Initialized()` with defaults, so 
`installSigPipeHandler` is off. On top of that, the R binding wraps the call in 
`StopIfNotOk()` 
([`r/src/filesystem.cpp:355`](https://github.com/apache/arrow/blob/main/r/src/filesystem.cpp#L355)),
 which does a `longjmp` via `cpp11::stop()` if the shutdown returns non-OK — 
interrupting the teardown mid-way and leaving the SDK half-destroyed, causing 
the segfault.
   
   **Suggested fix (two parts):**
   
   1. Replace `StopIfNotOk(fs::FinalizeS3())` with a warning so shutdown 
completes even if there are errors (longjmping out of `Aws::ShutdownAPI()` is 
never safe)
   2. Set `installSigPipeHandler = true` when R initialises S3, so the SDK 
handles SIGPIPE gracefully during cleanup
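
The first part of the fix could look roughly like the sketch below. Everything in it is a stand-in: `Status`, `FinalizeS3Stub()` and `WarnIfNotOk()` are hypothetical names used so the example compiles on its own, and it writes to stderr rather than going through R's warning machinery. The only point is the control flow: a non-OK result is reported and the function returns normally, with no non-local jump.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for arrow::Status, reduced to what the sketch needs.
struct Status {
  bool ok;
  std::string message;
};

// Hypothetical stand-in for fs::FinalizeS3() that simulates a failed shutdown.
Status FinalizeS3Stub() {
  return Status{false, "simulated shutdown error"};
}

// Reports a non-OK status as a warning on stderr and returns normally,
// so the caller's teardown sequence is never interrupted.
// Returns true if the status was OK, false otherwise.
bool WarnIfNotOk(const Status& status) noexcept {
  if (status.ok) {
    return true;
  }
  std::fprintf(stderr, "Warning: S3 finalization reported: %s\n",
               status.message.c_str());
  return false;
}
```

A caller would then write `WarnIfNotOk(FinalizeS3Stub());` where it previously used the throwing wrapper. How the warning should actually be surfaced to the R session is left open here.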

