adamreeve commented on issue #46930: URL: https://github.com/apache/arrow/issues/46930#issuecomment-3101618688
@alketm put me in touch with some of the developers on the CRT team, who pointed out that there are configuration options specific to the S3-CRT client that could be exposed to allow configuring its behaviour. For uploads there's a [`multipart_upload_threshold`](https://github.com/awslabs/aws-c-s3/blob/36e2c3717a65ada2049602e147553999babdac4d/include/aws/s3/s3_client.h#L486) option, and for both uploads and downloads the [`part_size`](https://github.com/awslabs/aws-c-s3/blob/36e2c3717a65ada2049602e147553999babdac4d/include/aws/s3/s3_client.h#L468) option is relevant. I think exposing these would be useful so that users can tune them for the best performance on their systems.

There's also some documentation on how downloads are handled that's useful to know: https://github.com/awslabs/aws-c-s3/blob/main/docs/images/GetObjectFlow.svg

I will aim to update my branch soon to switch the S3 FileSystem over to the S3-CRT client rather than having both implementations available, but I probably won't have a lot of time to test and benchmark this, so I will likely need some help with that.
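
To make the effect of the two options concrete, here is a rough C++ sketch of how `part_size` and `multipart_upload_threshold` could interact for an upload. Everything named here (`CrtTuningOptions`, `PlanUpload`, `EffectiveThreshold`) is hypothetical and not part of Arrow or aws-c-s3. The 8 MiB default part size and the fallback of the threshold to the larger of `part_size` and 5 MiB are assumptions based on my reading of the `s3_client.h` comments, and the real client may adjust sizes differently. The 5 MiB minimum part size and the 10,000-part limit are the standard S3 multipart limits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical options struct mirroring the two aws-c-s3 settings discussed
// above. Not an existing Arrow or aws-c-s3 type.
struct CrtTuningOptions {
  uint64_t part_size = 8ULL * 1024 * 1024;  // assumed default: 8 MiB
  uint64_t multipart_upload_threshold = 0;  // 0 means "not set"
};

// S3 service limits for multipart uploads.
constexpr uint64_t kMinUploadPartSize = 5ULL * 1024 * 1024;
constexpr uint64_t kMaxUploadParts = 10000;

struct UploadPlan {
  bool multipart;      // true if the upload would be split into parts
  uint64_t part_size;  // size of each part (last part may be smaller)
  uint64_t num_parts;  // number of requests carrying object data
};

// Assumption: when the threshold is unset, fall back to max(part_size, 5 MiB).
inline uint64_t EffectiveThreshold(const CrtTuningOptions& opts) {
  if (opts.multipart_upload_threshold != 0) {
    return opts.multipart_upload_threshold;
  }
  return std::max(opts.part_size, kMinUploadPartSize);
}

inline UploadPlan PlanUpload(const CrtTuningOptions& opts, uint64_t object_size) {
  assert(opts.part_size > 0);
  // Objects at or below the threshold go up in a single request.
  if (object_size <= EffectiveThreshold(opts)) {
    return {false, object_size, 1};
  }
  // Respect the minimum part size, then grow the part size if needed so the
  // part count stays within the 10,000-part limit.
  uint64_t part = std::max(opts.part_size, kMinUploadPartSize);
  uint64_t min_part_for_limit = (object_size + kMaxUploadParts - 1) / kMaxUploadParts;
  part = std::max(part, min_part_for_limit);
  uint64_t num_parts = (object_size + part - 1) / part;  // ceiling division
  return {true, part, num_parts};
}
```

Under these assumptions, a 100 MiB upload with the default options would be split into 13 parts of up to 8 MiB each, while a 4 MiB upload would go out as a single request. Raising `part_size` trades fewer requests for less parallelism, which is why it seems worth letting users tune it.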