I have a local S3-compatible object store (using Ceph) and am trying to use
the pyarrow fs interface. This seems to work well, except that on larger
objects I get unhandled timeout exceptions. Is there currently a way to tune
the timeouts or retries? Here is the kind of code and error I am seeing:
from pyarrow import fs

s3 = fs.S3FileSystem(access_key=my_ak, secret_key=my_sk,
                     endpoint_override=my_endpoint, scheme='http')
raw = s3.open_input_stream('test_bucket/example_key').readall()
File "pyarrow/_fs.pyx", line 621, in
pyarrow._fs.FileSystem.open_input_stream
File "pyarrow/error.pxi", line 122, in
pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: When reading information for key 'example_key' in bucket
'test_bucket': AWS Error [code 99]: curlCode: 28, Timeout was reached
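
For now I'm working around it by retrying the whole read at the application
level, along these lines (just a rough sketch; the read_with_retries helper
is my own, and the attempt count and backoff values are arbitrary):

import time
from pyarrow import fs

s3 = fs.S3FileSystem(access_key=my_ak, secret_key=my_sk,
                     endpoint_override=my_endpoint, scheme='http')

def read_with_retries(path, attempts=5, backoff=1.0):
    # Retry the full read whenever the timeout surfaces as the OSError above.
    for attempt in range(attempts):
        try:
            with s3.open_input_stream(path) as stream:
                return stream.readall()
        except OSError:
            if attempt == attempts - 1:
                raise
            # Back off a little longer after each failed attempt.
            time.sleep(backoff * (attempt + 1))

raw = read_with_retries('test_bucket/example_key')

This works, but it re-reads the object from the start on every failure, so a
filesystem-level timeout/retry setting would be much nicer if one exists.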
--
install details:
Python: 3.8.6
OS: Linux, Red Hat 7.7
pyarrow version: 3.0.0
Thanks for the help,
Luke