Vadym Dytyniak created ARROW-18228:
--------------------------------------

             Summary: AWS Error SLOW_DOWN during PutObject operation
                 Key: ARROW-18228
                 URL: https://issues.apache.org/jira/browse/ARROW-18228
             Project: Apache Arrow
          Issue Type: Bug
    Affects Versions: 10.0.0
            Reporter: Vadym Dytyniak


We use Dask to parallelise read/write operations and pyarrow to write datasets 
from worker nodes.

After pyarrow 10.0.0 was released, our data flows automatically picked up the 
new version and some of them started to fail with the following error:
{code:java}
  File "/usr/local/lib/python3.10/dist-packages/org/store/storage.py", line 768, in _write_partition
    ds.write_dataset(
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/dataset.py", line 988, in write_dataset
    _filesystemdataset_write(
  File "pyarrow/_dataset.pyx", line 2859, in pyarrow._dataset._filesystemdataset_write
    check_status(CFileSystemDataset.Write(c_options, c_scanner))
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
    raise IOError(message)
OSError: When creating key 'equities.us.level2.by_security/' in bucket 'org-prod': AWS Error SLOW_DOWN during PutObject operation: Please reduce your request rate.
{code}
Do you have any idea what changed in dataset writing between 9.0.0 and 
10.0.0, so that we can fix the issue?
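
For context, here is a minimal sketch of a client-side stopgap: retrying the 
write with exponential backoff when the error text contains SLOW_DOWN. This is 
not a fix for whatever changed in 10.0.0, and the helper names 
(is_slow_down, retry_on_slow_down) and parameters are hypothetical, not part 
of pyarrow. It relies only on the fact, visible in the traceback above, that 
the failure surfaces as an OSError whose message contains "SLOW_DOWN".

```python
import random
import time


def is_slow_down(exc):
    """Return True if the error text looks like S3 request throttling."""
    return "SLOW_DOWN" in str(exc)


def retry_on_slow_down(write_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call write_fn(), retrying with exponential backoff on SLOW_DOWN errors.

    write_fn is any zero-argument callable, for example a lambda that wraps
    ds.write_dataset(...). All names here are hypothetical helpers.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except OSError as exc:
            # Re-raise anything that is not throttling, and the final failure.
            if not is_slow_down(exc) or attempt == max_attempts:
                raise
            # Backoff doubles each attempt; jitter adds up to one base_delay.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

A call site might look like 
retry_on_slow_down(lambda: ds.write_dataset(table, base_dir, format="parquet", filesystem=fs)), 
where table, base_dir and fs stand in for our own objects. Whether a retried 
write is safe depends on the existing_data_behavior setting and on what the 
failed attempt had already written, so this is only a sketch.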



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
