wjones127 opened a new issue, #93:
URL: https://github.com/apache/arrow-rs-object-store/issues/93

   **Describe the bug**
   
   Instead of waiting until the data passed to the writer has been uploaded 
before returning ready, we buffer it until there is enough data and then push 
the request future into a `FuturesUnordered`. I thought this was pretty clever 
when I originally wrote it, but it has a big flaw: the request futures are only 
polled during write calls. If a lot of time passes between write calls, the 
in-flight requests go all that time without being polled and make no progress. 
This can cause timeouts such as:
   
   ```
   Generic S3 error: Error after 0 retries in 67.379407043s, max_retries:10, 
retry_timeout:180s, source:error sending request for url (s3://...): operation 
timed out
   ```
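
To make the failure mode concrete, here is a minimal, dependency-free sketch. It is not the `object_store` code: `FakeRequest`, `NoopWake`, and `upload_with_gap` are hypothetical names, and the 200 ms deadline is an arbitrary stand-in for the HTTP client timeout. The toy future only advances, and only notices its deadline, when it is polled, so a long enough gap between polls turns a healthy request into a timeout.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for an in-flight upload request.
/// It makes one step of progress per poll and needs several polls to finish.
struct FakeRequest {
    deadline: Instant,
    polls_left: u32,
}

impl Future for FakeRequest {
    type Output = Result<(), String>;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // The deadline is only checked here, i.e. when someone polls us.
        if Instant::now() > self.deadline {
            return Poll::Ready(Err("operation timed out".to_string()));
        }
        self.polls_left -= 1;
        if self.polls_left == 0 {
            Poll::Ready(Ok(()))
        } else {
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the request once per simulated write call, with `gap` between calls.
fn upload_with_gap(gap: Duration) -> Result<(), String> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut req = FakeRequest {
        deadline: Instant::now() + Duration::from_millis(200), // assumed timeout
        polls_left: 3,
    };
    loop {
        match Pin::new(&mut req).poll(&mut cx) {
            Poll::Ready(out) => return out,
            // The caller is busy elsewhere; nothing polls the request meanwhile.
            Poll::Pending => thread::sleep(gap),
        }
    }
}

fn main() {
    println!("no gap:    {:?}", upload_with_gap(Duration::from_millis(0)));
    println!("300ms gap: {:?}", upload_with_gap(Duration::from_millis(300)));
}
```

In this sketch the request succeeds when the simulated write calls arrive back to back and fails once the gap exceeds the deadline, even though nothing is wrong with the request itself.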
   
   **To Reproduce**
   
   I discovered this downstream in some code that uses multipart uploads: 
https://github.com/lancedb/lance/issues/1878#issuecomment-1928748932
   
   **Expected behavior**
   
   I think the best solution is to run the upload tasks inside a background 
task created with `tokio::task::spawn()`, so that they keep being polled even 
when the caller is idle between write calls.
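
The shape of that idea can be sketched without any dependencies. This is only an illustration under stated assumptions: it uses a `std::thread` and an `mpsc` channel as a stand-in for a spawned tokio task, and `upload_part`, `spawn_uploader`, and `slow_writer` are hypothetical names, not `object_store` APIs. The point is that the worker drains and uploads parts on its own schedule, independent of how often the writer calls in.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for uploading one part; a real implementation
/// would issue an HTTP request here.
fn upload_part(part: Vec<u8>) -> usize {
    thread::sleep(Duration::from_millis(10)); // pretend network latency
    part.len()
}

/// Starts a background worker. Returns a sender for parts and a handle
/// that yields the total number of bytes uploaded once all senders are dropped.
fn spawn_uploader() -> (mpsc::Sender<Vec<u8>>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let handle = thread::spawn(move || {
        let mut total = 0;
        // Runs independently of the writer: each part is uploaded as soon
        // as it arrives, no matter how long the writer waits between sends.
        for part in rx {
            total += upload_part(part);
        }
        total
    });
    (tx, handle)
}

/// Simulates a caller that pauses for `gap` between write calls.
fn slow_writer(parts: usize, gap: Duration) -> usize {
    let (tx, handle) = spawn_uploader();
    for i in 0..parts {
        tx.send(vec![i as u8; 1024]).expect("uploader exited early");
        thread::sleep(gap); // the caller is busy with something else
    }
    drop(tx); // signal that no more parts are coming
    handle.join().expect("uploader panicked")
}

fn main() {
    let uploaded = slow_writer(3, Duration::from_millis(50));
    println!("uploaded {} bytes", uploaded);
}
```

With tokio, the worker would presumably be an async task created with `tokio::task::spawn()` rather than a thread, and a bounded channel could supply backpressure so the writer cannot buffer unbounded data.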
   
   **Additional context**
   
   It's quite possible this has the same underlying cause as 
https://github.com/apache/arrow-rs/issues/5106

