3dbrows commented on PR #13633:
URL: https://github.com/apache/arrow/pull/13633#issuecomment-1205350024

   @pitrou @lidavidm 
   Thank you for your patience and meticulous review; as a first-time contributor (and a C++/Cython noob at that), I very much appreciate the friendliness and constructiveness shown here.
   
   Progress since last comments:
   
   * Introduced `GetAwsDefaultRetryStrategy()` and `GetAwsStandardRetryStrategy()` (a rough sketch follows below).
     * These still use `static_cast<long>(max_attempts)` because the AWS SDK expects `long` - avoidable? (`int64_t` and `long` have the same width on LP64 platforms, but `long` is 32-bit on Windows, so the cast can narrow there.)
     * These live in `S3RetryStrategy` as static methods.
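
   For concreteness, here is roughly the shape I have in mind - a sketch under assumptions rather than the exact diff: the simplified error-code interface and the `AwsRetryStrategyAdapter` name are placeholders for whatever the real wrapper ends up being.

   ```cpp
   // Rough sketch only. The simplified error-code interface and the
   // AwsRetryStrategyAdapter name are placeholders, not the PR's code.
   #include <cstdint>
   #include <memory>
   #include <utility>

   #include <aws/core/client/AWSError.h>
   #include <aws/core/client/CoreErrors.h>
   #include <aws/core/client/DefaultRetryStrategy.h>
   #include <aws/core/client/RetryStrategy.h>  // also declares StandardRetryStrategy

   class S3RetryStrategy {
    public:
     virtual ~S3RetryStrategy() = default;
     // Decide whether a failed attempt should be retried.
     virtual bool ShouldRetry(int error_code, int64_t attempted_retries) = 0;
     // Delay in milliseconds before the next attempt.
     virtual int64_t CalculateDelayBeforeNextRetry(int error_code,
                                                   int64_t attempted_retries) = 0;

     static std::shared_ptr<S3RetryStrategy> GetAwsDefaultRetryStrategy(
         int64_t max_attempts);
     static std::shared_ptr<S3RetryStrategy> GetAwsStandardRetryStrategy(
         int64_t max_attempts);
   };

   namespace {

   // Placeholder adapter forwarding to a wrapped AWS SDK strategy.
   class AwsRetryStrategyAdapter : public S3RetryStrategy {
    public:
     explicit AwsRetryStrategyAdapter(
         std::shared_ptr<Aws::Client::RetryStrategy> wrapped)
         : wrapped_(std::move(wrapped)) {}

     bool ShouldRetry(int error_code, int64_t attempted_retries) override {
       return wrapped_->ShouldRetry(ToAwsError(error_code),
                                    static_cast<long>(attempted_retries));
     }

     int64_t CalculateDelayBeforeNextRetry(int error_code,
                                           int64_t attempted_retries) override {
       return wrapped_->CalculateDelayBeforeNextRetry(
           ToAwsError(error_code), static_cast<long>(attempted_retries));
     }

    private:
     static Aws::Client::AWSError<Aws::Client::CoreErrors> ToAwsError(
         int error_code) {
       return Aws::Client::AWSError<Aws::Client::CoreErrors>(
           static_cast<Aws::Client::CoreErrors>(error_code), /*isRetryable=*/true);
     }

     std::shared_ptr<Aws::Client::RetryStrategy> wrapped_;
   };

   }  // namespace

   std::shared_ptr<S3RetryStrategy> S3RetryStrategy::GetAwsDefaultRetryStrategy(
       int64_t max_attempts) {
     // DefaultRetryStrategy(maxRetries, scaleFactor = 25): fixed, short delays.
     return std::make_shared<AwsRetryStrategyAdapter>(
         std::make_shared<Aws::Client::DefaultRetryStrategy>(
             static_cast<long>(max_attempts)));
   }

   std::shared_ptr<S3RetryStrategy> S3RetryStrategy::GetAwsStandardRetryStrategy(
       int64_t max_attempts) {
     // StandardRetryStrategy(maxAttempts): exponential backoff with a retry quota.
     // The SDK takes `long`, hence the cast; it is lossless on LP64, but `long`
     // is 32-bit on Windows (LLP64), so it can narrow there.
     return std::make_shared<AwsRetryStrategyAdapter>(
         std::make_shared<Aws::Client::StandardRetryStrategy>(
             static_cast<long>(max_attempts)));
   }
   ```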
   
   * Possibly controversial: changing the default behaviour of `s3fs` to use the AWS Standard retry strategy (see the sketch after this list).
     * This seems like a more sensible default and should not degrade the current user experience.
     * What prompted this PR in the first place was the current default of retrying 3x in quick succession, which mitigates `503 SlowDown` poorly; I think defaulting to exponential backoff will improve reliability.
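
   To make the proposed default concrete, the resolution could look something like this; it builds on the sketch above, and the `S3Options` field and `kDefaultRetryMaxAttempts` name are illustrative, not the actual option layout.

   ```cpp
   // Sketch of the proposed default, building on the S3RetryStrategy sketch
   // above; the S3Options layout and the constant name are illustrative.
   #include <cstdint>
   #include <memory>

   struct S3Options {
     // User-supplied strategy; null means "use the library default".
     std::shared_ptr<S3RetryStrategy> retry_strategy;
   };

   constexpr int64_t kDefaultRetryMaxAttempts = 3;

   std::shared_ptr<S3RetryStrategy> ResolveRetryStrategy(const S3Options& options) {
     if (options.retry_strategy != nullptr) {
       return options.retry_strategy;  // an explicit user choice always wins
     }
     // New default: exponential backoff (AWS Standard) rather than three
     // immediate retries, which handle 503 SlowDown poorly.
     return S3RetryStrategy::GetAwsStandardRetryStrategy(kDefaultRetryMaxAttempts);
   }
   ```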
   
   * Still to do:
     * Testing.
     * The JIRA ticket ARROW-17057 should be set to In Progress, but I do not seem to have the privileges to do that.
   
   Questions:
     * Should I be defining `S3RetryStrategy::GetAwsDefaultRetryStrategy` or just `GetAwsDefaultRetryStrategy` in `s3fs.cc`?
     * I need to figure out how to encapsulate `retry_strategy` and `retry_max_attempts` nicely (avoiding two parameters); one idea is sketched below.
     * How to handle `S3FileSystem._reconstruct()`: do the retry parameters need to be passed in?
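
   On the encapsulation question, one idea (again building on the sketches above, nothing settled) is to fold `max_attempts` into the strategy object at construction time, so the options carry a single knob:

   ```cpp
   // Building on the sketches above: the attempt count is baked into the
   // strategy, so S3Options needs only one field instead of two parameters.
   S3Options options;
   options.retry_strategy =
       S3RetryStrategy::GetAwsStandardRetryStrategy(/*max_attempts=*/5);
   // ResolveRetryStrategy(options) then returns the user's strategy as-is, and
   // max_attempts never travels separately - which should also simplify what
   // S3FileSystem._reconstruct() has to round-trip.
   ```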