[ https://issues.apache.org/jira/browse/HADOOP-16792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022944#comment-17022944 ]
Steve Loughran commented on HADOOP-16792:
-----------------------------------------

committed to trunk

h2. Mustafa's scale test report from the github PR

I ran ITestS3AHugeFilesDiskBlocks#test_010_CreateHugeFile with some combinations. The first experiments used the default file size and partition size for huge files.

For the first experiment I set the request timeout to 1 ms. The test filesystem failed to initialize, because the verifyBuckets call at startup times out repeatedly. That call is retried within the AWS SDK up to `com.amazonaws.ClientConfiguration#maxErrorRetry` times; this value is configurable on the Hadoop side via the property `fs.s3a.attempts.maximum`. All of these retries are opaque to Hadoop. At the end of the retry cycle, the AWS SDK returns the failure to Hadoop's Invoker, which then evaluates whether to retry the operation according to its configured retry policies. I saw that the verifyBuckets call was not retried at the Invoker level.

In a follow-up experiment I set the request timeout to 200 ms, which is enough for the verifyBuckets call to succeed but short enough that multipart uploads fail. In these cases, the AWS SDK again retries the HTTP requests up to `maxErrorRetry` times. After an HTTP request has failed `maxErrorRetry` times, the Invoker's retry mechanism kicks in: I observed the Invoker retrying these operations up to `fs.s3a.retry.limit` times, conforming to the configured exponential-backoff limited retry policy. After all these `fs.s3a.retry.limit` * `maxErrorRetry` retries, the Invoker bubbles up an AWSClientIOException to the user, as shown below:

{code}
org.apache.hadoop.fs.s3a.AWSClientIOException: upload part on tests3ascale/disk/hugefile: com.amazonaws.SdkClientException: Unable to execute HTTP request: Request did not complete before the request timeout configuration.: Unable to execute HTTP request: Request did not complete before the request timeout configuration.
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:205)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:112)
	at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:315)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:407)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:311)
{code}

Later, I ran the test with a 256M file size and a 32M partition size, with the request timeout set to 5 s. My goal was to introduce a few retries due to the short request timeout, but still complete the upload operation through those retries. I managed to do that: I saw some retries caused by the short request timeout, but they succeeded and the upload operation completed. The test failed anyway, because it also expects `TRANSFER_PART_FAILED_EVENT` to be 0, which is obviously not the case here: some transfers failed but were then retried. I checked S3 and verified that the file was there. I also verified that the temporary partition files had been cleared from my local drive.

When I ran the same experiment with an 8GB file and 128M partitions but a small request timeout, the test failed because the uploads could not complete. I also ran a soak test with 8GB files and a large request timeout. This passed fine, as expected, because the timeout value was high enough to let the uploads complete.

> Let s3 clients configure request timeout
> ----------------------------------------
>
>                 Key: HADOOP-16792
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16792
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mustafa Iman
>            Assignee: Mustafa Iman
>            Priority: Major
>
> S3 does not guarantee latency. Every once in a while a request may straggle
> and drive latency up for the greater procedure. In these cases, simply
> timing out the individual request is beneficial so that the client
> application can retry. The retry tends to complete faster than the original
> straggling request most of the time.
> Others experienced this issue too:
> [https://arxiv.org/pdf/1911.11727.pdf]
>
> The S3 client configuration already provides a timeout facility via
> `ClientConfiguration#setRequestTimeout`. Exposing this configuration is
> beneficial for latency-sensitive applications. The S3 client configuration
> is shared with the DynamoDB client, which is also affected by unreliable
> worst-case latency.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
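As a footnote to the report above: the two retry layers described there compound, since every Invoker-level attempt lets the AWS SDK retry the HTTP request internally. The sketch below is not part of the patch; it only illustrates that worst-case multiplication. The property names are the real S3A options mentioned in the report, but the exact off-by-one accounting (initial attempt vs. retries) is an assumption drawn from the report, not from the Hadoop source.

```java
// Illustrative sketch, not Hadoop code: worst-case HTTP request count for one
// operation under the two retry layers described in the scale test report.
public class S3ARetryBudget {

    /**
     * Each Invoker-level attempt (1 initial + retryLimit retries, governed by
     * fs.s3a.retry.limit) lets the AWS SDK issue 1 initial HTTP request plus
     * maxErrorRetry internal retries (governed by fs.s3a.attempts.maximum).
     * The worst case is therefore the product of the two attempt counts.
     */
    static int worstCaseHttpRequests(int maxErrorRetry, int retryLimit) {
        int sdkAttemptsPerInvokerTry = 1 + maxErrorRetry;
        int invokerTries = 1 + retryLimit;
        return invokerTries * sdkAttemptsPerInvokerTry;
    }

    public static void main(String[] args) {
        // With e.g. maxErrorRetry = 3 and retryLimit = 2, a persistently
        // timing-out request can be sent up to (1+2) * (1+3) = 12 times
        // before AWSClientIOException reaches the caller.
        System.out.println(worstCaseHttpRequests(3, 2)); // 12
    }
}
```

This is why a very short request timeout, as in the 1 ms and 200 ms experiments, produces a long opaque retry cycle before any failure surfaces to the application.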