steveloughran commented on issue #1814: HADOOP-16823. Manage S3 Throttling 
exclusively in S3A client.
URL: https://github.com/apache/hadoop/pull/1814#issuecomment-579910544
 
 
   
   
   My "little" fix to turn off retries in the AWS client causes problems in the DDB clients wherever there's a significant mismatch between provisioned IO and load; ITestDynamoDBMetadataStoreScale is the example of this.
   
   
   Looking at the AWS metrics, part of the fun is how bursty traffic is handled: you may get your capacity at the time of the initial load, but get blocked afterwards. That is: the throttling may not happen under load, but on the next low-load API call.
   
   
   Also, S3GuardTableAccess isn't retrying, so some code in the tests and in the purge/dump table entry points goes on to fail when throttling happens while iterating through scans. Fix: you can now ask a DDBMetastore to wrap your scan with one bound to its retry logic and metrics...plus use of this where appropriate.
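A minimal sketch of that idea: the metastore hands out a wrapper that runs each operation under its own retry policy and counts throttle events in its metrics. The class and exception names here (RetryingInvoker, ThrottledException) are illustrative stand-ins, not the actual Hadoop/S3Guard API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

public class RetryingInvoker {
    /** Stand-in for the SDK's ProvisionedThroughputExceededException. */
    public static class ThrottledException extends RuntimeException {}

    private final int maxAttempts;
    private final long baseDelayMs;
    // Throttle events surface in metrics rather than being silently swallowed.
    private final AtomicLong throttleEvents = new AtomicLong();

    public RetryingInvoker(int maxAttempts, long baseDelayMs) {
        this.maxAttempts = maxAttempts;
        this.baseDelayMs = baseDelayMs;
    }

    /** Run an operation, retrying with exponential backoff on throttling. */
    public <T> T retry(Supplier<T> operation) {
        ThrottledException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (ThrottledException e) {
                last = e;
                throttleEvents.incrementAndGet();     // record the event
                sleepQuietly(baseDelayMs << attempt); // exponential backoff
            }
        }
        throw last;  // retries exhausted: propagate to the caller
    }

    public long getThrottleEvents() { return throttleEvents.get(); }

    private static void sleepQuietly(long ms) {
        try { Thread.sleep(ms); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

A scan wrapped this way both survives transient throttling and leaves evidence of it in the store's metrics, which is exactly what the scale tests want to assert on.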
   
   ITestDynamoDBMetadataStoreScale is really slow; either the changes make it worse, or it's always been really slow and we haven't noticed because it was happening during the (slow) parallel test runs. Proposed: we review it, look at what we want to show, and then see if we can make things fail faster.
   
   The latest patch makes the SDK throttling disablement exclusive to S3, fixes up the DDB clients to retry better, and tries to make a better case for the ITestDynamoDBMetadataStoreScale suite.
   
   I think I'm going to tune those tests to always downgrade to skipped if no throttling is detected.
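A sketch of that downgrade, in plain Java to keep it self-contained (in JUnit this would be an `Assume`/`AssumptionViolatedException`; the `SkipTestException` here is a hypothetical stand-in):

```java
public class ThrottleAssertions {
    /** Hypothetical stand-in for JUnit's AssumptionViolatedException. */
    public static class SkipTestException extends RuntimeException {
        public SkipTestException(String message) { super(message); }
    }

    /**
     * If no throttling of any kind was tracked, downgrade the test to
     * skipped rather than failing it outright.
     * @return true if throttling was observed
     */
    public static boolean requireThrottlingOrSkip(long readThrottles,
            long writeThrottles, long batchThrottles, long scanThrottles) {
        long total = readThrottles + writeThrottles
            + batchThrottles + scanThrottles;
        if (total == 0) {
            throw new SkipTestException(
                "No throttling detected; downgrading test to skipped");
        }
        return true;
    }
}
```

That way a table with generous provisioning (or on-demand capacity) produces skips instead of the "No throttling detected" assertion failures in the log below.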
   
   ```
   [INFO] -------------------------------------------------------
   [INFO] Running 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale
   [ERROR] Tests run: 11, Failures: 5, Errors: 1, Skipped: 0, Time elapsed: 
190.404 s <<< FAILURE! - in 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale
   [ERROR] 
test_030_BatchedWrite(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale)
  Time elapsed: 10.259 s  <<< FAILURE!
   java.lang.AssertionError: No throttling detected in Tracker with read 
throttle events = 0; write throttles = 0; batch throttles = 0; scan throttles = 
0 against DynamoDBMetadataStore{region=eu-west-1, tableName=s3guard-metadata, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/s3guard-metadata}
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.execute(ITestDynamoDBMetadataStoreScale.java:578)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.test_030_BatchedWrite(ITestDynamoDBMetadataStoreScale.java:285)
   
   [ERROR] 
test_040_get(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale)  
Time elapsed: 4.15 s  <<< FAILURE!
   java.lang.AssertionError: No throttling detected in Tracker with read 
throttle events = 0; write throttles = 0; batch throttles = 0; scan throttles = 
0 against DynamoDBMetadataStore{region=eu-west-1, tableName=s3guard-metadata, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/s3guard-metadata}
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.execute(ITestDynamoDBMetadataStoreScale.java:578)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.test_040_get(ITestDynamoDBMetadataStoreScale.java:341)
   
   [ERROR] 
test_050_getVersionMarkerItem(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale)
  Time elapsed: 3.311 s  <<< FAILURE!
   java.lang.AssertionError: No throttling detected in Tracker with read 
throttle events = 0; write throttles = 0; batch throttles = 0; scan throttles = 
0 against DynamoDBMetadataStore{region=eu-west-1, tableName=s3guard-metadata, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/s3guard-metadata}
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.execute(ITestDynamoDBMetadataStoreScale.java:578)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.test_050_getVersionMarkerItem(ITestDynamoDBMetadataStoreScale.java:356)
   
   [ERROR] 
test_070_putDirMarker(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale)
  Time elapsed: 2.486 s  <<< ERROR!
   org.apache.hadoop.fs.s3a.AWSServiceThrottledException: getVersionMarkerItem 
on ../VERSION: 
com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException: 
The level of configured provisioned throughput for the table was exceeded. 
Consider increasing your provisioning level with the UpdateTable API. (Service: 
AmazonDynamoDBv2; Status Code: 400; Error Code: 
ProvisionedThroughputExceededException; Request ID: 
52JGLGQ7B8SLQD3BDQCI9U6NH3VV4KQNSO5AEMVJF66Q9ASUAAJG): The level of configured 
provisioned throughput for the table was exceeded. Consider increasing your 
provisioning level with the UpdateTable API. (Service: AmazonDynamoDBv2; Status 
Code: 400; Error Code: ProvisionedThroughputExceededException; Request ID: 
52JGLGQ7B8SLQD3BDQCI9U6NH3VV4KQNSO5AEMVJF66Q9ASUAAJG)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.createMetadataStore(ITestDynamoDBMetadataStoreScale.java:153)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.setup(ITestDynamoDBMetadataStoreScale.java:163)
   Caused by: 
com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException: 
The level of configured provisioned throughput for the table was exceeded. 
Consider increasing your provisioning level with the UpdateTable API. (Service: 
AmazonDynamoDBv2; Status Code: 400; Error Code: 
ProvisionedThroughputExceededException; Request ID: 
52JGLGQ7B8SLQD3BDQCI9U6NH3VV4KQNSO5AEMVJF66Q9ASUAAJG)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.createMetadataStore(ITestDynamoDBMetadataStoreScale.java:153)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.setup(ITestDynamoDBMetadataStoreScale.java:163)
   
   [ERROR] 
test_090_delete(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale)
  Time elapsed: 2.804 s  <<< FAILURE!
   java.lang.AssertionError: No throttling detected in Tracker with read 
throttle events = 0; write throttles = 0; batch throttles = 0; scan throttles = 
0 against DynamoDBMetadataStore{region=eu-west-1, tableName=s3guard-metadata, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/s3guard-metadata}
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.execute(ITestDynamoDBMetadataStoreScale.java:578)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.test_090_delete(ITestDynamoDBMetadataStoreScale.java:462)
   
   [ERROR] 
test_100_forgetMetadata(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale)
  Time elapsed: 2.278 s  <<< FAILURE!
   java.lang.AssertionError: No throttling detected in Tracker with read 
throttle events = 0; write throttles = 0; batch throttles = 0; scan throttles = 
0 against DynamoDBMetadataStore{region=eu-west-1, tableName=s3guard-metadata, 
tableArn=arn:aws:dynamodb:eu-west-1:980678866538:table/s3guard-metadata}
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.execute(ITestDynamoDBMetadataStoreScale.java:578)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStoreScale.test_100_forgetMetadata(ITestDynamoDBMetadataStoreScale.java:478)
   
   ```
   
   For the setup failure (here in test_070_putDirMarker): not sure. We either skip the test or retry.
   
   It's always surfacing in test_070; test_060 tests list scale. Looking at that code, I think the retry logic is too coarse: it retries the entire listing, when we may want to retry just the hasNext()/next() calls. That is: push it down. This avoids putting so much load on any retry.
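The "push it down" idea can be sketched as an iterator wrapper: only the single hasNext()/next() call that got throttled is retried, not the whole scan up to that point. As above, `ThrottledException` is a stand-in for the real throttle exception type, not the actual class.

```java
import java.util.Iterator;
import java.util.function.Supplier;

public class RetryingScanIterator<T> implements Iterator<T> {
    /** Stand-in for the real throttle exception type. */
    public static class ThrottledException extends RuntimeException {}

    private final Iterator<T> inner;
    private final int maxAttempts;

    public RetryingScanIterator(Iterator<T> inner, int maxAttempts) {
        this.inner = inner;
        this.maxAttempts = maxAttempts;
    }

    // Retry a single iterator call; a fresh attempt only replays this one
    // page fetch, not everything the listing has produced so far.
    private <R> R withRetry(Supplier<R> call) {
        ThrottledException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (ThrottledException e) {
                last = e;
                try { Thread.sleep(10L << attempt); }  // simple backoff
                catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        }
        throw last;  // retries exhausted
    }

    @Override public boolean hasNext() { return withRetry(inner::hasNext); }
    @Override public T next()          { return withRetry(inner::next); }
}
```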
   
