[ https://issues.apache.org/jira/browse/HADOOP-19347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-19347:
------------------------------------
    Summary: S3A: AWS SDK deleteObjects() and S3Store.deleteObjects() don't 
handle 500 failures of individual objects  (was: AWS SDK deleteObjects() and 
S3Store.deleteObjects() don't handle 500 failures of individual objects)

> S3A: AWS SDK deleteObjects() and S3Store.deleteObjects() don't handle 500 
> failures of individual objects
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19347
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19347
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.4.1
>            Reporter: Steve Loughran
>            Priority: Minor
>
> S3Store.deleteObjects() encountered a 500 error and didn't recover.
> We normally assume that 500 errors are already retried by the SDK, so our own 
> retry logic doesn't attempt them.
> The root cause is that the 500 errors can surface within the bulk delete:
> * The delete POST returns 200, so the SDK is happy
> * but one of the keys in the request is reported as failing with the S3Error 
> "InternalError" (see the sketch below):
> {{Code=InternalError, Message=We encountered an internal error. Please try 
> again.}}
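> A minimal sketch of where this surfaces, assuming the AWS SDK v2 
> deleteObjects() API; the RuntimeException is only a stand-in for the 
> AWSStatus500Exception mapping proposed below:
> {code:java}
> // The HTTP call succeeds, but individual keys can still fail inside the
> // response body.
> import software.amazon.awssdk.services.s3.S3Client;
> import software.amazon.awssdk.services.s3.model.DeleteObjectsRequest;
> import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
> import software.amazon.awssdk.services.s3.model.S3Error;
>
> public class BulkDeleteCheck {
>
>   static void deleteAndCheck(S3Client s3, DeleteObjectsRequest request) {
>     // Returns 200 even when some keys fail, so the SDK's own retry policy
>     // never fires for those keys.
>     DeleteObjectsResponse response = s3.deleteObjects(request);
>
>     for (S3Error error : response.errors()) {
>       if ("InternalError".equals(error.code())) {
>         // This is where S3A would want to raise an AWSStatus500Exception
>         // so that a retry policy above the call can treat it as retriable.
>         throw new RuntimeException(
>             "500 failure deleting " + error.key() + ": " + error.message());
>       }
>     }
>   }
> }
> {code}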
> Proposed:
> * The bulk delete invoker must map "InternalError" to AWSStatus500Exception 
> and throw that.
> * Add a retry policy for bulk deletes which considers AWSStatus500Exception 
> retriable. We currently don't retry on the assumption that the SDK will retry, 
> which it does for whole-request failures, but clearly not for entries within a 
> multi-object delete. (See the sketch after the stack trace below.)
> * Maybe also consider the possibility that a partial 503 response could be 
> generated, that is: only part of the delete throttled?
> {code}
> Caused by: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: 
> [S3Error(Key=table/warehouse/tablespace/external/hive/table/-tmp.-ext-10000/file/,
>  Code=InternalError, Message=We encountered an internal error. Please try 
> again.)]
>       at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:3186)
>       at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeysS3(S3AFileSystem.java:3422)
>       at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:3481)
>       at 
> org.apache.hadoop.fs.s3a.S3AFileSystem$OperationCallbacksImpl.removeKeys(S3AFileSystem.java:2558)
>       at 
> org.apache.hadoop.fs.s3a.impl.RenameOperation.lambda$removeSourceObjects$3(RenameOperation.java:625)
>       at org.apache.hadoop.fs.s3a.Invoker.lambda$once$0(Invoker.java:165)
>       at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
>   
> {code}
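> A hedged sketch of what the proposed bulk-delete retry could look like, again 
> assuming the AWS SDK v2 API; RETRY_LIMIT, RETRY_SLEEP_MS and the helper names 
> are illustrative, not S3A code:
> {code:java}
> // Illustrative retry loop: resubmit only the keys whose per-key error is
> // retriable (InternalError, or SlowDown for the partial-throttle case).
> import java.util.List;
> import java.util.stream.Collectors;
>
> import software.amazon.awssdk.services.s3.S3Client;
> import software.amazon.awssdk.services.s3.model.Delete;
> import software.amazon.awssdk.services.s3.model.DeleteObjectsRequest;
> import software.amazon.awssdk.services.s3.model.DeleteObjectsResponse;
> import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
> import software.amazon.awssdk.services.s3.model.S3Error;
>
> public class RetryingBulkDelete {
>
>   // Illustrative limits only, not S3A configuration keys.
>   private static final int RETRY_LIMIT = 3;
>   private static final long RETRY_SLEEP_MS = 500;
>
>   static void deleteWithRetries(S3Client s3, String bucket, List<String> keys)
>       throws InterruptedException {
>     List<String> pending = keys;
>     for (int attempt = 0; attempt <= RETRY_LIMIT && !pending.isEmpty(); attempt++) {
>       if (attempt > 0) {
>         Thread.sleep(RETRY_SLEEP_MS << attempt);   // crude exponential backoff
>       }
>       DeleteObjectsResponse response = s3.deleteObjects(DeleteObjectsRequest.builder()
>           .bucket(bucket)
>           .delete(Delete.builder()
>               .objects(pending.stream()
>                   .map(k -> ObjectIdentifier.builder().key(k).build())
>                   .collect(Collectors.toList()))
>               .build())
>           .build());
>
>       // Fail fast on errors we know are not retriable (e.g. AccessDenied).
>       List<S3Error> fatal = response.errors().stream()
>           .filter(e -> !isRetriable(e))
>           .collect(Collectors.toList());
>       if (!fatal.isEmpty()) {
>         throw new IllegalStateException("Non-retriable delete failures: " + fatal);
>       }
>
>       // Retry only the keys which failed with a retriable error.
>       pending = response.errors().stream()
>           .filter(RetryingBulkDelete::isRetriable)
>           .map(S3Error::key)
>           .collect(Collectors.toList());
>     }
>     if (!pending.isEmpty()) {
>       throw new IllegalStateException("Bulk delete still failing after retries: " + pending);
>     }
>   }
>
>   private static boolean isRetriable(S3Error error) {
>     // InternalError == per-key 500; SlowDown covers the "partial 503"
>     // possibility raised above.
>     return "InternalError".equals(error.code()) || "SlowDown".equals(error.code());
>   }
> }
> {code}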



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
