[ 
https://issues.apache.org/jira/browse/HADOOP-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864651#comment-17864651
 ] 

Steve Loughran commented on HADOOP-19221:
-----------------------------------------

full sanitized stack. First failure was a 500, followup calls failed with 400 
(and one openssl connection closed error)
{code}
An error occurred .: org.apache.hadoop.fs.s3a.AWSStatus500Exception: Completing 
multipart upload on <something>.snappy.orc: 
software.amazon.awssdk.services.s3.model.S3Exception: We encountered an 
internal error. Please try again. (Service: S3, Status Code: 500, Request ID: 
ID1, Extended Request ID: X1:InternalError: We encountered an internal error. 
Please try again. (Service: S3, Status Code: 500, Request ID: ID1, Extended 
Request ID: X1
        at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:340)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
        at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
        at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.finalizeMultipartUpload(WriteOperationHelper.java:327)
        at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.commitUpload(WriteOperationHelper.java:580)
        at 
org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.innerCommit(CommitOperations.java:243)
        at 
org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.lambda$commit$3(CommitOperations.java:214)
        at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543)
        at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524)
        at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445)
        at 
org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.commit(CommitOperations.java:213)
        at 
org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.commitOrFail(CommitOperations.java:190)
        at 
org.apache.hadoop.fs.s3a.commit.impl.CommitContext.commitOrFail(CommitContext.java:254)
        at 
org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter.lambda$loadAndCommit$5(AbstractS3ACommitter.java:766)
        at 
org.apache.hadoop.util.functional.TaskPool$Builder.lambda$runParallel$0(TaskPool.java:410)
        at 
org.apache.hadoop.fs.s3a.commit.impl.CommitContext$PoolSubmitter.lambda$submit$0(CommitContext.java:464)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: We encountered 
an internal error. Please try again. (Service: S3, Status Code: 500, Request 
ID: ID1, Extended Request ID: X1
        at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
        at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
        at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
        at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
        at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
        at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38)
        at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
        at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at 
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
        at 
software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
        at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
        at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
        at 
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224)
        at 
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
        at 
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
        at 
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
        at 
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
        at 
software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
        at 
software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
        at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
        at 
software.amazon.awssdk.services.s3.DefaultS3Client.completeMultipartUpload(DefaultS3Client.java:727)
        at 
org.apache.hadoop.fs.s3a.S3AFileSystem$WriteOperationHelperCallbacksImpl.completeMultipartUpload(S3AFileSystem.java:1896)
        at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$finalizeMultipartUpload$1(WriteOperationHelper.java:334)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
        ... 21 more     Suppressed: 
software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 
failure: We encountered an internal error. Please try again. (Service: S3, 
Status Code: 500, Request ID: ID2, Extended Request ID: X2)
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: 
Request attempt 2 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID3, Extended Request ID: X3)
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: 
Request attempt 3 failure: Unable to execute HTTP request: WFOPENSSL0035 Stream 
is closed Suppressed: software.amazon.awssdk.core.exception.SdkClientException: 
Request attempt 4 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID4, Extended Request ID: X4)
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: 
Request attempt 5 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID5, Extended Request ID: X5)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: We encountered 
an internal error. Please try again. (Service: S3, Status Code: 500, Request 
ID: ID1, Extended Request ID: X1
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request 
attempt 1 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID2, Extended Request ID: X2)
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request 
attempt 2 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID3, Extended Request ID: X3)
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request 
attempt 3 failure: Unable to execute HTTP request: WFOPENSSL0035 Stream is 
closedSuppressed: software.amazon.awssdk.core.exception.SdkClientException: 
Request attempt 4 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID4, Extended Request ID: X4)
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request 
attempt 5 failure: We encountered an internal error. Please try again. 
(Service: S3, Status Code: 500, Request ID: ID5, Extended Request ID: X5)

{code}


> S3A: Unable to recover from error of multipart block upload.
> ------------------------------------------------------------
>
>                 Key: HADOOP-19221
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19221
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> If a multipart PUT request fails for some reason (e.g. networrk error) then 
> all subsequent retry attempts fail with a 400 Response and ErrorCode 
> RequestTimeout .
> {code}
> Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended 
> Request ID:
> {code}
> The list of supporessed exceptions contains the root cause (the initial 
> failure was a 500); all retries failed to upload properly from the source 
> input stream {{RequestBody.fromInputStream(fileStream, size)}}.
> Hypothesis: the mark/reset stuff doesn't work for input streams. On the v1 
> sdk we would build a multipart block upload request passing in (file, offset, 
> length), the way we are now doing this doesn't recover.
> probably fixable by providing our own {{ContentStreamProvider}} 
> implementations for
> # file + offset + length
> # bytebuffer
> # byte array
> The sdk does have explicit support for the memory ones, but they copy the 
> data blocks first. we don't want that as it would double the memory 
> requirements of active blocks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to