[ https://issues.apache.org/jira/browse/HADOOP-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17864651#comment-17864651 ]
Steve Loughran commented on HADOOP-19221:
-----------------------------------------

Full sanitized stack. The first failure was a 500; follow-up calls failed with 400 (and one OpenSSL connection-closed error).

{code}
An error occurred .: org.apache.hadoop.fs.s3a.AWSStatus500Exception: Completing multipart upload on <something>.snappy.orc: software.amazon.awssdk.services.s3.model.S3Exception: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID1, Extended Request ID: X1:InternalError: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID1, Extended Request ID: X1
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:340)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
    at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
    at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
    at org.apache.hadoop.fs.s3a.WriteOperationHelper.finalizeMultipartUpload(WriteOperationHelper.java:327)
    at org.apache.hadoop.fs.s3a.WriteOperationHelper.commitUpload(WriteOperationHelper.java:580)
    at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.innerCommit(CommitOperations.java:243)
    at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.lambda$commit$3(CommitOperations.java:214)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445)
    at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.commit(CommitOperations.java:213)
    at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.commitOrFail(CommitOperations.java:190)
    at org.apache.hadoop.fs.s3a.commit.impl.CommitContext.commitOrFail(CommitContext.java:254)
    at org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter.lambda$loadAndCommit$5(AbstractS3ACommitter.java:766)
    at org.apache.hadoop.util.functional.TaskPool$Builder.lambda$runParallel$0(TaskPool.java:410)
    at org.apache.hadoop.fs.s3a.commit.impl.CommitContext$PoolSubmitter.lambda$submit$0(CommitContext.java:464)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID1, Extended Request ID: X1
    at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
    at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
    at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
    at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
    at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
    at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
    at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
    at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
    at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
    at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
    at software.amazon.awssdk.services.s3.DefaultS3Client.completeMultipartUpload(DefaultS3Client.java:727)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$WriteOperationHelperCallbacksImpl.completeMultipartUpload(S3AFileSystem.java:1896)
    at org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$finalizeMultipartUpload$1(WriteOperationHelper.java:334)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
    ... 21 more
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID2, Extended Request ID: X2)
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID3, Extended Request ID: X3)
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 3 failure: Unable to execute HTTP request: WFOPENSSL0035 Stream is closed
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 4 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID4, Extended Request ID: X4)
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 5 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID5, Extended Request ID: X5)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID1, Extended Request ID: X1
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID2, Extended Request ID: X2)
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID3, Extended Request ID: X3)
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 3 failure: Unable to execute HTTP request: WFOPENSSL0035 Stream is closed
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 4 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID4, Extended Request ID: X4)
    Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 5 failure: We encountered an internal error. Please try again. (Service: S3, Status Code: 500, Request ID: ID5, Extended Request ID: X5)
{code}

> S3A: Unable to recover from error of multipart block upload.
> ------------------------------------------------------------
>
>                 Key: HADOOP-19221
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19221
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> If a multipart PUT request fails for some reason (e.g. network error) then
> all subsequent retry attempts fail with a 400 Response and ErrorCode
> RequestTimeout.
> {code}
> Your socket connection to the server was not read from or written to within
> the timeout period. Idle connections will be closed. (Service: Amazon S3;
> Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended
> Request ID:
> {code}
> The list of suppressed exceptions contains the root cause (the initial
> failure was a 500); all retries failed to upload properly from the source
> input stream {{RequestBody.fromInputStream(fileStream, size)}}.
> Hypothesis: the mark/reset stuff doesn't work for input streams.
> On the v1 sdk we would build a multipart block upload request passing in
> (file, offset, length); the way we are now doing this doesn't recover.
> Probably fixable by providing our own {{ContentStreamProvider}}
> implementations for
> # file + offset + length
> # bytebuffer
> # byte array
> The sdk does have explicit support for the memory ones, but they copy the
> data blocks first. We don't want that as it would double the memory
> requirements of active blocks.
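
The fix proposed in the issue description (a stream provider that can hand out a fresh stream for every retry, without copying the block) could be sketched roughly as below. This is only an illustration under stated assumptions, not the S3A patch: {{RestartableContent}}, {{FileRangeContent}}, {{ByteBufferContent}} and {{BoundedInputStream}} are hypothetical names, and {{RestartableContent}} is a local stand-in that mirrors the single {{newStream()}} method of the SDK's {{ContentStreamProvider}} so the example compiles without the AWS SDK on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for ContentStreamProvider: one fresh stream per attempt. */
interface RestartableContent {
  InputStream newStream();
}

/** Hypothetical helper: caps reads at a fixed number of bytes. */
final class BoundedInputStream extends InputStream {
  private final InputStream in;
  private long remaining;

  BoundedInputStream(InputStream in, long limit) {
    this.in = in;
    this.remaining = limit;
  }

  @Override
  public int read() throws IOException {
    if (remaining <= 0) {
      return -1;
    }
    int b = in.read();
    if (b >= 0) {
      remaining--;
    }
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    if (len == 0) {
      return 0;
    }
    if (remaining <= 0) {
      return -1;
    }
    int n = in.read(buf, off, (int) Math.min(len, remaining));
    if (n > 0) {
      remaining -= n;
    }
    return n;
  }

  @Override
  public void close() throws IOException {
    in.close();
  }
}

/** (file, offset, length): reopens the file on every call, so no mark/reset is needed. */
final class FileRangeContent implements RestartableContent {
  private final Path file;
  private final long offset;
  private final long length;

  FileRangeContent(Path file, long offset, long length) {
    this.file = file;
    this.offset = offset;
    this.length = length;
  }

  @Override
  public InputStream newStream() {
    try {
      FileChannel channel = FileChannel.open(file, StandardOpenOption.READ);
      try {
        channel.position(offset);
      } catch (IOException e) {
        channel.close();
        throw e;
      }
      // Closing the returned stream also closes the channel.
      return new BoundedInputStream(Channels.newInputStream(channel), length);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

/** ByteBuffer block: each stream reads through duplicate(), which shares the bytes. */
final class ByteBufferContent implements RestartableContent {
  private final ByteBuffer block;

  ByteBufferContent(ByteBuffer block) {
    this.block = block;
  }

  @Override
  public InputStream newStream() {
    // duplicate() copies only position/limit, not the data; the source buffer is untouched.
    final ByteBuffer view = block.duplicate();
    return new InputStream() {
      @Override
      public int read() {
        return view.hasRemaining() ? (view.get() & 0xff) : -1;
      }

      @Override
      public int read(byte[] buf, int off, int len) {
        if (len == 0) {
          return 0;
        }
        if (!view.hasRemaining()) {
          return -1;
        }
        int n = Math.min(len, view.remaining());
        view.get(buf, off, n);
        return n;
      }
    };
  }
}
```

The byte array case needs no custom class in this sketch: {{new ByteArrayInputStream(bytes, offset, length)}} wraps the array without copying it, so a provider can simply return a new one per call. Whether the SDK actually calls the provider again on each retry attempt is an assumption that would need checking against the SDK version in use.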