[
https://issues.apache.org/jira/browse/HADOOP-19734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran resolved HADOOP-19734.
-------------------------------------
Resolution: Invalid
This is actually me making a mess of checksum config.
If the SDK checksum calculation is set to "always" then the user MUST choose a
checksum algorithm for S3 uploads (proposed: CRC32).
I'm going to leave checksum calculation off by default for performance and
compatibility.
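The rule above could be sketched as a small validation step. This is only an illustration: the class, enum and method names are hypothetical and are not the S3A or AWS SDK API; the only details taken from this comment are the "always" mode, the requirement to pick an algorithm, and CRC32 as the proposed choice.

```java
public class ChecksumConfigCheck {

  /** Hypothetical modes: "always" versus only when the service requires it. */
  public enum Calculation { ALWAYS, WHEN_REQUIRED }

  /**
   * Returns the algorithm to use for uploads, or "NONE" if checksums are
   * not always calculated and no algorithm was chosen.
   * Rejects the combination described above: "always" with no algorithm.
   */
  public static String resolveAlgorithm(Calculation mode, String algorithm) {
    boolean missing = algorithm == null || algorithm.trim().isEmpty();
    if (mode == Calculation.ALWAYS && missing) {
      throw new IllegalArgumentException(
          "checksum calculation is 'always' but no algorithm is set (for example CRC32)");
    }
    return missing ? "NONE" : algorithm.trim();
  }
}
```

Failing fast at configuration time would surface the misconfiguration before any multipart upload is started, rather than at completion.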
> S3A: retry on MPU completion failure "One or more of the specified parts
> could not be found"
> --------------------------------------------------------------------------------------------
>
> Key: HADOOP-19734
> URL: https://issues.apache.org/jira/browse/HADOOP-19734
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.2
> Environment: aws s3 london
> Reporter: Steve Loughran
> Priority: Minor
>
> Experienced transient failure in test run of
> https://github.com/apache/hadoop/pull/7882 : all MPU complete posts failed
> because the request or parts were not found...the tests started succeeding
> 60-90s later *and* a "hadoop s3guard uploads" call listed the outstanding
> uploads of the failing tests.
> Hypothesis: a transient failure meant the server receiving the POST calls to
> complete the uploads was mistakenly reporting no upload IDs.
> Outcome: all active write operations failed, without any retry attempts. This
> can lose data and fail jobs, even though the store may recover.
> Proposed: multipart uploads, especially the block output stream, should retry
> on this error and treat it as a connectivity issue.
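For illustration only, the retry proposed in the quoted issue might look roughly like the sketch below. All names here are hypothetical and this is not the S3A retry policy; matching on the error message text from the issue title, and using a fixed delay between attempts, are assumptions of the sketch.

```java
import java.util.concurrent.Callable;

public class MpuCompleteRetry {

  /** Error text quoted in the issue title; matching on it is an assumption. */
  public static final String PARTS_NOT_FOUND =
      "One or more of the specified parts could not be found";

  /** Treat the "parts not found" failure as transient, like a connectivity error. */
  public static boolean isRetryable(Exception e) {
    String msg = e.getMessage();
    return msg != null && msg.contains(PARTS_NOT_FOUND);
  }

  /**
   * Invokes the completion call, retrying only on the retryable failure.
   * Any other exception, or the final failed attempt, is rethrown.
   */
  public static <T> T completeWithRetry(Callable<T> complete,
      int maxAttempts, long delayMillis) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return complete.call();
      } catch (Exception e) {
        if (!isRetryable(e) || attempt == maxAttempts) {
          throw e;
        }
        Thread.sleep(delayMillis); // fixed delay; a real policy would likely back off
      }
    }
    throw new IllegalStateException("unreachable");
  }
}
```

Given that the tests in the report recovered after 60-90s, the attempt count and delay would need to cover a window of that order to be useful.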