[ 
https://issues.apache.org/jira/browse/HADOOP-19734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19734.
-------------------------------------
    Resolution: Invalid


This was actually me making a mess of the checksum config.

If the SDK checksum calculation is set to "always", then the user MUST choose a 
checksum algorithm for S3 uploads (proposed: CRC32). 

I'm going to leave checksum calculation off by default for performance and 
compatibility.
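
To make the rule above concrete, here is a minimal sketch of a fail-fast check: 
if checksum calculation is "always", an algorithm has to be named. The class, 
enum and method names are hypothetical and are not S3A or AWS SDK APIs; the 
real option names are deliberately not shown.

```java
import java.util.Locale;

/** Hypothetical validator; none of these names are S3A or AWS SDK APIs. */
final class ChecksumConfigCheck {

    /** Illustrative modes for when request checksums are calculated. */
    enum Calculation { ALWAYS, WHEN_REQUIRED }

    private ChecksumConfigCheck() {
    }

    /**
     * Normalize the configured algorithm name and enforce the rule that
     * "always" calculating checksums requires an explicit algorithm.
     *
     * @param calculation when checksums are calculated
     * @param algorithm configured algorithm name; may be null or blank
     * @return the upper-cased algorithm name, or "" if none was set
     * @throws IllegalArgumentException if calculation is ALWAYS and no
     *         algorithm was chosen
     */
    static String resolveAlgorithm(Calculation calculation, String algorithm) {
        String name = algorithm == null
            ? ""
            : algorithm.trim().toUpperCase(Locale.ROOT);
        if (calculation == Calculation.ALWAYS && name.isEmpty()) {
            throw new IllegalArgumentException(
                "Checksum calculation is 'always' but no checksum algorithm"
                + " is set; choose one, for example CRC32");
        }
        return name;
    }
}
```

Failing at configuration time, rather than when an upload is completed, is the 
design choice here: the misconfiguration surfaces as a clear error instead of 
as a confusing failure later in the write path.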

> S3A: retry on MPU completion failure "One or more of the specified parts 
> could not be found"
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-19734
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19734
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.2
>         Environment: aws s3 london
>            Reporter: Steve Loughran
>            Priority: Minor
>
> Experienced transient failure in test run of 
> https://github.com/apache/hadoop/pull/7882 : all MPU complete posts failed 
> because the request or parts were not found...the tests started succeeding 
> 60-90s later *and* a "hadoop s3guard uploads" call listed the outstanding 
> uploads of the failing tests.
> Hypothesis: a transient failure meant the server receiving the POST calls to 
> complete the uploads was mistakenly reporting no upload IDs.
> Outcome: all active write operations failed, without any retry attempts. This 
> can lose data and fail jobs, even though the store may recover.
> Proposed: multipart uploads, especially the block output stream, should retry 
> on this error and treat it as a connectivity issue. 
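
For illustration only, the retry proposed in the quoted description could be 
sketched roughly as below. The issue was resolved as Invalid, so this shows 
what was proposed rather than anything that was adopted. The class and method 
names, the attempt limit and the fixed delay are all assumptions; this is not 
the S3A retry policy, and matching on the message text is a simplification.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper; not part of S3A or the AWS SDK. */
final class MpuCompleteRetry {

    /** Error text from the issue title. */
    static final String PARTS_NOT_FOUND =
        "One or more of the specified parts could not be found";

    private MpuCompleteRetry() {
    }

    /** True if the failure carries the "parts not found" message. */
    static boolean isRetryable(Exception e) {
        String message = e.getMessage();
        return message != null && message.contains(PARTS_NOT_FOUND);
    }

    /**
     * Invoke the completion call, retrying only on the "parts not found"
     * failure, with a fixed pause between attempts.
     *
     * @param completeCall the call which completes the upload
     * @param maxAttempts total number of attempts, at least 1
     * @param delayMillis pause between attempts
     * @return the result of the first successful attempt
     * @throws Exception the last failure, or any non-retryable failure
     */
    static <T> T completeWithRetry(Callable<T> completeCall,
            int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return completeCall.call();
            } catch (Exception e) {
                // Give up on other errors, or once the attempts are used up.
                if (!isRetryable(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delayMillis);
            }
        }
    }
}
```

Since the reported failures cleared after 60-90s, any real policy would need 
delays on that scale; the values used with this sketch are placeholders.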



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
