[jira] [Reopened] (HADOOP-18706) Improve S3ABlockOutputStream recovery

Steve Loughran (Jira) Wed, 24 May 2023 11:25:06 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-18706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Steve Loughran reopened HADOOP-18706:
-------------------------------------

Bad news, Chris: I had to revert this. 
Can you do a new PR which uses very short filenames (ideally the span ID and some 
minimal info for users), short enough that we never run out of filename length? Thanks.
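
One way to read the request above is as a length-bounded name builder. The following is only a sketch under stated assumptions: the class `ShortBlockName`, the method `prefix`, the `s3ablock-<id>-<spanId>-` layout and the sanitizing rule are all hypothetical and are not taken from the reverted patch or its replacement. It keeps the block ID and span ID in full and spends whatever room is left on the tail of the key.

```java
import java.util.Locale;

class ShortBlockName {
    /**
     * Build a name of the (hypothetical) form
     * "s3ablock-<4-digit block id>-<span id>-<tail of key>",
     * never longer than maxLength characters.
     */
    static String prefix(int blockId, String spanId, String key, int maxLength) {
        String head = String.format(Locale.ROOT, "s3ablock-%04d-%s-", blockId, spanId);
        if (head.length() >= maxLength) {
            // No room for any key text: keep the mandatory part, cut to fit.
            return head.substring(0, maxLength);
        }
        // Map anything outside a conservative ASCII set to '_', so the
        // character count equals the byte count on the local filesystem.
        String safeKey = key.replaceAll("[^A-Za-z0-9._]", "_");
        int room = maxLength - head.length();
        // Keep the END of the key: the last path element is usually the
        // most recognizable part for a user.
        String tail = safeKey.length() <= room
            ? safeKey
            : safeKey.substring(safeKey.length() - room);
        return head + tail;
    }

    public static void main(String[] args) {
        System.out.println(prefix(1, "span1", "a/b/c.txt", 64));
        System.out.println(prefix(1, "span1", "a/b/c.txt", 24));
    }
}
```

The caller would pick `maxLength` well below the local filesystem's name limit, leaving room for any random suffix and extension added when the temporary file is created.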

> Improve S3ABlockOutputStream recovery
> -------------------------------------
>
>                 Key: HADOOP-18706
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18706
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>            Reporter: Chris Bevard
>            Assignee: Chris Bevard
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>
> If an application crashes during an S3ABlockOutputStream upload, it's 
> possible to complete the upload if fast.upload.buffer is set to disk by 
> uploading the s3ablock file with putObject as the final part of the multipart 
> upload. If the application has multiple uploads running in parallel though 
> and they're on the same part number when the application fails, then there is 
> no way to determine which file belongs to which object, and recovery of 
> either upload is impossible.
> If the temporary file name for disk buffering included the s3 key, then every 
> partial upload would be recoverable.
> h3. Important disclaimer
> This change does not directly add the Syncable semantics needed by applications 
> that require {{Syncable.hsync()}} to only return after all pending data has 
> been durably written to the destination path. S3 is not a filesystem and this 
> change does not make it so.
> What it does do is assist anyone trying to implement a post-crash recovery 
> process which
> # interrogates S3 to identify pending uploads to a specific path and get a 
> list of uploaded blocks yet to be committed
> # scans the local fs.s3a.buffer.dir directories to identify in-progress-write 
> blocks for the same target destination. That is, those which were being 
> uploaded, those queued for upload, and the single "new data being written to" 
> block for an output stream
> # uploads all those pending blocks
> # generates a new POST to complete a multipart upload with all the blocks in 
> the correct order
> All this patch does is ensure the buffered block filenames include the final 
> path and block ID, to aid in identifying which blocks need to be uploaded and 
> in what order. 
> h2. warning
> causes HADOOP-18744; always include the relevant fix when backporting
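
The local scanning step of the recovery process quoted above (find the buffered blocks for each destination and put them in order) could look roughly like the sketch below. It is an illustration only: the class `BufferedBlockScan`, the method `groupByKey` and, above all, the file name layout `s3ablock-<block id>-<escaped key>-<random suffix>.tmp` are assumptions made for the example, not the format the patch produces. Querying S3 for pending uploads, uploading the parts and completing the multipart upload are left out.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BufferedBlockScan {
    // Hypothetical naming scheme, NOT the one used by the patch:
    //   s3ablock-<4-digit block id>-<escaped key>-<random suffix>.tmp
    private static final Pattern NAME =
        Pattern.compile("s3ablock-(\\d{4})-(.+)-[^-]+\\.tmp");

    /**
     * Group buffer file names by escaped destination key.
     * Within each key the files are ordered by ascending block id,
     * which is the order the parts would need in the final upload.
     */
    static Map<String, TreeMap<Integer, String>> groupByKey(List<String> fileNames) {
        Map<String, TreeMap<Integer, String>> byKey = new TreeMap<>();
        for (String name : fileNames) {
            Matcher m = NAME.matcher(name);
            if (!m.matches()) {
                continue; // not a buffered block file under the assumed scheme
            }
            int blockId = Integer.parseInt(m.group(1));
            String key = m.group(2);
            byKey.computeIfAbsent(key, k -> new TreeMap<>()).put(blockId, name);
        }
        return byKey;
    }

    public static void main(String[] args) {
        List<String> found = List.of(
            "s3ablock-0002-data_a.csv-123.tmp",
            "s3ablock-0001-data_a.csv-456.tmp",
            "s3ablock-0001-data_b.csv-789.tmp");
        // Two uploads were on block 0001 at the same time; the key in the
        // name is what lets them be told apart.
        groupByKey(found).forEach((key, blocks) ->
            System.out.println(key + " -> " + blocks.values()));
    }
}
```

Without the key in the file name, the two `0001` blocks in the example would be indistinguishable, which is the failure case the issue description starts from.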



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
