[jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

Steve Loughran (JIRA) Fri, 10 Mar 2017 13:41:41 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905743#comment-15905743
 ]


Steve Loughran commented on HADOOP-13786:
-----------------------------------------

# {{FileCommitActions}} is my code. Thomas: blame me there. It is a sanity 
check, by the look of things. Ryan's committer has a near-identical persistent 
data structure of pending commit information, but a different commit codepath. 
It can move over to {{FileCommitActions}} once {{PersistentCommitData}} also 
supports serializing and deserializing a list of commits to a (JSON) file, 
rather than just one pending commit per file.
# We are going to have to take the S3Client out of the committers' sight 
entirely, so that s3guard can stay in sync with what's happening. Otherwise a 
caller can use the client to complete the put, but s3guard won't know to update 
its tables. {{S3AFileSystem.WriteOperationHelper}} has everything needed or, if 
it doesn't, the rest can be added to it. I've not gone near that yet, as getting 
the tests working comes ahead of completing the integration.
# Again, threadpools: no real opinion. The bulk uploads should be assigned to 
the "slow operations pool" on the FS and, if we have a "faster ops pool", all 
the abort/commit calls should go in there. I do like that whole Threads design 
BTW: very nice, and it slides into a Java 8+ world nicely.
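
To make point 2 concrete, here is a minimal sketch of the idea, not the actual {{WriteOperationHelper}} API. It assumes an in-memory stand-in for the object store and for the s3guard table; every class and method name in it ({{FakeObjectStore}}, {{FakeMetadataStore}}, {{WriteHelperSketch}}, {{startUpload}}, {{commitUpload}}, {{abortUpload}}) is hypothetical. The committer only ever holds the helper, so every completed upload also reaches the metadata table.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the raw S3 client: pending uploads plus visible objects. */
class FakeObjectStore {
    private final Map<String, String> pending = new HashMap<>(); // uploadId -> key
    private final Set<String> visible = new HashSet<>();
    private int nextId = 0;

    String initiateUpload(String key) {
        String uploadId = "upload-" + (nextId++);
        pending.put(uploadId, key);
        return uploadId;
    }

    /** Completing an upload is what makes the object visible; returns its key. */
    String completeUpload(String uploadId) {
        String key = pending.remove(uploadId);
        if (key == null) {
            throw new IllegalStateException("No pending upload " + uploadId);
        }
        visible.add(key);
        return key;
    }

    /** Aborting discards the pending upload; nothing ever becomes visible. */
    void abortUpload(String uploadId) {
        pending.remove(uploadId);
    }

    boolean exists(String key) {
        return visible.contains(key);
    }
}

/** Hypothetical stand-in for the s3guard metadata table. */
class FakeMetadataStore {
    private final Set<String> entries = new HashSet<>();

    void put(String key) {
        entries.add(key);
    }

    boolean contains(String key) {
        return entries.contains(key);
    }
}

/** The only object a committer is handed; the store itself stays private. */
class WriteHelperSketch {
    private final FakeObjectStore store;
    private final FakeMetadataStore metadata;

    WriteHelperSketch(FakeObjectStore store, FakeMetadataStore metadata) {
        this.store = store;
        this.metadata = metadata;
    }

    String startUpload(String key) {
        return store.initiateUpload(key);
    }

    /** Completes the upload and records the new object in the metadata table. */
    void commitUpload(String uploadId) {
        String key = store.completeUpload(uploadId);
        metadata.put(key);
    }

    void abortUpload(String uploadId) {
        store.abortUpload(uploadId);
    }
}

class CommitFacadeDemo {
    public static void main(String[] args) {
        FakeObjectStore store = new FakeObjectStore();
        FakeMetadataStore metadata = new FakeMetadataStore();
        WriteHelperSketch helper = new WriteHelperSketch(store, metadata);

        String uploadId = helper.startUpload("out/part-0000");
        helper.commitUpload(uploadId);
        System.out.println("visible in store: " + store.exists("out/part-0000"));
        System.out.println("known to metadata: " + metadata.contains("out/part-0000"));
    }
}
```

If a committer called {{completeUpload}} on the store directly, the object would become visible without a matching metadata entry, which is exactly the drift described above.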

> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch, 
> HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, 
> HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, 
> HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, 
> HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch, 
> HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, 
> s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly).
> I consider us free to expose the blobstore-ness of the s3 output 
> streams (i.e. not visible until the close()), if we need to use that to allow 
> us to abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
