[ 
https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706351#comment-15706351
 ] 

Steve Loughran edited comment on HADOOP-13786 at 11/29/16 7:51 PM:
-------------------------------------------------------------------

For the curious, my WiP, in the s3guard branch, is here: 
https://github.com/steveloughran/hadoop/tree/s3guard/HADOOP-13786-committer

I've been tweaking the mapreduce code to add the ability to switch to a subclass 
of FileOutputCommitter in FileOutputFormat (and, transitively, all its children) 
by way of a factory. The S3A factory dynamically chooses the committer based on 
the destination FS. There's currently no difference in the codebase apart from 
logging of operation durations. This means that we can switch the committer 
behind all output formats to the s3a one (and also let anyone else do the same 
with their own FOF committer subclass). 
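The factory idea can be sketched roughly like this. Everything below is a 
simplified stand-in, not the real Hadoop classes: {{CommitterFactory}}, the 
method name, and the scheme check are hypothetical, chosen only to show the 
shape of "pick a FileOutputCommitter subclass from the destination FS":

```java
import java.net.URI;

// Simplified stand-ins for the Hadoop committer types.
interface Committer {
    void commitJob();
}

class FileOutputCommitter implements Committer {
    public void commitJob() { /* classic rename-based commit */ }
}

class S3ACommitter extends FileOutputCommitter {
    @Override
    public void commitJob() { /* S3A-specific commit, duration logging */ }
}

class CommitterFactory {
    // Choose the committer from the destination filesystem's scheme,
    // mirroring how an S3A factory could return its own subclass while
    // every other destination keeps the stock committer.
    static FileOutputCommitter forDestination(URI dest) {
        if ("s3a".equals(dest.getScheme())) {
            return new S3ACommitter();
        }
        return new FileOutputCommitter();
    }
}

public class CommitterFactoryDemo {
    public static void main(String[] args) {
        System.out.println(CommitterFactory
            .forDestination(URI.create("s3a://bucket/out"))
            .getClass().getSimpleName());
        System.out.println(CommitterFactory
            .forDestination(URI.create("hdfs://nn/out"))
            .getClass().getSimpleName());
    }
}
```

Because the choice happens in one factory method, an output format never needs 
to know which committer it got; that is what lets the switch happen behind all 
FileOutputFormat subclasses at once.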

I can see that {{ParquetOutputFormat}} has its own committer, 
{{ParquetOutputCommitter}}, which is, I believe, also what Spark's 
{{dataframe.write.parquet()}} pipeline uses; my IDE isn't highlighting much 
else.

Now, could I get away with just modifying the base FOF committer?


For exception reporting it could use {{FileSystem.rename(final Path src, final 
Path dst, final Rename... options)}}, with S3A implementing that to raise 
exceptions. That's the "transitory" method for use between FileSystem and 
FileContext, which does raise exceptions; the default implementation is 
non-atomic and eventually calls down to {{rename(src, dest)}}, except on HDFS, 
which has a full implementation. But: what if the semantics of that rename() 
(not yet in the FS spec, AFAIK) differ from what callers expect in some of the 
corner cases, and commits everywhere break? That would not be good. It would 
also remove the ability to use any private s3a/s3guard calls if we wanted some 
(e.g. getting a lock on a directory), unless they went into the standard FS 
APIs.
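The difference between the two rename contracts is the crux here, and can be 
sketched with a toy stand-in (the class {{Fs}} below is hypothetical, not the 
Hadoop API; the real options-based method is 
{{FileSystem.rename(Path, Path, Rename...)}}):

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Toy filesystem illustrating the two rename contracts.
class Fs {
    enum Rename { NONE, OVERWRITE }

    private final java.util.Set<String> paths = new java.util.HashSet<>();

    Fs(String... existing) {
        paths.addAll(java.util.Arrays.asList(existing));
    }

    // Classic contract: returns false on failure; the caller never
    // learns *why* the rename failed.
    boolean rename(String src, String dst) {
        if (!paths.contains(src) || paths.contains(dst)) {
            return false;
        }
        paths.remove(src);
        paths.add(dst);
        return true;
    }

    // Options-based contract: failures surface as typed exceptions,
    // which is what a committer wants for error reporting.
    void rename(String src, String dst, Rename opt) throws IOException {
        if (!paths.contains(src)) {
            throw new FileNotFoundException("rename source not found: " + src);
        }
        if (paths.contains(dst) && opt != Rename.OVERWRITE) {
            throw new IOException("rename destination exists: " + dst);
        }
        paths.remove(dst);
        paths.remove(src);
        paths.add(dst);
    }
}

public class RenameContracts {
    public static void main(String[] args) {
        Fs fs = new Fs("/data/part-0000");
        // boolean contract: just "false", no reason given
        System.out.println(fs.rename("/data/missing", "/data/out"));
        // options contract: the failure carries a diagnosis
        try {
            fs.rename("/data/missing", "/data/out", Fs.Rename.NONE);
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The worry above is exactly that committers written against the boolean 
contract may behave differently in the corner cases (missing source, existing 
destination) once the exception-raising semantics are substituted underneath 
them.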

Making the output committer factory pluggable would avoid such problems, though 
logging durations might be nice all round. That said, given how fast rename is, 
it's somewhat unimportant when using real filesystems.





> add output committer which uses s3guard for consistent O(1) commits to S3
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> A goal of this code is "support O(1) commits to S3 repositories in the 
> presence of failures". Implement it, including whatever is needed to 
> demonstrate the correctness of the algorithm. (that is, assuming that s3guard 
> provides a consistent view of the presence/absence of blobs, show that we can 
> commit directly).
> I consider ourselves free to expose the blobstore-ness of the s3 output 
> streams (ie. not visible until the close()), if we need to use that to allow 
> us to abort commit operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
