[ 
https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165377#comment-14165377
 ] 

Steve Loughran commented on HADOOP-10400:
-----------------------------------------

HADOOP-10714 is a bug that needs fixing; I've promised I'll look at the final 
patch this weekend just to verify that the latest changes work. We should think 
about splitting things up into:

# stuff that if we get wrong now is expensive/painful/impossible to fix 
(property names...)
# bugs that need to be fixed before it is usable
# low-risk changes: documentation, more tests.
# features that could be added later without backwards-compatibility problems

then focus on the first three, though of course anyone is free to work on #4 too. 
I can't promise any time to review features; I'll try to look at the critical 
bugs when I can (this is a spare-time activity for me).



> Incorporate new S3A FileSystem implementation
> ---------------------------------------------
>
>                 Key: HADOOP-10400
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10400
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/s3
>    Affects Versions: 2.4.0
>            Reporter: Jordan Mendelson
>            Assignee: Jordan Mendelson
>             Fix For: 2.6.0
>
>         Attachments: HADOOP-10400-1.patch, HADOOP-10400-2.patch, 
> HADOOP-10400-3.patch, HADOOP-10400-4.patch, HADOOP-10400-5.patch, 
> HADOOP-10400-6.patch, HADOOP-10400-7.patch, HADOOP-10400-8-branch-2.patch, 
> HADOOP-10400-8.patch, HADOOP-10400-branch-2.patch
>
>
> The s3native filesystem has a number of limitations (some of which were 
> recently fixed by HADOOP-9454). This patch adds an s3a filesystem which uses 
> the aws-sdk instead of the jets3t library. There are a number of improvements 
> over s3native including:
> - Parallel copy (rename) support (dramatically speeds up commits on large 
> files)
> - AWS S3 explorer compatible empty directory files "xyz/" instead of 
> "xyz_$folder$" (reduces littering)
> - Ignores _$folder$ files created by s3native and other S3 
> browsing utilities
> - Supports multiple output buffer dirs to even out IO when uploading files
> - Supports IAM role-based authentication
> - Allows setting a default canned ACL for uploads (public, private, etc.)
> - Better error recovery handling
> - Should handle input seeks without having to download the whole file (used 
> for splits a lot)
> This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to 
> various pom files to get it to build against trunk. I've been using 0.0.1 in 
> production with CDH 4 for several months and CDH 5 for a few days. The 
> version here is 0.0.2, which changes some keys to hopefully bring the 
> key name style more in line with the rest of Hadoop 2.x.
> *Tunable parameters:*
>     fs.s3a.access.key - Your AWS access key ID (omit for role authentication)
>     fs.s3a.secret.key - Your AWS secret key (omit for role authentication)
>     fs.s3a.connection.maximum - Controls how many parallel connections 
> HttpClient spawns (default: 15)
>     fs.s3a.connection.ssl.enabled - Enables or disables SSL connections to S3 
> (default: true)
>     fs.s3a.attempts.maximum - How many times we should retry commands on 
> transient errors (default: 10)
>     fs.s3a.connection.timeout - Socket connect timeout (default: 5000)
>     fs.s3a.paging.maximum - How many keys to request from S3 at a time when 
> doing directory listings (default: 5000)
>     fs.s3a.multipart.size - How big (in bytes) to split an upload or copy 
> operation up into (default: 104857600)
>     fs.s3a.multipart.threshold - Until a file is this large (in bytes), use 
> non-parallel upload (default: 2147483647)
>     fs.s3a.acl.default - Set a canned ACL on newly created/copied objects 
> (private | public-read | public-read-write | authenticated-read | 
> log-delivery-write | bucket-owner-read | bucket-owner-full-control)
>     fs.s3a.multipart.purge - True if you want to purge existing multipart 
> uploads that may not have been completed/aborted correctly (default: false)
>     fs.s3a.multipart.purge.age - Minimum age in seconds of multipart uploads 
> to purge (default: 86400)
>     fs.s3a.buffer.dir - Comma separated list of directories used to buffer 
> file writes (default: uses ${hadoop.tmp.dir}/s3a )
> *Caveats*:
> Hadoop uses a standard output committer which uploads files as 
> filename.COPYING before renaming them. This can cause unnecessary performance 
> issues with S3 because it does not have a rename operation and S3 already 
> verifies uploads against an md5 that the driver sets on the upload request. 
> While this FileSystem should be significantly faster than the built-in 
> s3native driver because of parallel copy support, you may want to consider 
> setting a null output committer on your jobs to further improve performance.
> Because S3 requires the file length and MD5 to be known before a file is 
> uploaded, all output is buffered out to a temporary file first, similar to the 
> s3native driver.
> Due to the lack of native rename() for S3, renaming extremely large files or 
> directories may take a while. Unfortunately, there is no way to notify 
> hadoop that progress is still being made for rename operations, so your job 
> may time out unless you increase the task timeout.
> This driver will fully ignore _$folder$ files. This was necessary so that it 
> could interoperate with repositories that have had the s3native driver used 
> on them, but means that it won't recognize empty directories that s3native 
> has been used on.
> Statistics for the filesystem may be calculated differently than the s3native 
> filesystem. When uploading a file, we do not count writing the temporary file 
> on the local filesystem towards the local filesystem's written bytes count. 
> When renaming files, we do not count the S3->S3 copy as read or write 
> operations. Unlike the s3native driver, we only count bytes written when we 
> start the upload (as opposed to the write calls to the temporary local file). 
> The driver also counts read & write ops, but they are done mostly to keep 
> from timing out on large s3 operations.
> The AWS SDK unfortunately passes the multipart threshold as an int which means
> fs.s3a.multipart.threshold cannot be greater than 2^31-1 (2147483647).
> This is currently implemented as a FileSystem and not an AbstractFileSystem.
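
To make the multipart settings quoted above concrete, here is a rough Java sketch of the arithmetic implied by fs.s3a.multipart.size and fs.s3a.multipart.threshold. It is not the S3A or AWS SDK code: the class and method names are hypothetical, the "strictly larger than the threshold" comparison is an assumption, and clamping an oversized threshold to 2^31-1 is just one possible way to respect the int limit mentioned in the description.

```java
// Hypothetical illustration only; not taken from the S3A patch or the AWS SDK.
class MultipartSketch {
    // Defaults as listed in the issue description.
    static final long DEFAULT_PART_SIZE = 104857600L;   // fs.s3a.multipart.size
    static final long DEFAULT_THRESHOLD = 2147483647L;  // fs.s3a.multipart.threshold

    // The description says the threshold is passed as an int, so values
    // above 2^31-1 cannot be represented. Clamping is an assumption here.
    static long clampThreshold(long requested) {
        return Math.min(requested, (long) Integer.MAX_VALUE);
    }

    // Assumption: a file switches to parallel (multipart) upload only when
    // it is strictly larger than the (clamped) threshold.
    static boolean usesMultipart(long fileSize, long threshold) {
        return fileSize > clampThreshold(threshold);
    }

    // Number of parts needed to cover fileSize with parts of partSize bytes
    // (ceiling division; the last part may be smaller than partSize).
    static long partCount(long fileSize, long partSize) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        return (fileSize + partSize - 1) / partSize;
    }

    public static void main(String[] args) {
        long fiveGiB = 5L * 1024 * 1024 * 1024;
        System.out.println("multipart? " + usesMultipart(fiveGiB, DEFAULT_THRESHOLD));
        System.out.println("parts: " + partCount(fiveGiB, DEFAULT_PART_SIZE));
    }
}
```

With the defaults, anything up to roughly 2 GB would go up as a single non-parallel upload under these assumptions, which is worth keeping in mind when deciding whether to lower the threshold.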



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
