[ https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165377#comment-14165377 ]
Steve Loughran commented on HADOOP-10400:
-----------------------------------------

HADOOP-10714 is a bug that needs fixing; I've promised I'll look at the final patch this weekend just to verify that the latest changes work. We should think about splitting things up into:
# stuff that is expensive/painful/impossible to fix later if we get it wrong now (property names...)
# bugs that need to be fixed before it is usable
# low-risk changes: documentation, more tests
# features that could be added later without backwards-compatibility problems

then focus on the first three, though of course anyone is free to work on #4 too. I can't promise any time to review features; I'll try to look at the critical bugs when I can (this is a spare-time activity for me).

> Incorporate new S3A FileSystem implementation
> ---------------------------------------------
>
>                 Key: HADOOP-10400
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10400
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/s3
>    Affects Versions: 2.4.0
>            Reporter: Jordan Mendelson
>            Assignee: Jordan Mendelson
>             Fix For: 2.6.0
>
>         Attachments: HADOOP-10400-1.patch, HADOOP-10400-2.patch, HADOOP-10400-3.patch, HADOOP-10400-4.patch, HADOOP-10400-5.patch, HADOOP-10400-6.patch, HADOOP-10400-7.patch, HADOOP-10400-8-branch-2.patch, HADOOP-10400-8.patch, HADOOP-10400-branch-2.patch
>
>
> The s3native filesystem has a number of limitations (some of which were recently fixed by HADOOP-9454). This patch adds an s3a filesystem which uses the aws-sdk instead of the jets3t library. There are a number of improvements over s3native, including:
> - Parallel copy (rename) support (dramatically speeds up commits on large files)
> - AWS S3 explorer compatible empty-directory markers ("xyz/" instead of "xyz_$folder$"), which reduces littering
> - Ignores _$folder$ files created by s3native and other S3 browsing utilities
> - Supports multiple output buffer dirs to even out IO when uploading files
> - Supports IAM role-based authentication
> - Allows setting a default canned ACL for uploads (public, private, etc.)
> - Better error recovery handling
> - Should handle input seeks without having to download the whole file (used heavily for splits)
> This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to various pom files to get it to build against trunk. I've been using 0.0.1 in production with CDH 4 for several months and CDH 5 for a few days. The version here is 0.0.2, which renames some keys to hopefully bring the key-name style more in line with the rest of hadoop 2.x.
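>
> A minimal usage sketch through the standard Hadoop FileSystem API, using the fs.s3a.* keys listed below; the bucket name, path, and credential values are hypothetical placeholders:
>
> {code:java}
> import java.net.URI;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class S3ADemo {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     // Omit both keys entirely when IAM role authentication is in use.
>     conf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY");  // hypothetical placeholder
>     conf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY");  // hypothetical placeholder
>
>     // "my-bucket" is a hypothetical bucket name.
>     FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);
>     for (FileStatus st : fs.listStatus(new Path("s3a://my-bucket/data"))) {
>       System.out.println(st.getPath() + " " + st.getLen());
>     }
>   }
> }
> {code}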
> *Tunable parameters:*
> fs.s3a.access.key - Your AWS access key ID (omit for role authentication)
> fs.s3a.secret.key - Your AWS secret key (omit for role authentication)
> fs.s3a.connection.maximum - How many parallel connections HttpClient may open (default: 15)
> fs.s3a.connection.ssl.enabled - Enables or disables SSL connections to S3 (default: true)
> fs.s3a.attempts.maximum - How many times to retry commands on transient errors (default: 10)
> fs.s3a.connection.timeout - Socket connect timeout in milliseconds (default: 5000)
> fs.s3a.paging.maximum - How many keys to request from S3 at a time when listing directories (default: 5000)
> fs.s3a.multipart.size - Size in bytes of the parts an upload or copy operation is split into (default: 104857600)
> fs.s3a.multipart.threshold - Files below this size in bytes are uploaded without multipart parallelism (default: 2147483647)
> fs.s3a.acl.default - Canned ACL to set on newly created/copied objects (private | public-read | public-read-write | authenticated-read | log-delivery-write | bucket-owner-read | bucket-owner-full-control)
> fs.s3a.multipart.purge - True to purge existing multipart uploads that may not have been completed or aborted correctly (default: false)
> fs.s3a.multipart.purge.age - Minimum age in seconds of multipart uploads to purge (default: 86400)
> fs.s3a.buffer.dir - Comma-separated list of directories used to buffer file writes (default: ${hadoop.tmp.dir}/s3a)
>
> *Caveats*:
> Hadoop uses a standard output committer which uploads files as filename.COPYING before renaming them. This causes unnecessary overhead on S3, which has no rename operation, and S3 already verifies uploads against an MD5 checksum that the driver sets on the upload request. While this FileSystem should be significantly faster than the built-in s3native driver because of parallel copy support, you may want to consider setting a null output committer on your jobs to further improve performance (a no-op committer is sketched at the end of these caveats).
> Because S3 requires the file length and MD5 to be known before a file is uploaded, all output is buffered to a temporary file first, as in the s3native driver.
> Due to the lack of a native rename() in S3, renaming extremely large files or directories may take a while. Unfortunately, there is no way to notify Hadoop that progress is still being made during rename operations, so your job may time out unless you increase the task timeout.
> This driver fully ignores _$folder$ files. This was necessary so that it could interoperate with repositories that the s3native driver has been used on, but it means the driver won't recognize empty directories that s3native created.
> Statistics for the filesystem may be calculated differently from the s3native filesystem's. When uploading a file, we do not count writing the temporary file on the local filesystem towards the local filesystem's written-bytes count. When renaming files, we do not count the S3->S3 copy as read or write operations. Unlike the s3native driver, we only count bytes written when we start the upload (as opposed to on the write calls to the temporary local file). The driver also counts read & write ops, but they are done mostly to keep from timing out on large S3 operations.
> The AWS SDK unfortunately passes the multipart threshold as an int, which means fs.s3a.multipart.threshold cannot exceed 2^31-1 bytes (2147483647).
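>
> A minimal sketch of the null output committer suggested above, as a no-op subclass of the old mapred API's OutputCommitter; the class name NullOutputCommitter is hypothetical, not part of the attached patches:
>
> {code:java}
> import java.io.IOException;
>
> import org.apache.hadoop.mapred.JobContext;
> import org.apache.hadoop.mapred.OutputCommitter;
> import org.apache.hadoop.mapred.TaskAttemptContext;
>
> // No-op committer: every hook does nothing, so the filename.COPYING
> // upload-then-rename step is skipped entirely.
> public class NullOutputCommitter extends OutputCommitter {
>   @Override public void setupJob(JobContext context) throws IOException { }
>   @Override public void setupTask(TaskAttemptContext context) throws IOException { }
>   @Override public boolean needsTaskCommit(TaskAttemptContext context) throws IOException {
>     return false;  // nothing to commit, so commitTask() is never invoked
>   }
>   @Override public void commitTask(TaskAttemptContext context) throws IOException { }
>   @Override public void abortTask(TaskAttemptContext context) throws IOException { }
> }
> {code}
>
> A job would opt in with jobConf.setOutputCommitter(NullOutputCommitter.class), the trade-off being that there is no commit-by-rename step protecting against partially written output.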
> This is currently implemented as a FileSystem and not an AbstractFileSystem.
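>
> For context, AbstractFileSystem (FileContext) support could later be layered on top without changing the FileSystem code, via a small DelegateToFileSystem shim registered under fs.AbstractFileSystem.s3a.impl. A hypothetical sketch, assuming the S3AFileSystem class from the attached patches:
>
> {code:java}
> import java.io.IOException;
> import java.net.URI;
> import java.net.URISyntaxException;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.DelegateToFileSystem;
> import org.apache.hadoop.fs.s3a.S3AFileSystem;  // from the attached patches
>
> // Hypothetical shim exposing s3a through the AbstractFileSystem API.
> public class S3A extends DelegateToFileSystem {
>   public S3A(URI theUri, Configuration conf) throws IOException, URISyntaxException {
>     // "s3a" is the URI scheme; the final flag marks the authority as optional.
>     super(theUri, new S3AFileSystem(), conf, "s3a", false);
>   }
> }
> {code}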