[ https://issues.apache.org/jira/browse/HADOOP-10610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068909#comment-14068909 ]
Aaron T. Myers commented on HADOOP-10610:
-----------------------------------------

Hey Steve - sorry about missing your feedback. This JIRA had been in my backlog to review for a while and I honestly didn't notice your comments - I just opened the JIRA and scrolled to the bottom. Honest mistake.

Regardless, I think what Ted says makes sense. Seems like S3A will likely supersede S3N, but if you feel strongly we can of course add some tests for what you point out above. I'd be happy to review/commit such a JIRA.

> Upgrade S3n s3.fs.buffer.dir to support multi directories
> ---------------------------------------------------------
>
>                 Key: HADOOP-10610
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10610
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.4.0
>            Reporter: Ted Malaska
>            Assignee: Ted Malaska
>            Priority: Minor
>             Fix For: 2.6.0
>
>         Attachments: HADOOP-10610.patch, HADOOP_10610-2.patch, HDFS-6383.patch
>
>
> s3.fs.buffer.dir defines the tmp folder where files will be written to before
> getting sent to S3. Right now this is limited to a single folder, which
> causes two major issues.
> 1. You need a drive with enough space to store all the tmp files at once
> 2. You are limited to the IO speeds of a single drive
> This solution will resolve both and has been tested to increase the S3 write
> speed by 2.5x with 10 mappers on hs1.
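
For illustration only, here is a minimal Python sketch of the idea in the description: spreading temp files across several buffer directories so that no single drive has to hold all of them or absorb all of the IO. It is not the code from the attached patches. The class name `BufferDirAllocator`, the comma-separated format of the setting, and the round-robin policy are all assumptions made for this example.

```python
import itertools
import os
import shutil
import tempfile


class BufferDirAllocator:
    """Hypothetical helper that spreads temp files across several directories."""

    def __init__(self, dirs_csv):
        # Assumption: the setting holds a comma-separated list of directories.
        self.dirs = [d.strip() for d in dirs_csv.split(",") if d.strip()]
        if not self.dirs:
            raise ValueError("no buffer directories configured")
        # Round-robin cursor over the configured directories.
        self._next = itertools.cycle(range(len(self.dirs)))

    def create_tmp_file(self, size_hint=0):
        """Create an empty temp file in the next directory that has
        at least size_hint bytes free, and return its path."""
        for _ in range(len(self.dirs)):
            d = self.dirs[next(self._next)]
            os.makedirs(d, exist_ok=True)
            if shutil.disk_usage(d).free >= size_hint:
                fd, path = tempfile.mkstemp(prefix="output-", suffix=".tmp", dir=d)
                os.close(fd)
                return path
        raise OSError("no buffer directory has %d bytes free" % size_hint)
```

With two directories configured, successive calls to `create_tmp_file()` alternate between them, which addresses both points in the list above: space is pooled across drives, and writes are spread over more than one disk.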