[jira] [Commented] (HDFS-4420) Provide a way to exclude subtree from balancing process
[ https://issues.apache.org/jira/browse/HDFS-4420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163583#comment-14163583 ]

Ted Malaska commented on HDFS-4420:
---

What is the status of this Jira? I am interested in helping if it has stalled.

> Provide a way to exclude subtree from balancing process
> ---
>
> Key: HDFS-4420
> URL: https://issues.apache.org/jira/browse/HDFS-4420
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Reporter: Max Lapan
> Priority: Minor
> Attachments: Balancer-exclude-subtree-0.90.2.patch,
> Balancer-exclude-trunk-v2.patch, Balancer-exclude-trunk-v3.patch,
> Balancer-exclude-trunk.patch, HDFS-4420-v4.patch
>
>
> During operation, the balancer balances all blocks, regardless of their
> position in the filesystem hierarchy. Sometimes it would be useful to exclude
> a subtree from the balancing process.
> For example, regionserver data locality is crucial for HBase performance.
> A region's data is tied to its regionserver, which resides on a specific
> machine in the cluster. During operation, a regionserver reads and writes its
> regions' data, and after some time all of this data resides on the local
> machine, so all reads become local, which is great for performance. The
> balancer breaks this locality by moving blocks around.
> This patch adds a [-exclude <path>] switch. If a path is provided, the
> balancer will not move blocks under this path during operation.
> The attached patch has been tested on 0.90.2.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
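To illustrate the kind of check an exclude switch like the one described in HDFS-4420 implies, here is a minimal Java sketch. It is not taken from any of the attached patches: the class `SubtreeExclusion` and its methods are hypothetical names, and it works on plain path strings rather than HDFS types. It only shows why the comparison should respect path-component boundaries, so that excluding `/hbase` does not also exclude `/hbase-backup`.

```java
/** Hypothetical helper, not part of the HDFS-4420 patches. */
final class SubtreeExclusion {
    private SubtreeExclusion() {}

    /** Strips trailing slashes so "/hbase/" and "/hbase" compare equal. */
    static String normalize(String path) {
        String p = path;
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    /**
     * Returns true if filePath is the excluded root itself or lies
     * somewhere beneath it. The match is done on whole path components.
     */
    static boolean isExcluded(String filePath, String excludedRoot) {
        String file = normalize(filePath);
        String root = normalize(excludedRoot);
        if (root.equals("/")) {
            return true; // excluding the root excludes everything
        }
        return file.equals(root) || file.startsWith(root + "/");
    }
}
```

Under these assumptions, a balancer would skip any block whose file satisfies `SubtreeExclusion.isExcluded(file, excludedPath)` and move all other blocks as usual.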
[jira] [Created] (HDFS-6383) Upgrade S3n s3.fs.buffer.dir to support multi directories
Ted Malaska created HDFS-6383:
-

Summary: Upgrade S3n s3.fs.buffer.dir to support multi directories
Key: HDFS-6383
URL: https://issues.apache.org/jira/browse/HDFS-6383
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Ted Malaska
Priority: Minor

s3.fs.buffer.dir defines the tmp folder where files are written before being sent to S3. Right now this is limited to a single folder, which causes two major issues:

1. You need a drive with enough space to store all the tmp files at once
2. You are limited to the IO speed of a single drive

This solution resolves both and has been tested to increase the S3 write speed by 2.5x with 10 mappers on hs1.

--
This message was sent by Atlassian JIRA (v6.2#6252)
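As a rough illustration of how spreading temporary files over several directories addresses both issues listed in HDFS-6383, here is a minimal Java sketch. It is not the code in HDFS-6383.patch: the class `BufferDirSelector`, the comma-separated input format, and the free-space check are assumptions made for this example. It rotates through the configured directories so that writes are spread across drives, and skips any directory that does not report enough usable space.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin selector; not taken from HDFS-6383.patch. */
final class BufferDirSelector {
    private final List<File> dirs;
    private final AtomicInteger next = new AtomicInteger();

    /** Parses a comma-separated directory list (format assumed for this sketch). */
    BufferDirSelector(String commaSeparatedDirs) {
        List<File> parsed = new ArrayList<>();
        for (String s : commaSeparatedDirs.split(",")) {
            String trimmed = s.trim();
            if (!trimmed.isEmpty()) {
                parsed.add(new File(trimmed));
            }
        }
        if (parsed.isEmpty()) {
            throw new IllegalArgumentException("no buffer directories configured");
        }
        this.dirs = parsed;
    }

    int size() {
        return dirs.size();
    }

    /**
     * Returns the next directory in rotation that reports at least
     * requiredBytes of usable space, trying each directory at most once.
     */
    File nextDir(long requiredBytes) {
        int n = dirs.size();
        int start = Math.floorMod(next.getAndIncrement(), n);
        for (int i = 0; i < n; i++) {
            File candidate = dirs.get((start + i) % n);
            if (candidate.getUsableSpace() >= requiredBytes) {
                return candidate;
            }
        }
        throw new IllegalStateException(
            "no buffer directory has " + requiredBytes + " bytes free");
    }
}
```

With one directory per physical drive, each new temporary file lands on the next drive in turn, so neither the capacity nor the IO throughput of a single drive is the limit.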
[jira] [Updated] (HDFS-6383) Upgrade S3n s3.fs.buffer.dir to support multi directories
[ https://issues.apache.org/jira/browse/HDFS-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Malaska updated HDFS-6383:
--
Attachment: HDFS-6383.patch

This solves the problem.

Thanks to Joe Prosser, Dave Wang, Andrei Savu and Govind Kamat for testing.

> Upgrade S3n s3.fs.buffer.dir to support multi directories
> -
>
> Key: HDFS-6383
> URL: https://issues.apache.org/jira/browse/HDFS-6383
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ted Malaska
> Priority: Minor
> Attachments: HDFS-6383.patch
>
>
> s3.fs.buffer.dir defines the tmp folder where files are written before
> being sent to S3. Right now this is limited to a single folder, which
> causes two major issues:
> 1. You need a drive with enough space to store all the tmp files at once
> 2. You are limited to the IO speed of a single drive
> This solution resolves both and has been tested to increase the S3 write
> speed by 2.5x with 10 mappers on hs1.
[jira] [Updated] (HDFS-6383) Upgrade S3n s3.fs.buffer.dir to support multi directories
[ https://issues.apache.org/jira/browse/HDFS-6383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Malaska updated HDFS-6383:
--
Status: Patch Available (was: Open)

Patch is ready.

> Upgrade S3n s3.fs.buffer.dir to support multi directories
> -
>
> Key: HDFS-6383
> URL: https://issues.apache.org/jira/browse/HDFS-6383
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ted Malaska
> Priority: Minor
> Attachments: HDFS-6383.patch
>
>
> s3.fs.buffer.dir defines the tmp folder where files are written before
> being sent to S3. Right now this is limited to a single folder, which
> causes two major issues:
> 1. You need a drive with enough space to store all the tmp files at once
> 2. You are limited to the IO speed of a single drive
> This solution resolves both and has been tested to increase the S3 write
> speed by 2.5x with 10 mappers on hs1.