[ https://issues.apache.org/jira/browse/HDFS-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413155#comment-13413155 ]
Harsh J commented on HDFS-3564:
-------------------------------

bq. I will re-purpose this JIRA to suggest enhancements to the existing abstraction.

Given that HDFS-3649 was just opened for backport work, can you at least re-title the JIRA to fit this re-purposed goal? Avoids confusion for some of us. Thanks! :)

> Make the replication policy pluggable to allow custom replication policies
> --------------------------------------------------------------------------
>
>                 Key: HDFS-3564
>                 URL: https://issues.apache.org/jira/browse/HDFS-3564
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Sumadhur Reddy Bolli
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> ReplicationTargetChooser currently determines the placement of replicas in
> Hadoop. Making the replication policy pluggable would help in having custom
> replication policies that suit the environment.
> Eg1: Enabling placing replicas across different datacenters (not just racks)
> Eg2: Enabling placing replicas across multiple (more than 2) racks
> Eg3: Cloud environments like Azure have logical concepts like fault and
> upgrade domains. Each fault domain spans multiple upgrade domains and each
> upgrade domain spans multiple fault domains. Machines are typically spread
> evenly across both fault and upgrade domains. Fault domain failures are
> typically catastrophic/unplanned failures and data loss possibility is high.
> An upgrade domain can be taken down by Azure for maintenance periodically.
> Each time an upgrade domain is taken down, a small percentage of machines in
> the upgrade domain (typically 1-2%) are replaced due to disk failures, thus
> losing data. Assuming the default replication factor 3, any 3 data nodes
> going down at the same time would mean potential data loss. So, it is
> important to have a policy that spreads replicas across both fault and
> upgrade domains to ensure practically no data loss.
> The problem here is two-dimensional, while the default policy in Hadoop is
> one-dimensional. Custom policies to address issues like these can be written
> if we make the policy pluggable.
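
To make the Eg3 case in the description concrete, here is a minimal sketch of what a two-dimensional policy behind a pluggable interface might look like. Everything in it is an assumption for illustration: Node, PlacementPolicy, and TwoDimensionalPolicy are hypothetical names, not Hadoop classes, and the greedy selection is just one simple way to keep replicas in distinct fault domains and distinct upgrade domains. It may return fewer targets than requested when no remaining candidate satisfies both constraints.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: none of these types are Hadoop classes.
class DomainAwarePlacement {

    /** Hypothetical description of a datanode and the two domains it sits in. */
    public static final class Node {
        public final String name;
        public final int faultDomain;
        public final int upgradeDomain;

        public Node(String name, int faultDomain, int upgradeDomain) {
            this.name = name;
            this.faultDomain = faultDomain;
            this.upgradeDomain = upgradeDomain;
        }
    }

    /** Hypothetical plug-in point; the real Hadoop abstraction may look different. */
    public interface PlacementPolicy {
        List<Node> chooseTargets(List<Node> candidates, int replication);
    }

    /** Greedily picks nodes so that no two replicas share a fault or an upgrade domain. */
    public static final class TwoDimensionalPolicy implements PlacementPolicy {
        @Override
        public List<Node> chooseTargets(List<Node> candidates, int replication) {
            List<Node> chosen = new ArrayList<>();
            Set<Integer> usedFaultDomains = new HashSet<>();
            Set<Integer> usedUpgradeDomains = new HashSet<>();
            for (Node n : candidates) {
                if (chosen.size() >= replication) {
                    break;
                }
                // Accept a node only if both of its domains are still unused.
                if (!usedFaultDomains.contains(n.faultDomain)
                        && !usedUpgradeDomains.contains(n.upgradeDomain)) {
                    chosen.add(n);
                    usedFaultDomains.add(n.faultDomain);
                    usedUpgradeDomains.add(n.upgradeDomain);
                }
            }
            return chosen; // may be shorter than 'replication' if constraints cannot be met
        }
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node("a", 0, 0), new Node("b", 0, 1),
                new Node("c", 1, 0), new Node("d", 1, 1),
                new Node("e", 2, 2));
        for (Node n : new TwoDimensionalPolicy().chooseTargets(nodes, 3)) {
            System.out.println(n.name + " fd=" + n.faultDomain + " ud=" + n.upgradeDomain);
        }
    }
}
```

With an interface like this, a rack-only policy and a fault/upgrade-domain policy could be swapped without touching the caller, which is the kind of pluggability the issue asks for.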