[ 
https://issues.apache.org/jira/browse/HDFS-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3564:
------------------------------

    Target Version/s:   (was: 1.1.0)

Hi Sumadhur,

I'm unsetting the target version from 1.1.0 since that release is already under 
way. Btw, branch-1 is our sustaining branch, so we will need to make sure this 
is compatible and well tested.
                
> Make the replication policy pluggable to allow custom replication policies
> --------------------------------------------------------------------------
>
>                 Key: HDFS-3564
>                 URL: https://issues.apache.org/jira/browse/HDFS-3564
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Sumadhur Reddy Bolli
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> ReplicationTargetChooser currently determines the placement of replicas in 
> Hadoop. Making the replication policy pluggable would allow custom 
> replication policies that suit the environment. 
> Eg1: Placing replicas across different datacenters (not just racks)
> Eg2: Placing replicas across multiple (more than 2) racks
> Eg3: Cloud environments like Azure have logical concepts like fault and 
> upgrade domains. Each fault domain spans multiple upgrade domains and each 
> upgrade domain spans multiple fault domains. Machines are typically spread 
> evenly across both fault and upgrade domains. Fault domain failures are 
> typically catastrophic/unplanned failures, and the possibility of data loss 
> is high. An upgrade domain can be taken down by Azure periodically for 
> maintenance. Each time an upgrade domain is taken down, a small percentage 
> of machines in the upgrade domain (typically 1-2%) are replaced due to disk 
> failures, thus losing data. Assuming the default replication factor of 3, 
> any 3 data nodes going down at the same time would mean potential data loss. 
> So, it is important to have a policy that spreads replicas across both fault 
> and upgrade domains to ensure practically no data loss. The problem here is 
> two-dimensional and the default policy in Hadoop is one-dimensional. Custom 
> policies to address issues like these can be written if we make the policy 
> pluggable.
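
For illustration only, the two-dimensional placement problem described above 
could be sketched as follows. This is not the patch for this issue and not 
Hadoop's API: Node, PlacementPolicy, DomainAwarePolicy and PolicyLoader are 
all hypothetical names, and the greedy selection is just one assumed way of 
spreading replicas across both fault and upgrade domains.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical types for illustration; none of these are Hadoop classes.
final class Node {
    final String name;
    final int faultDomain;
    final int upgradeDomain;

    Node(String name, int faultDomain, int upgradeDomain) {
        this.name = name;
        this.faultDomain = faultDomain;
        this.upgradeDomain = upgradeDomain;
    }
}

/** Hypothetical plug-in point: picks target nodes for the replicas of one block. */
interface PlacementPolicy {
    List<Node> chooseTargets(List<Node> candidates, int replicas);
}

/** Greedy policy that tries to use a distinct fault AND upgrade domain per replica. */
final class DomainAwarePolicy implements PlacementPolicy {
    @Override
    public List<Node> chooseTargets(List<Node> candidates, int replicas) {
        List<Node> chosen = new ArrayList<>();
        Set<Integer> usedFault = new HashSet<>();
        Set<Integer> usedUpgrade = new HashSet<>();

        // Pass 1: accept only nodes that add a new fault domain and a new upgrade domain.
        for (Node n : candidates) {
            if (chosen.size() == replicas) {
                break;
            }
            if (!usedFault.contains(n.faultDomain) && !usedUpgrade.contains(n.upgradeDomain)) {
                chosen.add(n);
                usedFault.add(n.faultDomain);
                usedUpgrade.add(n.upgradeDomain);
            }
        }

        // Pass 2: if a strict spread is not possible, fill up with any remaining nodes.
        for (Node n : candidates) {
            if (chosen.size() == replicas) {
                break;
            }
            if (!chosen.contains(n)) {
                chosen.add(n);
            }
        }
        return chosen;
    }
}

/** Loads a policy by class name, the way a configuration value might select it. */
final class PolicyLoader {
    static PlacementPolicy load(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(PlacementPolicy.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load policy " + className, e);
        }
    }
}
```

A caller would obtain the policy with something like 
PolicyLoader.load(DomainAwarePolicy.class.getName()) and then call 
chooseTargets(candidates, 3). The first pass is greedy, so it does not always 
find the widest possible spread; the second pass only guarantees that the 
requested number of replicas is returned when enough nodes exist.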

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
