[ https://issues.apache.org/jira/browse/HDFS-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920171#action_12920171 ]

Jeff Hammerbacher commented on HDFS-1432:
-----------------------------------------

bq. BTW, what is a BCP cluster?

http://en.wikipedia.org/wiki/Business_continuity_planning

> HDFS across data centers: HighTide
> ----------------------------------
>
>                 Key: HDFS-1432
>                 URL: https://issues.apache.org/jira/browse/HDFS-1432
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> There are many instances when the same piece of data resides on multiple HDFS clusters in different data centers, primarily because a single data center lacks the physical capacity to host the entire data set. In that case, the administrators typically partition the data into two (or more) HDFS clusters in two different data centers and then duplicate some subset of the data into both clusters.
>
> In such a situation, there are six physical copies of the duplicated data: three in one data center and three in the other. It would be nice if we could keep fewer than three replicas in each data center and have the ability to fix a replica in the local data center by copying from the remote copy in the remote data center.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
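The replica arithmetic in the issue description can be sketched quickly. This is a minimal illustration only, not HighTide's implementation: the function name `total_copies` and the choice of 2 replicas per data center are assumptions introduced here for the example; the issue itself only says "fewer than 3".

```python
# Sketch of the storage savings the issue describes: today a duplicated
# block has 3 replicas in each of 2 data centers (6 physical copies).
# The HighTide idea is to keep fewer local replicas and repair a lost
# replica by copying from the remote data center.

def total_copies(replicas_per_dc: int, num_dcs: int = 2) -> int:
    """Total physical copies of a duplicated block across all data centers."""
    return replicas_per_dc * num_dcs

# Status quo from the issue: 3 replicas in each of 2 data centers.
baseline = total_copies(3)   # 6 physical copies

# Hypothetical reduced setting: 2 replicas per data center, relying on
# the remote cluster as an additional repair source.
reduced = total_copies(2)    # 4 physical copies

savings = 1 - reduced / baseline
print(f"baseline={baseline}, reduced={reduced}, savings={savings:.0%}")
```

With 2 replicas per data center instead of 3, the duplicated data set needs a third less raw storage, at the cost of a cross-datacenter copy when a local replica must be repaired.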