[ https://issues.apache.org/jira/browse/HDFS-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920171#action_12920171 ]

Jeff Hammerbacher commented on HDFS-1432:
-----------------------------------------

bq. BTW, what is a BCP cluster?

BCP stands for Business Continuity Planning: http://en.wikipedia.org/wiki/Business_continuity_planning

> HDFS across data centers: HighTide
> ----------------------------------
>
>                 Key: HDFS-1432
>                 URL: https://issues.apache.org/jira/browse/HDFS-1432
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> There are many instances where the same piece of data resides on multiple 
> HDFS clusters in different data centers, primarily because the physical 
> capacity of a single data center is insufficient to host the entire data 
> set. In that case, administrators typically partition the data across two 
> (or more) HDFS clusters in two different data centers and then duplicate 
> some subset of that data into both clusters.
> In such a situation, there are six physical copies of the duplicated data: 
> three copies in one data center and another three in the other. It would be 
> nice if we could keep fewer than 3 replicas in each data center and have the 
> ability to repair a replica in the local data center by copying the data 
> from the remote copy in the remote data center.
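The scheme described above can be sketched in a few lines. This is an illustrative model only, assuming a per-data-center replica target of 2 and a simple repair policy; none of the names below are part of HDFS or the eventual HighTide design:

```python
# Hypothetical sketch of the repair decision described in the issue:
# each data center keeps fewer than three local replicas of duplicated
# data, and a lost local replica is restored by copying from the remote
# cluster. All names and thresholds here are illustrative assumptions.

LOCAL_TARGET = 2  # assumed per-data-center replica target (< 3)

def total_copies(num_datacenters: int, per_dc_replicas: int) -> int:
    """Total physical copies of one duplicated block across all clusters."""
    return num_datacenters * per_dc_replicas

def repair_plan(local_live: int, remote_live: int) -> str:
    """Decide how to restore the local replica count for one block."""
    if local_live >= LOCAL_TARGET:
        return "no-op"
    if local_live > 0:
        return "copy-from-local"   # intra-DC re-replication is cheaper
    if remote_live > 0:
        return "copy-from-remote"  # cross-DC fetch: the case this issue targets
    return "data-loss"

# Default 3x replication in two data centers: 6 physical copies.
# With the assumed per-DC target of 2: 4 copies, a one-third storage saving.
```

For example, `total_copies(2, 3)` gives the 6 copies mentioned in the description, while `total_copies(2, 2)` gives 4, and `repair_plan(0, 2)` returns `"copy-from-remote"`, the cross-data-center repair the issue proposes.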

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
