[ https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246835#comment-14246835 ]
Konstantin Boudnik commented on HDFS-5442:
------------------------------------------

bq. MapR's approach to DR is perhaps the best in the Hadoop world right now. MapR-FS takes snapshots and replicates those snapshots to the other site.

It's hardly the best, because snapshots are by definition not real-time, so your DR side is always behind the primary. In the case of a disastrous event you are going to lose any data that has not yet been snapshotted, as well as any data in flight.

> Zero loss HDFS data replication for multiple datacenters
> --------------------------------------------------------
>
>                 Key: HDFS-5442
>                 URL: https://issues.apache.org/jira/browse/HDFS-5442
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Avik Dey
>            Assignee: Dian Fu
>         Attachments: Disaster Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf, Disaster Recovery Solution for Hadoop.pdf
>
>
> Hadoop is architected to operate efficiently at scale for normal hardware
> failures within a datacenter. Hadoop is not designed today to handle
> datacenter failures. Although HDFS is neither designed for nor deployed in
> configurations spanning multiple datacenters, replicating data from one
> location to another is common practice for disaster recovery and global
> service availability. There are current solutions available for batch
> replication using data copy/export tools. However, while providing some
> backup capability for HDFS data, they do not provide the capability to
> recover all your HDFS data from a datacenter failure and be up and running
> again with a fully operational Hadoop cluster in another datacenter in a
> matter of minutes. For disaster recovery from a datacenter failure, we should
> provide a fully distributed, zero data loss, low latency, high throughput and
> secure HDFS data replication solution for a multiple-datacenter setup.
> Design and code for Phase-1 to follow soon.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)