[ 
https://issues.apache.org/jira/browse/HDFS-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13851219#comment-13851219
 ] 

Jerry Chen commented on HDFS-5442:
----------------------------------

{quote}It might be good to break up the work into two major features.{quote}
Logically, yes. And just as you mentioned, users will have the flexibility to 
choose between the sync and async features based on their needs. On the other 
hand, from a design perspective, the two features share common concepts and 
facilities, and serve the common requirement of cross-datacenter replication. We 
also see a need for sync replication and async replication to be used at the 
same time, complementing each other for data with different characteristics.

{quote}There seems to be assumption of replication of entire namespace at few 
places. This might not be desirable in many cases. Enabling this feature per 
directory or list of directories would be very useful.{quote}
Since namespace replication is based on namespace journaling, replicating the 
entire namespace is conceptually straightforward and simple. Namespace 
replication for a list of directories could be done by filtering, but that would 
complicate the whole thing, because the edit log entries for a single directory 
do not form a closure in namespace journaling (for example, a rename can move a 
file across the boundary of a replicated directory, so the filtered entries 
alone cannot be interpreted consistently). On the other hand, the data plays the 
critical role for cross-datacenter replication. Users can configure a list of 
directories whose data is replicated synchronously, while data in other 
directories is replicated asynchronously. We will target entire-namespace 
replication in the phase-1 work and can consider partial namespace replication 
in phase-2, once we understand its exact impact. 
{quote}There seems to be assumption of primary cluster and secondary cluster. 
Can this be chained to having something A->B and B->C. Or even the use case of 
A->B or B->A. Calling out those with configuration options would be very useful 
for cluster admins.{quote}
In the design, the primary and secondary clusters operate differently. To 
support a chain like A->B and B->C, the secondary cluster B would have to act as 
a primary cluster for C. This needs extra work specific to chaining, so I would 
suggest considering it a future improvement. When we talk about chained 
clusters, I tend to think of asynchronous replication, which would simplify 
things a little. Reversing/switching the primary and secondary cluster roles is 
supported, but that does not mean two-way replication at the same time.
{quote}Another place which would need more information is about primary cluster 
NN tracking datanode information from secondary cluster (via secondary cluster 
NN). This needs to be thought to see if this is really scalable.{quote}
We should assume that the datanode information tracked by the primary cluster is 
kept to a minimum, and that this information is updated in batches via the 
secondary cluster NN. For network communication, our goal is to send the 
secondary cluster details only when there is really a change in DN state, and to 
send them batch-wise. For example, when a DN expires from the secondary cluster, 
or a DN is completely full and cannot accept any new data, we report those DNs; 
we skip reporting DNs that are already registered and still qualify for writes. 
Let's work out the other "how to" details through patches.
{quote}How would ReplicationManager or changing replication of files work in 
general with this policy?{quote}
At a high level, we assume the original replication within each local cluster 
keeps working as it does today. The original replication factor applies to the 
local blocks only. The added part is remote block replication, which is 
triggered by the secondary cluster NameNode. 
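A small sketch of that separation, purely illustrative and with hypothetical 
names, assuming the remote copy is tracked independently of the local factor:
{code:java}
/** Hypothetical illustration: local and remote replication stay separate.
 *  The existing per-file replication factor keeps driving local block
 *  replication; the remote copy is a distinct action triggered by the
 *  secondary cluster NameNode and is not counted against the local factor. */
public class BlockReplicationSplit {

  /** Local replication target is unchanged: the file's replication factor. */
  public static int localReplicationTarget(short fileReplicationFactor) {
    return fileReplicationFactor;
  }

  /** Remote replication is decided independently; here modeled as a flag for
   *  whether the file is marked for cross-datacenter replication and whether
   *  a remote copy already exists. */
  public static boolean needsRemoteCopy(boolean markedForCrossDcReplication,
                                        int remoteCopiesPresent) {
    return markedForCrossDcReplication && remoteCopiesPresent == 0;
  }
}
{code}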

> Zero loss HDFS data replication for multiple datacenters
> --------------------------------------------------------
>
>                 Key: HDFS-5442
>                 URL: https://issues.apache.org/jira/browse/HDFS-5442
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Avik Dey
>         Attachments: Disaster Recovery Solution for Hadoop.pdf
>
>
> Hadoop is architected to operate efficiently at scale for normal hardware 
> failures within a datacenter. Hadoop is not designed today to handle 
> datacenter failures. Although HDFS is not designed for nor deployed in 
> configurations spanning multiple datacenters, replicating data from one 
> location to another is common practice for disaster recovery and global 
> service availability. There are current solutions available for batch 
> replication using data copy/export tools. However, while providing some 
> backup capability for HDFS data, they do not provide the capability to 
> recover all your HDFS data from a datacenter failure and be up and running 
> again with a fully operational Hadoop cluster in another datacenter in a 
> matter of minutes. For disaster recovery from a datacenter failure, we should 
> provide a fully distributed, zero data loss, low latency, high throughput and 
> secure HDFS data replication solution for multiple datacenter setup.
> Design and code for Phase-1 to follow soon.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
