Hi Lars:

I'm using HBase 0.92.1 and Hadoop 1.0.1.
Yes, you are right. I'm replicating from cluster A to cluster B only, with 
cyclic replication configured. Eventually I will test replicating from cluster 
A to cluster B and vice versa under a write-intensive workload, but if 
replication doesn't work in one direction, we need to think about other 
solutions.

No data loss in cluster A for sure. 

Best Regards,

Jerry 


On 2012-04-20, at 15:34, lars hofhansl <lhofha...@yahoo.com> wrote:

> Hi Jerry,
> 
> which version of HBase are you using?
> 
> You are not using cyclic backup, that needs >2 clusters. I assume you're just 
> replicating from one cluster to another, right?
> 
> There is never data loss in Cluster A?
> 
> -- Lars
> 
> 
> ----- Original Message -----
> From: Jerry Lam <chiling...@gmail.com>
> To: user@hbase.apache.org
> Cc: 
> Sent: Friday, April 20, 2012 5:38 AM
> Subject: HBase Cyclic Replication Issue: some data are missing in the 
> replication for intensive write
> 
> Hi HBase community:
> 
> We have been testing cyclic replication for 1 week. The basic functionality 
> seems to work as described in the documentation; however, when we started to 
> increase the write workload, replication started to miss data (i.e. some 
> data are not replicated to the other cluster). We have narrowed it down to a 
> scenario in which we can reproduce the problem quite consistently. Here it is:
> 
> -----------------------------
> Setup:
> - We have set up 2 clusters (cluster A and cluster B) of identical size in 
> terms of number of nodes and configuration: 3 regionservers sit on top of 3 
> datanodes.
> - Cyclic replication is enabled.
> 
> - We use YCSB to generate load on HBase. The workload is very similar to 
> workloada:
> 
> recordcount=200000
> operationcount=200000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> fieldcount=1
> fieldlength=25000
> 
> readallfields=true
> writeallfields=true
> 
> readproportion=0
> updateproportion=1
> scanproportion=0
> insertproportion=0
> 
> requestdistribution=uniform
> 
> - Records are inserted into Cluster A. After the benchmark is done, we wait 
> until all data are replicated to Cluster B and then use the verifyrep 
> mapreduce job for validation.
> - Data are deleted from both tables (truncate 'tablename') before a new 
> experiment is started.
> 
> Scenario:
> When we increase the number of threads until it maxes out the throughput of 
> the cluster, we see that some data are missing in Cluster B (total count != 
> 200000) although cluster A clearly has them all. This happens even though we 
> disabled region splitting in both clusters (it happens more often when region 
> splits occur). To have more control over what is happening, we then decided 
> to disable the load balancer so that the region (which is responsible for the 
> replicated data) does not relocate to another regionserver during the 
> benchmark. The situation improved a lot: we didn't see any missing data in 5 
> consecutive runs. Finally, we decided to move the region around from one 
> regionserver to another during the benchmark to see if the problem would 
> reappear, and it did.
> 
> We believe that the issue could be related to region splitting and load 
> balancing during intensive writes; the HBase replication strategy doesn't yet 
> cover those corner cases.
> 
> Can someone take a look at it and suggest some ways to work around this?
> 
> Thanks~
> 
> Jerry 
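
For anyone following the validation step in the quoted message, here is a 
minimal sketch of the kind of row-by-row comparison that step amounts to 
conceptually. It is not the verifyrep implementation and does not talk to 
HBase: the function name compare_tables and the in-memory dictionaries 
standing in for the tables on cluster A and cluster B are assumptions made 
purely for illustration.

```python
def compare_tables(source_rows, peer_rows):
    """Classify every source row as good, missing, or mismatched on the peer.

    Both arguments are hypothetical {row_key: value} mappings that stand in
    for a full scan of the same table on the source and peer clusters.
    """
    good, missing, mismatched = [], [], []
    for key, value in source_rows.items():
        if key not in peer_rows:
            # Row exists on the source but was never replicated.
            missing.append(key)
        elif peer_rows[key] != value:
            # Row was replicated, but the peer holds a different value.
            mismatched.append(key)
        else:
            good.append(key)
    return {"good": good, "missing": missing, "mismatched": mismatched}


if __name__ == "__main__":
    # Toy data only; real row counts in the experiment are 200000.
    cluster_a = {"user1": b"v1", "user2": b"v2", "user3": b"v3"}
    cluster_b = {"user1": b"v1", "user3": b"stale"}
    report = compare_tables(cluster_a, cluster_b)
    for category, keys in report.items():
        print(category, len(keys), keys)
```

A non-empty "missing" list corresponds to the symptom described above (total 
count on cluster B != 200000 while cluster A has all rows).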
