Himanshu, please keep digging, though. This will be mission critical for us, and we'll be testing this heavily. If you find anything strange, by all means file a jira; squashing bugs here is critical.
-- Lars

----- Original Message -----
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Cc:
Sent: Thursday, April 12, 2012 3:12 PM
Subject: Re: HBase Replication use cases

I think it's like J-D said. stop_replication is a kill switch. In 0.94+ we have start/stop_peer, which suspends replication but still keeps track of the logs to replicate.

It would complicate the code a lot (IMHO) to start replicating from partial logs, or to roll each and every log and then consider replication started only after the last log was rolled.

----- Original Message -----
From: Jesse Yates <[email protected]>
To: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Sent: Thursday, April 12, 2012 2:56 PM
Subject: Re: HBase Replication use cases

On Apr 12, 2012, at 2:50 PM, lars hofhansl <[email protected]> wrote:

> Thanks Himanshu,
>
> we're planning to use replication for cross-DC replication for DR (and we
> added a bunch of stuff and fixed bugs in replication).
>
> We'll have it always on (and only use stop/start_peer, which is new in
> 0.94+, to temporarily stop replication, rather than stop/start_replication).
>
> HBASE-2611 is a problem. We did not have time recently to work on this.
>
> i) and ii) can be worked around by forcing a log roll on all region servers
> after replication was enabled. Replication would be considered started
> after the logs were rolled... But that is quite annoying.

Should we consider adding this as part of the replication code proper? Is
there a smarter way to go about it?

- Jesse

> Is iii) still a problem in 0.92+? I thought we fixed that together with a).
>
> -- Lars
>
> ________________________________
> From: Himanshu Vashishtha <[email protected]>
> To: [email protected]
> Sent: Thursday, April 12, 2012 12:11 PM
> Subject: HBase Replication use cases
>
> Hello All,
>
> I have been doing testing on HBase replication (the 0.90.4 and 0.92
> variants).
>
> Here are some of the findings:
>
> a) 0.90+ is not that great at handling znode changes; during ongoing
> replication, if I delete a peer and a region server goes to the znode to
> update the log status, the region server aborts itself when it sees the
> missing znode.
>
> RecoverableZooKeeper seems to have fixed this in 0.92+?
>
> 0.92 has a lot of new features (start/stop handle, master-master, cyclic).
>
> But there are corner cases with the start/stop switches:
>
> i) A log is enqueued when the replication state is set to true. When we
> start the cluster, it is true and the starting region server takes the
> new log into the queue. If I do a stop_replication, there is a log roll,
> and then I do a start_replication, the current log will not be
> replicated, as it has missed the opportunity of being added to the queue.
>
> ii) If I _start_ a region server when the replication state is set to
> false, its log will not be added to the queue. Now, if I do a
> start_replication, its log will not be replicated.
>
> iii) Removing a peer doesn't result in a master region server abort, but
> if zk is down and there is a log roll, it will abort. Not a serious one,
> as zk is down so the cluster is not healthy anyway.
>
> I was looking for jiras (including HBASE-2611) and stumbled upon
> HBASE-2223. I don't think there is anything like time-based partition
> behavior (as mentioned in the jira description), though the patch has a
> lot of other nice things which are indeed in the existing code. Please
> correct me if I missed anything.
>
> Having said that, I wonder how other folks out there use it: their
> experience, and the common issues (minor and major) they come across.
> I did find a ppt by Jean-Daniel at OSCON mentioning its use in SU
> production.
>
> I plan to file jiras for the above and will start digging in.
>
> Looking forward to your responses.
>
> Thanks,
> Himanshu
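[Editor's note: the distinction Lars draws (stop_replication as a kill switch vs. start/stop_peer as a suspend) can be made concrete with a toy model. This is not HBase code; all class and method names below are illustrative, and it only sketches the queueing semantics described in the thread.]

```python
class RegionServerModel:
    """Toy model of a region server's replication log queue (illustrative)."""

    def __init__(self):
        self.replication_on = True   # cluster-wide kill switch (stop_replication)
        self.peer_suspended = False  # per-peer suspend (0.94+ start/stop_peer)
        self.queue = []              # logs tracked for replication
        self.shipped = []            # logs actually shipped to the peer

    def roll_log(self, name):
        # A new log is enqueued only if the kill switch is on. This is the
        # root of corner cases i) and ii): logs created while replication
        # is off are never tracked, even after start_replication.
        if self.replication_on:
            self.queue.append(name)

    def ship(self):
        # A suspended peer still *tracks* logs; it just stops shipping them,
        # so nothing is lost when the peer is resumed.
        if not self.peer_suspended:
            self.shipped.extend(self.queue)
            self.queue.clear()


rs = RegionServerModel()
rs.roll_log("log-1")

# stop_peer: shipping pauses, tracking continues.
rs.peer_suspended = True
rs.roll_log("log-2")
rs.ship()                    # nothing ships while suspended
rs.peer_suspended = False
rs.ship()                    # log-1 and log-2 both ship after resume

# stop_replication: kill switch. A log rolled now is never tracked.
rs.replication_on = False
rs.roll_log("log-3")         # missed: never enters the queue
rs.replication_on = True
rs.ship()                    # log-3 is silently absent
```

In this model, suspending the peer loses nothing, while toggling the kill switch around a log roll silently drops `log-3`, matching the behavior Himanshu observed.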
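[Editor's note: the workaround Lars mentions, forcing a log roll on every region server after enabling replication, can be sketched with a similar toy model. Again, this is illustrative Python, not HBase API; the point is only that a forced roll gives every server a fresh log created while the replication state is true, so each current log is guaranteed to be tracked.]

```python
class RegionServer:
    """Toy region server: its log is tracked only if created while replication is on."""

    def __init__(self, name, replication_on):
        self.name = name
        self.n = 0
        self.current_log = f"{name}-log-0"
        self.queue = []
        if replication_on:            # case ii): started while off -> untracked
            self.queue.append(self.current_log)

    def roll(self, replication_on):
        self.n += 1
        self.current_log = f"{self.name}-log-{self.n}"
        if replication_on:            # case i): rolled while off -> untracked
            self.queue.append(self.current_log)


# Servers started while replication is off: none of their logs are tracked.
replication_on = False
servers = [RegionServer(f"rs{i}", replication_on) for i in range(3)]

replication_on = True                 # start_replication
# Without a forced roll, every current log predates the switch and would be
# missed. Rolling all servers makes each new current log enter the queue;
# replication is then "considered started" from this point.
for rs in servers:
    rs.roll(replication_on)
```

This also shows why the workaround is annoying: the pre-roll logs (`*-log-0`) remain unreplicated, which is why Jesse asks whether the roll should be part of the replication code proper.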
