Re: Replication hosed after simple cluster restart

Stack Wed, 13 Mar 2013 18:43:59 -0700

Not sure I follow.  Is this a case of us using multi against a zk ensemble
that doesn't support it?
On Mar 13, 2013 6:22 PM, "lars hofhansl" <[email protected]> wrote:


> I suppose the problem could be in
> zkHelper.copyQueuesFromRSUsingMulti(rsZnode) as called from
> ReplicationSourceManager.NodeFailoverWorker.run().
> copyQueuesFromRSUsingMulti will return the queues it read even when the
> multi operation failed (because another RS managed to execute it first).
>
> -- Lars
>
>
>
> ________________________________
> From: lars hofhansl <[email protected]>
> To: hbase-dev <[email protected]>
> Sent: Wednesday, March 13, 2013 6:12 PM
> Subject: Replication hosed after simple cluster restart
>
> We just ran into an interesting scenario. We restarted a cluster that was
> set up as a replication source.
> The stop went cleanly.
>
> Upon restart *all* regionservers aborted within a few seconds with
> variations of these errors:
> http://pastebin.com/3iQVuBqS
>
> This is scary!
>
> -- Lars
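
A minimal sketch of the failure mode suspected above, assuming the problem is
as Lars describes it: two region servers both read a dead server's queues, only
one wins the atomic move, but both get the queues back. This is a toy in-memory
model, not the HBase code. Every name in it (QueueClaimSketch, atomicMove,
claimIgnoringFailure, claimCheckingFailure, readQueues) is hypothetical, and
the synchronized map only stands in for a ZooKeeper multi.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the suspected race; none of these names are HBase or ZooKeeper APIs. */
public class QueueClaimSketch {

    /** Stand-in for znodes: dead server name -> its replication queues. */
    static final Map<String, List<String>> znodes = new HashMap<String, List<String>>();

    /** Each claimer first reads the dead server's queues (a copy, as a client would hold). */
    static List<String> readQueues(String deadServer) {
        List<String> queues = znodes.get(deadServer);
        return queues == null
                ? Collections.<String>emptyList()
                : new ArrayList<String>(queues);
    }

    /** Stand-in for an atomic multi: succeeds only if the dead server's znode still exists. */
    static synchronized boolean atomicMove(String deadServer) {
        return znodes.remove(deadServer) != null;
    }

    /** Suspected behavior: the move result is ignored, so a loser still returns queues. */
    static List<String> claimIgnoringFailure(String deadServer, List<String> queuesRead) {
        atomicMove(deadServer); // outcome discarded
        return queuesRead;
    }

    /** One possible guard: return queues only when this caller won the move. */
    static List<String> claimCheckingFailure(String deadServer, List<String> queuesRead) {
        return atomicMove(deadServer) ? queuesRead : Collections.<String>emptyList();
    }
}
```

If two callers read before either moves, claimIgnoringFailure hands the same
queues to both, while claimCheckingFailure gives them only to the winner.
Whether this matches what actually happened on the restarted cluster is not
established in this thread.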
