Re: [Qemu-devel] [PATCH COLO v3 01/14] docs: block replication's description

Dr. David Alan Gilbert Fri, 08 May 2015 02:57:08 -0700

* Kevin Wolf (kw...@redhat.com) wrote:
> On 08.05.2015 at 10:42, Stefan Hajnoczi wrote:
> > On Tue, May 05, 2015 at 04:23:56PM +0100, Dr. David Alan Gilbert wrote:
> > > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > > On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
> > > > > 
> > > > > 
> > > > > On 24/04/2015 11:38, Wen Congyang wrote:
> > > > > >> > 
> > > > > >> > That can be done with drive-mirror.  But I think it's too early 
> > > > > >> > for that.
> > > > > > Do you mean use drive-mirror instead of quorum?
> > > > > 
> > > > > Only before starting up a new secondary.  Basically you do a migration
> > > > > with non-shared storage, and then start the secondary in colo mode.
> > > > > 
> > > > > But it's only for the failover case.  Quorum (or a new block/colo.c
> > > > > driver or filter) is fine for normal colo operation.
> > > > 
> > > > Perhaps this patch series should mirror the Secondary's disk to a Backup
> > > > Secondary so that the system can be protected very quickly after
> > > > failover.
> > > > 
> > > > I think anyone serious about fault tolerance would deploy a Backup
> > > > Secondary, otherwise the system cannot survive two failures unless a
> > > > human administrator is lucky/fast enough to set up a new Secondary.
> > > 
> > > I'd assumed that a higher level management layer would do the allocation
> > > of a new secondary after the first failover, so no human need be involved.
> > 
> > That doesn't help, after the first failover is too late even if it's
> > done by a program.  There should be no window during which the VM is
> > unprotected.
> > 
> > People who want fault tolerance care about 9s of availability.  The VM
> > must be protected on the new Primary as soon as the failover occurs,
> > otherwise this isn't a serious fault tolerance solution.
> 
> If you're worried about two failures in a row, why wouldn't you be
> worried about three in a row? I think if you really want more than one
> backup to be ready, you shouldn't go to two, but to n.


Agreed, if you did multiple secondaries you'd do 'n'.

But 1+2 does satisfy all but the most paranoid; and in particular it does
mean that if you want to take a host down for some maintenance you can
do it without worrying.

But, as I said in my reply to Stefan, doing more than 1+1 gets really hairy;
the combinations of failovers are much more complicated.

Dave
  1) It means that 
  1) As Stefan mentions you get worried about the lack of protection after
the first failover; 
> Kevin


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
