* Kevin Wolf (kw...@redhat.com) wrote:
> On 08.05.2015 at 10:42, Stefan Hajnoczi wrote:
> > On Tue, May 05, 2015 at 04:23:56PM +0100, Dr. David Alan Gilbert wrote:
> > > * Stefan Hajnoczi (stefa...@redhat.com) wrote:
> > > > On Fri, Apr 24, 2015 at 11:36:35AM +0200, Paolo Bonzini wrote:
> > > > >
> > > > >
> > > > > On 24/04/2015 11:38, Wen Congyang wrote:
> > > > > >> >
> > > > > >> > That can be done with drive-mirror.  But I think it's too early
> > > > > >> > for that.
> > > > > > Do you mean use drive-mirror instead of quorum?
> > > > >
> > > > > Only before starting up a new secondary.  Basically you do a migration
> > > > > with non-shared storage, and then start the secondary in colo mode.
> > > > >
> > > > > But it's only for the failover case.  Quorum (or a new block/colo.c
> > > > > driver or filter) is fine for normal colo operation.
> > > >
> > > > Perhaps this patch series should mirror the Secondary's disk to a Backup
> > > > Secondary so that the system can be protected very quickly after
> > > > failover.
> > > >
> > > > I think anyone serious about fault tolerance would deploy a Backup
> > > > Secondary, otherwise the system cannot survive two failures unless a
> > > > human administrator is lucky/fast enough to set up a new Secondary.
> > >
> > > I'd assumed that a higher level management layer would do the allocation
> > > of a new secondary after the first failover, so no human need be involved.
> >
> > That doesn't help, after the first failover is too late even if it's
> > done by a program.  There should be no window during which the VM is
> > unprotected.
> >
> > People who want fault tolerance care about 9s of availability.  The VM
> > must be protected on the new Primary as soon as the failover occurs,
> > otherwise this isn't a serious fault tolerance solution.
>
> If you're worried about two failures in a row, why wouldn't you be
> worried about three in a row? I think if you really want more than one
> backup to be ready, you shouldn't go to two, but to n.
Agreed, if you did multiple secondaries you'd do 'n'.

But 1+2 does satisfy all but the most paranoid; and in particular it does
mean that if you want to take a host down for some maintenance you can do it
without worrying.

But, as I said in my reply to Stefan, doing more than 1+1 gets really hairy;
the combinations of failovers are much more complicated.

Dave

1) It means that
1) As Stefan mentions you get worried about the lack of protection after
   the first failover;

> Kevin

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK