Re: Thoughts on VM fence infrastructure

Daniel P . Berrangé Tue, 01 Oct 2019 04:11:06 -0700

On Tue, Oct 01, 2019 at 10:46:24AM +0000, Felipe Franciosi wrote:
> Hi Daniel!
> 
> 
> > On Oct 1, 2019, at 11:31 AM, Daniel P. Berrangé <berra...@redhat.com> wrote:
> > 
> > On Tue, Oct 01, 2019 at 09:56:17AM +0000, Felipe Franciosi wrote:
> 
> (Apologies for the mangled URL, nothing I can do about that.) :(
> 
> There are several points which favour adding this to Qemu:
> - Not all environments use systemd.


Sure, if you want to cope with that you can just use the HW watchdog
directly instead of via systemd. 
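
A minimal sketch of what driving the watchdog directly could look like,
assuming the standard Linux /dev/watchdog character device, where opening
the device arms the timer and any write resets the countdown. The names
run_keepalive, healthy, interval and max_pings are hypothetical and only
there for illustration; the health check itself is left to the caller.

```python
import os
import time

WATCHDOG_DEV = "/dev/watchdog"  # standard Linux watchdog device node


def run_keepalive(path=WATCHDOG_DEV, interval=5.0, healthy=lambda: True,
                  max_pings=None):
    """Ping the watchdog while healthy() holds; return the number of pings.

    healthy and max_pings are hypothetical knobs for this sketch:
    healthy() is whatever host-level check you trust, and max_pings
    only exists so the loop can be exercised without running forever.
    """
    fd = os.open(path, os.O_WRONLY)  # opening the device starts the timer
    pings = 0
    try:
        while healthy() and (max_pings is None or pings < max_pings):
            os.write(fd, b"\0")  # any write resets the countdown
            pings += 1
            time.sleep(interval)
    finally:
        # Closing without writing the magic 'V' character leaves the
        # timer running on drivers that support magic close, so an
        # unhealthy host still gets reset once the timeout expires.
        os.close(fd)
    return pings
```

Once healthy() starts returning False, the loop stops pinging and the
hardware does the rest. No systemd is involved at any point.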

> - HW watchdogs always reboot the host, which is too drastic.
> - You may not want to protect all VMs in the same way.

Same points repeated below, so I'll respond there....

> > IMHO doing this at the host OS level is going to be more reliable in
> > terms of detecting the problem in the first place, as well as more
> > reliable in taking the action - it's very difficult for a hardware CPU
> > reset to fail to work.
> 
> Absolutely, but it's a very drastic measure that:
> - May be unnecessary.

Of course, the inability to predict future consequences is what
forces us into assuming the worst case and taking actions to
mitigate that. It will definitely result in unnecessary killing
of hosts, but that is what gives you the safety guarantees you
can't otherwise achieve.

I gave the example elsewhere that even if you kill QEMU, the kernel
can have pending I/O associated with QEMU that can be sent if the
host later recovers.

> - Will fence everything even if perhaps only some VMs need protection.

I don't believe it's viable to offer real protection to only
a subset of VMs, principally because the kernel is doing I/O work
on behalf of the VM, so to protect just 1 VM you must fence the
kernel.

> What are your thoughts on this 3-level approach?
> 1) Qemu tries to log() + abort() (deadline)

Just abort()'ing isn't going to be a viable strategy with QEMU's move
towards a multi-process architecture. This introduces the problem that
the "main" QEMU process has to enumerate all the helpers it is dealing
with and kill them all off in some way. This is non-trivial especially
if some of the helpers are running under different privilege levels.

You could declare that multi-process QEMU is out of scope, but I think
QEMU self-fencing would need to offer compelling benefits over host OS
self-fencing to justify that exception. Personally I'm not seeing it.

> 2) Kernel sends SIGKILL (harddeadline)

With this it is slightly easier to deal with multiple processes, in
that it isn't restricted by the privileges of the main QEMU vs the
helpers, and it could perhaps take advantage of cgroups.
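
To illustrate the cgroups idea, here is a rough userspace sketch, assuming
the main QEMU process and all its helpers have been placed in a single
cgroup whose directory exposes the usual cgroup.procs file. The function
name kill_cgroup is hypothetical, and a kernel-side implementation would
obviously not look like this.

```python
import os
import signal
from pathlib import Path
from typing import List


def kill_cgroup(cgroup_dir: str) -> List[int]:
    """Send SIGKILL to every PID listed in <cgroup_dir>/cgroup.procs.

    Returns the PIDs that were signalled. The caller needs enough
    privilege to signal every process, whatever user it runs as.
    """
    killed = []
    for entry in (Path(cgroup_dir) / "cgroup.procs").read_text().split():
        pid = int(entry)
        try:
            os.kill(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            # The process exited between reading the list and signalling.
            pass
    return killed
```

A single pass is racy, since a helper can fork between the read and the
kill, so a real implementation would have to freeze the cgroup first or
repeat until cgroup.procs is empty.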

> 3) HW watchdog kicks in (harderdeadline)
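
For reference, the three-level escalation proposed above can be sketched
as a pure decision function. Everything here is hypothetical: the names,
the idea of measuring "seconds since the last successful heartbeat", and
the requirement that the three deadlines are strictly increasing.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    QEMU_ABORT = "qemu log() + abort()"      # level 1: deadline
    KERNEL_SIGKILL = "kernel sends SIGKILL"  # level 2: hard deadline
    HW_WATCHDOG = "hardware watchdog reset"  # level 3: harder deadline


def fence_action(silence, deadline, hard_deadline, harder_deadline):
    """Return the most drastic action whose deadline has been reached.

    silence is the time since the last successful heartbeat, in the
    same units as the three deadlines.
    """
    if not deadline < hard_deadline < harder_deadline:
        raise ValueError("deadlines must be strictly increasing")
    if silence >= harder_deadline:
        return Action.HW_WATCHDOG
    if silence >= hard_deadline:
        return Action.KERNEL_SIGKILL
    if silence >= deadline:
        return Action.QEMU_ABORT
    return Action.NONE
```

Each level only matters if the one before it failed to act in time, which
is why the ordering of the deadlines is enforced.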


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
