Correct, the user instance and the compute offering have HA enabled.

Regards,
Murilo Moura

On Mon, Apr 15, 2024 at 4:00 AM <m...@swen.io> wrote:

> Hi Murilo,
>
> Just checking: the user instances you are talking about are using a
> service offering with HA enabled, correct?
>
> Regards,
> Swen
>
> -----Original Message-----
> From: Murilo Moura <a...@bigsys.com.br>
> Sent: Sunday, April 14, 2024 06:31
> To: users@cloudstack.apache.org
> Subject: Re: AW: Manual fence KVM Host
>
> Hello Guto!
>
> I carefully checked the instructions that you and Daniel left in this
> thread that I opened, but one point is not working, and I would like to
> know if you have experienced something similar.
>
> By putting the host in the "Disconnected" state, I can trigger the API
> to mark the host as degraded; so far, everything is ok. Right after
> this action, I see that the system VMs are recreated on the node that
> remained active, but the user instances (user VMs) are not recreated.
>
> Checking the NFS host where the image of this VM is located, I noticed
> that I cannot read the instance's volume file with the "qemu-img info"
> command (error: Failed to get shared "write" lock).
>
> Is there any way to release the lock, or some other parameter that
> makes KVM start a VM without locking the volume on the NFS primary
> storage? (I tried switching the NFS storage to version 4, but it had
> no effect.)
>
> Regards,
>
> Murilo Moura
>
> On Wed, Apr 10, 2024 at 2:38 PM Guto Veronezi <gutoveron...@apache.org>
> wrote:
>
> > Hello Murilo,
> >
> > Complementing Swen's answer: if your host is still up and you can
> > manage it, you could also put the host in maintenance mode in ACS.
> > This process will evacuate (migrate to another host) every VM from
> > the host, not only the ones that have HA enabled. Is this your
> > situation? If not, could you provide more details about your
> > configuration and the environment's state?
> >
> > Depending on what you have in your setup, the HA might not work as
> > expected.
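[Aside on the shared "write" lock error quoted above: a sketch of how the volume could be inspected without taking the lock, using qemu-img's --force-share (-U) option; the NFS mount path and volume file name are placeholders.]

```shell
# Inspect a qcow2 volume that another QEMU process still holds open.
# -U / --force-share opens the image without acquiring the write lock;
# only safe for read-only inspection, never for writes.
qemu-img info -U /mnt/primary/<volume-uuid>.qcow2
```

[Note that this lock usually means some QEMU process still has the image open, i.e. the VM may still be running somewhere.]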
> > For VMware and XenServer, the process is expected to happen at the
> > hypervisor level. For KVM, ACS does not support HA; what ACS supports
> > is failover (although it is named HA in ACS), and this process works
> > only when certain criteria are met. Furthermore, there are two ways
> > to implement failover for ACS + KVM: the VM's failover and the host's
> > failover. In both cases, when ACS identifies that a host crashed or a
> > VM suddenly stopped working, it will start the VM on another host.
> >
> > In ACS + KVM, the VM's failover requires at least one NFS primary
> > storage; the KVM Agent of every host writes its heartbeat there. The
> > VM's failover is triggered only if the VM's compute offering has the
> > property "Offer HA" enabled OR the global setting "force.ha" is
> > enabled. VRs have failover triggered independently of the offering or
> > the global setting. In this approach, ACS checks the VM state
> > periodically (sending commands to the KVM Agent) and triggers the
> > failover if the VM meets the previously mentioned criteria AND the
> > determined limit (defined by the global settings "ping.interval" and
> > "ping.timeout") has elapsed. Bear in mind that, if you lose your
> > host, ACS will trigger the failover; however, if you gracefully shut
> > down the KVM Agent or the host, the Agent sends a disconnect command
> > to the Management Server and ACS no longer checks the VM state for
> > that host. Therefore, if you lose your host while the service is
> > down, the failover will not be triggered. Also, if a host loses
> > access to the NFS primary storage used for the heartbeat while the VM
> > uses some other primary storage, ACS might trigger the failover too.
> > As there is no STONITH/fencing in this scenario, it is possible for
> > the VM to still be running on the host while ACS tries to start it on
> > another host.
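[Aside: the settings named above can be checked from CloudMonkey (cmk); a sketch only, the setting names are from Daniel's explanation and the commands assume an already-configured cmk profile.]

```shell
# Inspect the global settings that drive the VM failover behavior
cmk list configurations name=force.ha
cmk list configurations name=ping.interval
cmk list configurations name=ping.timeout

# Enable failover for all VMs regardless of the compute offering
# (restart of the management server may be required for some settings)
cmk update configuration name=force.ha value=true
```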
> >
> > In ACS + KVM, to work with the host's failover, it is necessary to
> > configure OOBM in ACS for each host that should trigger the failover.
> > In this approach, ACS monitors the Agent's state and triggers the
> > failover in case it cannot re-establish the connection. In this
> > scenario, ACS will shut down the host via OOBM and start the VMs on
> > another host; therefore, it does not depend on an NFS primary
> > storage. This behavior is driven by the "kvm.ha.*" global settings.
> > Furthermore, be aware that stopping the Agent might trigger the
> > failover; therefore, it is recommended to disable the failover
> > feature while doing operations on the host (such as upgrading
> > packages or other maintenance procedures).
> >
> > Best regards,
> > Daniel Salvador (gutoveronezi)
> >
> > On 10/04/2024 03:52, m...@swen.io wrote:
> > > What exactly do you mean? In which state is the host?
> > > If a host is in the state "Disconnected" or "Alert", you can
> > > declare the host as degraded via the API
> > > (https://cloudstack.apache.org/api/apidocs-4.19/apis/declareHostAsDegraded.html)
> > > or the UI (icon).
> > > CloudStack will then start all VMs with HA enabled on other hosts,
> > > if the storage is accessible.
> > >
> > > Regards,
> > > Swen
> > >
> > > -----Original Message-----
> > > From: Murilo Moura <a...@bigsys.com.br>
> > > Sent: Wednesday, April 10, 2024 02:10
> > > To: users@cloudstack.apache.org
> > > Subject: Manual fence KVM Host
> > >
> > > Hey guys!
> > >
> > > Is there any way to manually fence a KVM host and then
> > > automatically start the migration of VMs that have HA enabled?
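[For reference, the degraded-host flow Swen describes can also be driven from CloudMonkey; a sketch under the assumption that cmk splits the declareHostAsDegraded API name as shown, with the host UUID left as a placeholder.]

```shell
# Confirm the host is in the Disconnected or Alert state
cmk list hosts type=Routing state=Alert

# Declare the host as degraded so HA-enabled VMs are restarted elsewhere
cmk declare hostasdegraded id=<host-uuid>
```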