Re: [ovirt-users] oVirt 4.0.3 (Hosted Engine) - High Availability VM not restart after auto-fencing of host.

aleksey . maksimov Fri, 16 Sep 2016 07:36:16 -0700

"your VM would be killed uncleanly."

I don't think this is a good idea.



16.09.2016, 17:14, "Michal Skrivanek" <michal.skriva...@redhat.com>:
>>  On 16 Sep 2016, at 16:02, aleksey.maksi...@it-kb.ru wrote:
>>
>>  So, colleagues.
>>  I tested fencing again, and I now think that the power button on my host server
>> (pressed physically or through iLO) sends a KILL command to the host OS (and, as a
>> result, to the VM).
>
> thanks for confirmation, then it is indeed 
> https://bugzilla.redhat.com/show_bug.cgi?id=1341106
>
> I’m not sure if there is any good workaround. You can always
> reconfigure (disable) ACPI in the guest; then the HA logic would work OK, but it
> also means there is no graceful shutdown and your VM would be killed
> uncleanly.
>
>>  This is the journald log from my guest OS when I press the power button on the host:
>>
>>  ..
>>  Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
>>  Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 1000...
>>  Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades Shutdown...
>>  Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 main.go:67: Exiting on terminated signal.
>>  Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session closed for user user
>>  Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session closed for user root
>>  Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, returning status 0
>>  Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session closed for user root
>>  Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: Exiting on terminated signal.
>>  Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
>>  ..
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All Filesystems.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File Systems (Pre).
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel File Systems.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device Nodes in /dev.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
>>  Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors config: ffffc9000084a000, 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
>>  Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
>>  -- Reboot --
>>
>>  Perhaps this is a feature of the HP ProLiant DL 360 G5. I don't know.
>>
>>  If I test host unavailability in other ways, everything works well.
>>
>>  I have described my fencing tests, with practical examples, on my blog
>> (in Russian):
>> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/
>>
>>  Thank you all very much for your participation and support.
>>
>>  Michal, what kind of scenario are you talking about?
>>
>>  PS: Excuse me for my bad English :)
>>
>>  16.09.2016, 16:37, "Simone Tiraboschi" <stira...@redhat.com>:
>>>  On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek 
>>> <michal.skriva...@redhat.com> wrote:
>>>>>  On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
>>>>>
>>>>>  Hi Simone.
>>>>>  Exactly.
>>>>>  Now I'll check journald on the guest and try to understand how the
>>>>> guest shuts down.
>>>>
>>>>  great. thanks
>>>>
>>>>>  16.09.2016, 16:25, "Simone Tiraboschi" <stira...@redhat.com>:
>>>>>>  On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek 
>>>>>> <michal.skriva...@redhat.com> wrote:
>>>>>>>>  On 16 Sep 2016, at 15:05, Gianluca Cecchi <gianluca.cec...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>  On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek 
>>>>>>>> <michal.skriva...@redhat.com> wrote:
>>>>>>>>>  no, that’s not how HA works today. When you log into a guest and 
>>>>>>>>> issue “shutdown” we do not restart the VM under your hands. We can 
>>>>>>>>> argue how it should or may work, but this is the defined behavior 
>>>>>>>>> since the dawn of oVirt.
>>>>>>>>>
>>>>>>>>>>  AFAIK that's correct: we need to be able to shut down an HA VM
>>>>>>>>>> without it being immediately restarted on a different host. We want
>>>>>>>>>> to restart an HA VM only if the host where it is running is
>>>>>>>>>> non-responsive.
>>>>>>>>>
>>>>>>>>>  we try to restart it in all cases other than a user-initiated
>>>>>>>>> shutdown, e.g. a QEMU process crash on an otherwise healthy host
>>>>>>>>  Hi, just another question in case HA is not configured at all.
>>>>>>>
>>>>>>>  by “HA configured” I expect you’re referring to the “Highly Available” 
>>>>>>> checkbox in Edit VM dialog.
>>>>>>>
>>>>>>>>  If I run the "shutdown -h now" command on a host where some VMs are
>>>>>>>> running, what is the expected behavior?
>>>>>>>>  Clean VM shutdown (with or without timeout in case it doesn't 
>>>>>>>> complete?) or crash of their related QEMU processes?
>>>>>>>
>>>>>>>  expectation is that you won’t do that. That’s why there is the 
>>>>>>> Maintenance host state.
>>>>>>>  But if you do that regardless, with VMs running, all the processes 
>>>>>>> will be terminated in a regular system way, i.e. all QEMU processes get 
>>>>>>> SIGTERM. From the perspective of each guest this is not a clean 
>>>>>>> shutdown and it would just get killed
>>>>>>
>>>>>>  Aleksey is reporting that he started a shutdown of his host through power
>>>>>> management, and the VM processes were not killed abruptly but shut down
>>>>>> smoothly, so they were not restarted despite their HA flag. Hence
>>>>>> this thread.
>>>>
>>>>  Gianluca talks about “shutdown -h now”, you talk about a power management
>>>> action; those are two different things. The current idea is that systemd
>>>> or some other component just propagates the action to the guest, and if
>>>> that guest is configured to handle it as a shutdown, it starts the shutdown
>>>> itself, so it looks like a user-initiated one. Even though this mostly makes
>>>> sense, it is not OK for the current HA logic
>>>
>>>  Aleksey, can you please also test this scenario?
>>>>>>>  Thanks,
>>>>>>>  michal
>>>>>>>>  Thanks,
>>>>>>>>  Gianluca
>>>>>>>>  _______________________________________________
>>>>>>>>  Users mailing list
>>>>>>>>  Users@ovirt.org
>>>>>>>>  http://lists.ovirt.org/mailman/listinfo/users
>>>>>>>