> On 16 Sep 2016, at 16:34, aleksey.maksi...@it-kb.ru wrote:
> 
> Tested.
> 
> If I run 'shutdown -h now' on a host with a running HA VM (not the HostedEngine VM)...
> 
> the following event appears in the oVirt web console:
> 
> Sep 16, 2016 5:13:18 PM VM KOM-AD01-PBX02 is down. Exit message: User shut 
> down from within the guest

that would be another bug. It should be recognized properly as a “kill”. Can 
you please share host logs from this attempt as well?
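
As a rough illustration of the rule discussed in this thread (an HA VM is restarted in every case except a user-initiated shutdown), here is a minimal Python sketch. All names in it (ExitReason, should_restart) are hypothetical and are not vdsm or engine code; it only shows why a "kill" misreported as "user shut down from within the guest" prevents the restart.

```python
from enum import Enum


class ExitReason(Enum):
    """Hypothetical exit reasons; not the real vdsm/engine values."""
    USER_SHUTDOWN = "user shut down from within the guest"
    KILLED = "QEMU process killed or crashed"
    HOST_NON_RESPONSIVE = "host became non-responsive"


def should_restart(highly_available: bool, reason: ExitReason) -> bool:
    """Decide whether a stopped VM should be restarted on another host.

    Assumption taken from the thread: only VMs flagged Highly Available
    are restarted, and never after a user-initiated shutdown.
    """
    if not highly_available:
        return False
    return reason is not ExitReason.USER_SHUTDOWN


# The same host shutdown gives opposite outcomes depending on the reported reason.
print(should_restart(True, ExitReason.KILLED))
print(should_restart(True, ExitReason.USER_SHUTDOWN))
```

If the reason is reported as USER_SHUTDOWN when the QEMU process was actually killed, the sketch returns False, which matches the behavior seen in this test.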

> 
> The HA VM is turned off and does not start on another host.
> 
> This is the journald log from the HA VM guest OS:
> 
> ...
> Sep 16 17:06:48 KOM-AD01-PBX02 python[2637]: [100B blob data]
> Sep 16 17:06:53 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.91.157:123 (ntp.ubuntu.com).
> Sep 16 17:07:03 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.89.199:123 (ntp.ubuntu.com).
> Sep 16 17:07:13 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.89.198:123 (ntp.ubuntu.com).
> Sep 16 17:07:23 KOM-AD01-PBX02 systemd-timesyncd[1739]: Timed out waiting for 
> reply from 91.189.94.4:123 (ntp.ubuntu.com).
> Sep 16 17:08:48 KOM-AD01-PBX02 python[2637]: [90B blob data]
> Sep 16 17:08:49 KOM-AD01-PBX02 python[2637]: [155B blob data]
> Sep 16 17:08:49 KOM-AD01-PBX02 python[2637]: [100B blob data]
> Sep 16 17:10:49 KOM-AD01-PBX02 python[2637]: [90B blob data]
> Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [155B blob data]
> Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [100B blob data]
> -- Reboot --
> ...
> 
> The log shows no termination procedures before the shutdown.
> It looks like a hard power-off of the VM.

yep, that is expected. But it should be properly detected as such and the HA VM 
should restart. Somehow vdsm misidentifies the reason for the shutdown.
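
The two guest journals in this thread differ in a way that can be checked mechanically: the power-button case ends with systemd shutdown messages before "-- Reboot --", while the 'shutdown -h now' case has none. A minimal sketch of that check follows. The function name and the marker list are assumptions chosen from the log lines quoted in this thread, not part of any oVirt tool.

```python
# Markers seen in the quoted journal when the guest shut down cleanly (assumed list).
CLEAN_MARKERS = (
    "Reached target Shutdown",
    "systemd[1]: Shutting down",
    "Journal stopped",
)


def classify_guest_stop(journal_lines):
    """Return 'clean' if a systemd shutdown marker appears before the
    first '-- Reboot --' boundary, otherwise 'abrupt'."""
    for line in journal_lines:
        if line.strip() == "-- Reboot --":
            break  # only the boot that just ended is of interest
        if any(marker in line for marker in CLEAN_MARKERS):
            return "clean"
    return "abrupt"


hard_poweroff = [
    "Sep 16 17:10:50 KOM-AD01-PBX02 python[2637]: [100B blob data]",
    "-- Reboot --",
]
soft_shutdown = [
    "Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.",
    "Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped",
    "-- Reboot --",
]
print(classify_guest_stop(hard_poweroff))
print(classify_guest_stop(soft_shutdown))
```

Pass the lines of a single boot (for example the output of journalctl for the previous boot); the check stops at the first reboot boundary.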

> 
> 16.09.2016, 17:08, "Simone Tiraboschi" <stira...@redhat.com>:
>> On Fri, Sep 16, 2016 at 4:02 PM, <aleksey.maksi...@it-kb.ru> wrote:
>>> So, colleagues.
>>> I tested fencing again, and now I think that my host server's power button 
>>> (pressed physically or through iLO) sends a KILL command to the host OS (and, 
>>> as a result, to the VM).
>>> This is the journald log in my guest OS when I press the power button on the host:
>>> 
>>> ...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping ACPI event daemon...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Stopping User Manager for UID 
>>> 1000...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 systemd[1]: Starting Unattended Upgrades 
>>> Shutdown...
>>> Sep 16 16:19:27 KOM-AD01-PBX02 snapd[2583]: 2016/09/16 16:19:27.289063 
>>> main.go:67: Exiting on terminated signal.
>>> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2940]: pam_unix(sshd:session): session 
>>> closed for user user
>>> Sep 16 16:19:27 KOM-AD01-PBX02 su[3015]: pam_unix(su:session): session 
>>> closed for user root
>>> Sep 16 16:19:27 KOM-AD01-PBX02 spice-vdagentd[2638]: vdagentd quiting, 
>>> returning status 0
>>> Sep 16 16:19:27 KOM-AD01-PBX02 sudo[3014]: pam_unix(sudo:session): session 
>>> closed for user root
>>> Sep 16 16:19:27 KOM-AD01-PBX02 /usr/lib/snapd/snapd[2583]: main.go:67: 
>>> Exiting on terminated signal.
>>> Sep 16 16:19:27 KOM-AD01-PBX02 sshd[2812]: Received signal 15; terminating.
>>> ...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Unmount All 
>>> Filesystems.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped target Local File 
>>> Systems (Pre).
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopping Monitoring of LVM2 
>>> mirrors, snapshots etc. using dmeventd or progress polling...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Remount Root and Kernel 
>>> File Systems.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Create Static Device 
>>> Nodes in /dev.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Shutdown.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Reached target Final Step.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Starting Reboot...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Stopped Monitoring of LVM2 
>>> mirrors, snapshots etc. using dmeventd or progress polling.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd[1]: Shutting down.
>>> Sep 16 16:19:28 KOM-AD01-PBX02 kernel: [drm:qxl_enc_commit [qxl]] *ERROR* 
>>> head number too large or missing monitors config: ffffc9000084a000, 
>>> 0systemd-shutdown[1]: Sending SIGTERM to remaining processes...
>>> Sep 16 16:19:28 KOM-AD01-PBX02 systemd-journald[3342]: Journal stopped
>>> -- Reboot --
>>> 
>>> Perhaps this is a feature of the HP ProLiant DL360 G5. I don't know.
>>> 
>>> If I test host unavailability in other ways, everything works 
>>> well.
>>> 
>>> I described my fencing tests, with practical examples, on my blog 
>>> (in Russian).
>>> https://blog.it-kb.ru/2016/09/16/install-ovirt-4-0-part-4-about-ssh-soft-fencing-and-hard-fencing-over-hp-proliant-ilo2-power-managment-agent-and-test-of-high-availability/
>>> 
>>> Thank you all very much for your participation and support.
>>> 
>>> Michal, what kind of scenario are you talking about?
>> 
>> Basically what you just did,
>> the question is what happens when you run 'shutdown -h now' (or press the 
>> physical button if configured to trigger a soft shutdown); is it going to 
>> somehow propagate the shutdown action to the VMs, or brutally kill them?
>> 
>> In the first case the VMs will not restart regardless of their HA flags.
>> 
>>> PS: Excuse me for my bad English :)
>>> 
>>> 16.09.2016, 16:37, "Simone Tiraboschi" <stira...@redhat.com>:
>>>> On Fri, Sep 16, 2016 at 3:34 PM, Michal Skrivanek 
>>>> <michal.skriva...@redhat.com> wrote:
>>>>>> On 16 Sep 2016, at 15:31, aleksey.maksi...@it-kb.ru wrote:
>>>>>> 
>>>>>> Hi Simone.
>>>>>> Exactly.
>>>>>> Now I'll enable journald on the guest and try to understand how the 
>>>>>> guest shuts off.
>>>>> 
>>>>> great. thanks
>>>>> 
>>>>>> 16.09.2016, 16:25, "Simone Tiraboschi" <stira...@redhat.com>:
>>>>>>> On Fri, Sep 16, 2016 at 3:13 PM, Michal Skrivanek 
>>>>>>> <michal.skriva...@redhat.com> wrote:
>>>>>>>>> On 16 Sep 2016, at 15:05, Gianluca Cecchi <gianluca.cec...@gmail.com> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> On Fri, Sep 16, 2016 at 2:50 PM, Michal Skrivanek 
>>>>>>>>> <michal.skriva...@redhat.com> wrote:
>>>>>>>>>> no, that’s not how HA works today. When you log into a guest and 
>>>>>>>>>> issue “shutdown” we do not restart the VM under your hands. We can 
>>>>>>>>>> argue how it should or may work, but this is the defined behavior 
>>>>>>>>>> since the dawn of oVirt.
>>>>>>>>>> 
>>>>>>>>>>> AFAIK that's correct, we need to be able to shut down an HA VM 
>>>>>>>>>>> without it being immediately restarted on a different host. We want 
>>>>>>>>>>> to restart an HA VM only if the host where the HA VM is running is 
>>>>>>>>>>> non-responsive.
>>>>>>>>>> 
>>>>>>>>>> we try to restart it in all other cases other than user initiated 
>>>>>>>>>> shutdown, e.g. a QEMU process crash on an otherwise-healthy host
>>>>>>>>> Hi, just another question in case HA is not configured at all.
>>>>>>>> 
>>>>>>>> by “HA configured” I expect you’re referring to the “Highly Available” 
>>>>>>>> checkbox in Edit VM dialog.
>>>>>>>> 
>>>>>>>>> If I run the "shutdown -h now" command on a host where some VMs are 
>>>>>>>>> running, what is the expected behavior?
>>>>>>>>> A clean VM shutdown (with or without a timeout in case it doesn't 
>>>>>>>>> complete?) or a crash of their related QEMU processes?
>>>>>>>> 
>>>>>>>> expectation is that you won’t do that. That’s why there is the 
>>>>>>>> Maintenance host state.
>>>>>>>> But if you do that regardless, with VMs running, all the processes 
>>>>>>>> will be terminated in a regular system way, i.e. all QEMU processes 
>>>>>>>> get SIGTERM. From the perspective of each guest this is not a clean 
>>>>>>>> shutdown and it would just get killed
>>>>>>> 
>>>>>>> Aleksey is reporting that he started a shutdown on his host via power 
>>>>>>> management and the VM processes were not killed abruptly but shut down 
>>>>>>> smoothly, so they didn't restart regardless of their HA flag; hence 
>>>>>>> this thread.
>>>>> 
>>>>> Gianluca talks about “shutdown -h now”, you talk about power management 
>>>>> action, those are two different things. The current idea is that systemd 
>>>>> or some other component just propagates the action to the guest and if 
>>>>> that guest is configured to handle it as a shutdown it starts it itself 
>>>>> as well so it looks like a user-initiated one. Even though this mostly 
>>>>> makes sense it is not ok for current HA logic
>>>> 
>>>> Aleksey, can you please also test this scenario?
>>>>>>>> Thanks,
>>>>>>>> michal
>>>>>>>>> Thanks,
>>>>>>>>> Gianluca
>>>>>>>>> _______________________________________________
>>>>>>>>> Users mailing list
>>>>>>>>> Users@ovirt.org
>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users