OK, I narrowed it down. It happens to all VMs running Windows Server 2012 R2.

Does anybody else have problems with the stability of Windows Server 2012 R2 in 
oVirt? The VM simply crashes: oVirt can no longer contact it (status “Not 
Responding”), and I have to kill the qemu-kvm process of that specific VM 
before I can start it again.
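
A minimal sketch of that workaround in shell, assuming libvirt started the 
guest with `-name <VM name>` (newer libvirt versions write 
`-name guest=<VM name>,...`) and that the VM name contains no spaces. The 
helper name `find_qemu_pid` is hypothetical; it only prints the PID, so the 
result can be checked before anything is killed:

```shell
# Hypothetical helper: print the PID of the qemu-kvm process for one VM.
# Expects "PID ARGS" lines on stdin, as printed by `ps -eo pid,args`.
find_qemu_pid() {
    vm_name="$1"
    awk -v vm="$vm_name" '
        $2 ~ /qemu-kvm$/ {
            for (i = 3; i < NF; i++) {
                if ($i == "-name") {
                    name = $(i + 1)
                    sub(/^guest=/, "", name)  # newer libvirt: -name guest=<vm>,...
                    sub(/,.*$/, "", name)     # drop any trailing options
                    if (name == vm) {
                        print $1
                        next
                    }
                }
            }
        }
    '
}
```

On the hypervisor this could be used as `ps -eo pid,args | find_qemu_pid MYVMNAME`, 
followed by `kill` on the PID it prints.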

 

I already changed the NIC from e1000 to VirtIO, but the VMs keep crashing.

 

Best regards, Christian

 

From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On Behalf Of 
Christian Hailer
Sent: Wednesday, 9 September 2015 13:24
To: users@ovirt.org
Subject: Re: [ovirt-users] Some VMs in status "not responding" in oVirt 
interface

 

Hello,

 

unfortunately I still have this problem… 

Last week I checked all the hardware components. It’s an HP DL580 Gen8 server 
with 128GB RAM and 4TB storage.

The firmware of all components is up to date.

I ran a full check of all hard drives, CPUs etc.; no problems were detected.

 

Last night 3 VMs stopped responding again, so I had to reboot the server this 
morning to regain access. A few minutes ago 2 VMs stopped responding…

 

The logs only show that the VMs aren’t responding anymore, nothing else… Does 
anybody have an idea how I can debug this issue further?

 

OS: CentOS Linux release 7.1.1503

 

>rpm -qa|grep ovirt
ovirt-iso-uploader-3.5.2-1.el7.centos.noarch
ovirt-engine-setup-3.5.4.2-1.el7.centos.noarch
ovirt-guest-tools-iso-3.5-7.noarch
ovirt-log-collector-3.5.4-2.el7.centos.noarch
ovirt-engine-userportal-3.5.4.2-1.el7.centos.noarch
ovirt-engine-cli-3.5.0.6-1.el7.centos.noarch
ovirt-engine-tools-3.5.4.2-1.el7.centos.noarch
ovirt-release35-005-1.noarch
ovirt-engine-lib-3.5.4.2-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.5.4.2-1.el7.centos.noarch
ovirt-host-deploy-java-1.3.2-1.el7.centos.noarch
ovirt-engine-extensions-api-impl-3.5.4.2-1.el7.centos.noarch
ovirt-engine-webadmin-portal-3.5.4.2-1.el7.centos.noarch
ovirt-engine-restapi-3.5.4.2-1.el7.centos.noarch
ovirt-engine-setup-base-3.5.4.2-1.el7.centos.noarch
ovirt-engine-backend-3.5.4.2-1.el7.centos.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.5.4.2-1.el7.centos.noarch
ovirt-host-deploy-1.3.2-1.el7.centos.noarch
ovirt-engine-websocket-proxy-3.5.4.2-1.el7.centos.noarch
ovirt-engine-dbscripts-3.5.4.2-1.el7.centos.noarch
ovirt-engine-jboss-as-7.1.1-1.el7.x86_64
ovirt-engine-sdk-python-3.5.4.0-1.el7.centos.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.5.4.2-1.el7.centos.noarch
ovirt-image-uploader-3.5.1-1.el7.centos.noarch
ovirt-engine-3.5.4.2-1.el7.centos.noarch

 

>rpm -qa|grep vdsm
vdsm-python-4.16.26-0.el7.centos.noarch
vdsm-jsonrpc-java-1.0.15-1.el7.noarch
vdsm-jsonrpc-4.16.26-0.el7.centos.noarch
vdsm-yajsonrpc-4.16.26-0.el7.centos.noarch
vdsm-xmlrpc-4.16.26-0.el7.centos.noarch
vdsm-cli-4.16.26-0.el7.centos.noarch
vdsm-4.16.26-0.el7.centos.x86_64
vdsm-python-zombiereaper-4.16.26-0.el7.centos.noarch

 

>rpm -qa|grep kvm
qemu-kvm-ev-2.1.2-23.el7_1.8.1.x86_64
qemu-kvm-common-ev-2.1.2-23.el7_1.8.1.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64
qemu-kvm-tools-ev-2.1.2-23.el7_1.8.1.x86_64

 

>uname -a
Linux ovirt 3.10.0-229.11.1.el7.x86_64 #1 SMP Thu Aug 6 01:06:18 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

 

Any feedback is much appreciated!!

 

Best regards, Christian

 

From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On Behalf Of 
Christian Hailer
Sent: Saturday, 29 August 2015 22:48
To: users@ovirt.org
Subject: [ovirt-users] Some VMs in status "not responding" in oVirt interface

 

Hello,

 

Last Wednesday I wanted to update my oVirt 3.5 hypervisor. It is a single 
CentOS 7 server, so I started by suspending the VMs in order to set the oVirt 
engine host to maintenance mode. While the VMs were being suspended, the 
server crashed with a kernel panic…

After restarting the server I installed the updates via yum and restarted the 
server again. Afterwards, all the VMs could be started again. Some hours later 
my monitoring system registered some unresponsive hosts. When I had a look in 
the oVirt interface, 3 of the VMs were in the state “not responding”, marked 
by a question mark.

I tried to shut down the VMs, but oVirt wasn’t able to do so. I then tried to 
reset the status in the database with the SQL statement:

 

update vm_dynamic set status = 0 where vm_guid = (select vm_guid from vm_static 
where vm_name = 'MYVMNAME');

 

but that didn’t help either. Only rebooting the whole hypervisor helped; 
afterwards everything worked again, but only for a few hours. Then one of the 
VMs entered the “not responding” state again, and once more only a reboot 
helped. Yesterday it happened again:

 

2015-08-28 17:44:22,664 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-60) [4ef90b12] VM DC 0f3d1f06-e516-48ce-aa6f-7273c33d3491 moved from Up --> NotResponding

2015-08-28 17:44:22,692 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-60) [4ef90b12] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM DC is not responding.
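
To see how often and when this happens, the transitions can be pulled out of 
the engine log. This is only a sketch, assuming each entry occupies a single 
line in the log file itself, the VM name contains no spaces, and the message 
has the "VM <name> <guid> moved from ... --> NotResponding" shape shown in the 
excerpt. The function name `list_not_responding` is hypothetical:

```shell
# Hypothetical helper: read engine log lines on stdin and print
# "date time vm_name vm_guid" for every transition into NotResponding.
list_not_responding() {
    awk '
        / moved from .* --> NotResponding/ {
            for (i = 1; i <= NF - 3; i++) {
                # The message part looks like: VM <name> <guid> moved from ...
                if ($i == "VM" && $(i + 3) == "moved") {
                    print $1, $2, $(i + 1), $(i + 2)
                    break
                }
            }
        }
    '
}
```

Assuming the engine log is in its default location, this could be run as 
`list_not_responding < /var/log/ovirt-engine/engine.log`.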

 

Does anybody know what I can do? Where should I have a look? Hints are greatly 
appreciated!

 

Thanks,

Christian

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
