Just thought I'd share my experiences of updating a KVM host and guests this morning. I'll acknowledge up front that I didn't do things in the right order so the mistakes were mine.

Start: RHEL6.1 KVM host, x2 RHEL6.1 guests using .img files (LVM partitions inside). Fully up to date as of just before the RHEL6.2 errata release.

I did "yum clean all ; yum update" on both the host and the guests at the same time (yeah, I know). In my defence, a seemingly identical setup I did this on yesterday worked without issues.

As the host was completing the cleanup phase of the update, this appeared in /var/log/messages:

Dec  8 07:14:47 frazil libvirtd: 07:14:47.926: 14778: warning : qemudDispatchSignalEvent:403 : Shutting down on signal 15
Dec  8 07:14:49 frazil yum[1235]: Updated: libvirt-0.9.4-23.el6_2.1.x86_64

and further down

 Dec  8 07:15:00 frazil kernel: br1: port 2(vnet1) entering disabled state
 Dec  8 07:15:00 frazil kernel: device vnet1 left promiscuous mode
 Dec  8 07:15:00 frazil kernel: br1: port 2(vnet1) entering disabled state
 Dec  8 07:15:02 frazil ntpd[2194]: Deleting interface #23 vnet1, fe80::fc54:ff:fe01:6b3b#123, interface stats: received=0, sent=0, dropped=0, active_time=7241352 secs
 Dec  8 07:15:05 frazil kernel: br0: port 2(vnet0) entering disabled state
 Dec  8 07:15:05 frazil kernel: device vnet0 left promiscuous mode
 Dec  8 07:15:05 frazil kernel: br0: port 2(vnet0) entering disabled state
 Dec  8 07:15:07 frazil ntpd[2194]: Deleting interface #25 vnet0, fe80::fc54:ff:fe49:fae6#123, interface stats: received=0, sent=0, dropped=0, active_time=7238050 secs

At this point I lost my connection to the guests. According to the SSH sessions I had open to them, both had apparently finished the cleanup phase of the yum update (judging by the X/Y counter on the right-hand side) but hadn't yet returned a prompt, so they were evidently still busy.

My guess is that the restart of the libvirtd service dropped the guests, except that the same lines appear in the messages file of the other server, where the guests didn't get killed.

Given I was rebooting the host anyway I didn't bother to bring the guests back up again and rebooted the host (yeah, I know). On reboot neither of the guests autostarted, so I logged in to the host and tried to start them with "virsh start <domain>". Both complained that

 error: internal error unable to reserve PCI address 0:0:2.0

and didn't start.  Checking the .xml files for both guests I noted that

 <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>

was listed for the 'disk' device. I also noticed that the following lines were missing

 <input type='mouse' bus='ps2'/>
 <graphics type='vnc' port='5901' autoport='no'/>
 <video>
   <model type='cirrus' vram='9216' heads='1'/>
   <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
 </video>

whereas they were present on the other KVM host, whose guests had updated successfully. I added the lines, changed the 'disk' PCI address to something else and, after restarting libvirtd, tried booting the guests again. Still no joy: same error. In the end I commented out the "address type='pci'" line for 'video' and tried again. This time the guests started, but the newly installed kernel failed to boot at the point where it tried to mount the root LVM volume. The message recommended that I look at the "root=" part of the boot line, but gave no suggestion as to what to put there.
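
Since the error names address 0:0:2.0, and slot 0x02 shows up for both 'disk' and 'video' in the snippets above, a quick check for addresses claimed more than once in a domain file may help. This is a minimal sketch, assuming libvirt writes each address element with single-quoted attributes in a fixed order (as in the lines quoted above). The function name dup_pci_addresses is made up for illustration.

```shell
# Print every PCI <address .../> element that occurs more than once
# in a libvirt domain XML file. Prints nothing if there are no clashes.
# $1 = path to the domain XML, e.g. /etc/libvirt/qemu/<domain>.xml
dup_pci_addresses() {
    grep -o "<address type='pci'[^>]*>" "$1" | sort | uniq -d
}
```

Run as "dup_pci_addresses /etc/libvirt/qemu/<domain>.xml". Any line it prints is an address used by at least two devices. It only compares the text of the elements, so two entries that differ in attribute order or quoting would not be caught.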

At this point I tried mounting the guests' disk images to see whether the kernel update had failed part-way and left grub.conf in a mess:

 # losetup /dev/loop0 foo.img
 # kpartx -av /dev/loop0
 # mount /dev/mapper/loop0p1 /mnt
 ...
 # umount /mnt
 # kpartx -dv /dev/loop0
 # losetup -d /dev/loop0

Once inside the image I looked at the grub.conf files and couldn't see any issues. I unmounted the image and tried booting into an older kernel, and the guests booted successfully. "yum update" indicated an incomplete transaction, so I ran "yum-complete-transaction" and then "yum update kernel", and rebooted both guests successfully into the new kernel. All now seems well. Phew.
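
One further check that could be made while the image is mounted is whether every initrd named in grub.conf actually exists. An update interrupted before the kernel package finished installing might leave the initramfs missing or empty, which could produce a failure at the root mount. That is only a guess at the cause, not something confirmed above. The sketch assumes a legacy GRUB grub.conf and that initrd paths are relative to the directory given as the second argument; check_initrds is a hypothetical name.

```shell
# For each "initrd" line in a legacy GRUB config, report whether the
# referenced file exists and is non-empty.
# $1 = path to grub.conf
# $2 = directory that the initrd paths are relative to (the mounted
#      /boot partition, or the mounted root if /boot is not separate)
check_initrds() {
    conf=$1
    bootdir=$2
    awk '$1 == "initrd" { print $2 }' "$conf" | while read -r img; do
        if [ -s "$bootdir/${img#/}" ]; then
            echo "ok      $img"
        else
            echo "MISSING $img"
        fi
    done
}
```

With the first partition of the image mounted on /mnt as above, and assuming that partition is the guest's /boot, this would be "check_initrds /mnt/grub/grub.conf /mnt".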

My questions are:

1) Is it a bad idea to patch the host's libvirtd while guests are running?
2) Should libvirtd have killed the guests like that?
3) With this update to KVM/qemu/libvirt, are the "address type='pci'" lines now unnecessary and removable from the /etc/libvirt/qemu/<domain>.xml files, because PCI addresses are now assigned dynamically?
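
Whatever the answers turn out to be, a more cautious order would have been to shut the guests down cleanly before updating the host. A minimal sketch of that, assuming the guest names are passed in by hand and that about five minutes is long enough for a guest to stop. shutdown_guests is a hypothetical name; "virsh shutdown" and "virsh domstate" are standard virsh subcommands.

```shell
# Ask each named guest to shut down, then wait until libvirt reports it
# as "shut off". Returns non-zero if a shutdown request fails or a guest
# is still running after roughly 5 minutes (60 polls, 5 seconds apart).
shutdown_guests() {
    for dom in "$@"; do
        virsh shutdown "$dom" || return 1
    done
    for dom in "$@"; do
        tries=0
        # "virsh domstate" prints "shut off" for a stopped domain
        while [ "$(virsh domstate "$dom")" != "shut off" ]; do
            tries=$((tries + 1))
            if [ "$tries" -gt 60 ]; then
                echo "timed out waiting for $dom" >&2
                return 1
            fi
            sleep 5
        done
    done
}
```

For example, "shutdown_guests guest1 guest2 && yum update" on the host, where guest1 and guest2 stand in for the real domain names. The state comparison relies on virsh's English output, so it would need adjusting under another locale.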

Ben
--
Unix Support, MISD, University of Cambridge, England
Plugger of wire, typer of keyboard, imparter of Clue
        Life Is Short.          It's All Good.
