I'm encountering a kernel BUG() in guests using SCSI-interfaced disk images. I've tried with the Debian packaging of KVM 79 and 82; both exhibit the same behavior (disclaimer: Debian has about a dozen patches in their kvm packaging, but they all seem to be changes to the build/install process or security-related).
IDE-interfaced disk images seem fine.

Host and guest are up-to-date Debian lenny (32-bit/i386) running kernel 2.6.26 (Debian linux-image-2.6.26-1-amd64 2.6.26-12). After a few minutes of disk activity (fsck(8)ing a fairly empty ~20GB filesystem is a reliable trigger), the kernel BUGs (oops output below).

I was previously using KVM 72, and tried upgrading to 79 because both Debian lenny and Ubuntu hardy guests were panicking due to sym disconnects/timeouts. 79 makes the lenny guest start BUGging as described above. 82 is not perceivably different from 79 for the lenny guest.

FWIW, the upgrade to 79 allowed the Ubuntu hardy guest to stay up, although it emits:

Dec 25 00:28:51 vicar kernel: [106621.553272] sd 2:0:0:0: [sda] Sense Key : No Sense [current]
Dec 25 00:28:51 vicar kernel: [106621.553279] Info fld=0x0
Dec 25 00:28:51 vicar kernel: [106621.553280] sd 2:0:0:0: [sda] Add. Sense: No additional sense information

at seemingly random intervals. The upgrade to 82 made the hardy guest start BUGging on soft lockups at random intervals (I can provide the full output if anyone's interested, but I'm much more interested in the lenny guest oops at this point).

john

run via libvirt:

/usr/bin/kvm -S -M pc -m 512 -smp 1 -name test -monitor pty \
    -boot c -drive file=image.qcow,if=scsi,index=0,boot=on \
    -net nic,macaddr=00:0c:29:1e:ea:b9,vlan=0,model=e1000 \
    -net tap,fd=17,script=,vlan=0,ifname=vnet2 \
    -net nic,macaddr=00:0c:29:1e:ea:c3,vlan=1,model=e1000 \
    -net tap,fd=18,script=,vlan=1,ifname=vnet3 \
    -serial pty -parallel none -usb -vnc 0.0.0.0:1

[The KVMWiki asks whether the problem is reproducible with -no-kvm-irqchip, -no-kvm-pit, or -no-kvm, but when I tried invoking the above command line by hand (outside of libvirt), the VNC console was always blank and there was no console output on the serial pty.
If this would be useful information to have in this case, I'd love to know what I'm doing wrong, or if there's a way to specify additional command line arguments with libvirt.]

oops generated in the guest:

[ 140.101828] sym0: unexpected disconnect
[ 140.102748] BUG: unable to handle kernel NULL pointer dereference at 00000358
[ 140.103818] IP: [<e08e2670>] :sym53c8xx:sym_int_sir+0x547/0x118f
[ 140.106449] *pdpt = 000000001f5f9001 *pde = 0000000000000000
[ 140.107356] Oops: 0000 [#1] SMP
[ 140.107864] Modules linked in: loop virtio_balloon psmouse pcspkr serio_raw i2c_piix4 i2c_core button evdev ext3 jbd mbcache sd_mod ide_cd_mod cdrom ata_generic libata dock ide_pci_generic floppy virtio_pci virtio_ring virtio sym53c8xx scsi_transport_spi scsi_mod e1000 uhci_hcd usbcore piix ide_core thermal processor fan thermal_sys
[ 140.108062]
[ 140.108062] Pid: 131, comm: pdflush Not tainted (2.6.26-1-686-bigmem #1)
[ 140.108062] EIP: 0060:[<e08e2670>] EFLAGS: 00010287 CPU: 0
[ 140.108062] EIP is at sym_int_sir+0x547/0x118f [sym53c8xx]
[ 140.108062] EAX: 0000000a EBX: 00000000 ECX: 1f98c084 EDX: 00000030
[ 140.108062] ESI: df98c084 EDI: df98c000 EBP: df98c000 ESP: de0f3ba0
[ 140.108062] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 140.108062] Process pdflush (pid: 131, ti=de0f2000 task=df48e520 task.ti=de0f2000)
[ 140.108062] Stack: 00000000 000144d6 7f5a222c c011a853 0021d496 00000000 00000000 00000000
[ 140.108062] 00000000 df98c000 e08e08cd 00000000 00000000 00000001 00000000 df98c000
[ 140.108062] 00000084 e08e3f2f df988c00 00000046 00000000 df544400 00000196 00000000
[ 140.108062] Call Trace:
[ 140.108062] [<c011a853>] pvclock_clocksource_read+0x4b/0xd0
[ 140.108062] [<e08e08cd>] sym_recover_scsi_int+0xb3/0x10d [sym53c8xx]
[ 140.108062] [<e08e3f2f>] sym_interrupt+0x3ee/0x5fd [sym53c8xx]
[ 140.108062] [<e08df3dc>] sym53c8xx_intr+0x35/0x56 [sym53c8xx]
[ 140.108062] [<c0158e4e>] handle_IRQ_event+0x23/0x51
[ 140.108062] [<c0159f4d>] handle_fasteoi_irq+0x71/0xa4
[ 140.108062] [<c010afd2>] do_IRQ+0x4d/0x63
[ 140.108062] [<c01092a7>] common_interrupt+0x23/0x28
[ 140.108062] [<c01300d8>] ptrace_request+0x1ec/0x278
[ 140.108062] [<c012d0c6>] __do_softirq+0x57/0xd3
[ 140.108062] [<c012d187>] do_softirq+0x45/0x53
[ 140.108062] [<c012d43e>] irq_exit+0x35/0x67
[ 140.108062] [<c01152b6>] smp_apic_timer_interrupt+0x6b/0x75
[ 140.108062] [<c0109364>] apic_timer_interrupt+0x28/0x30
[ 140.108062] [<c02c9953>] _spin_unlock_irqrestore+0x7/0x10
[ 140.108062] [<e0865a94>] scsi_dispatch_cmd+0x197/0x205 [scsi_mod]
[ 140.108062] [<e086ab2e>] scsi_request_fn+0x264/0x32a [scsi_mod]
[ 140.108063] [<c01dcbd6>] __generic_unplug_device+0x1a/0x1c
[ 140.108063] [<c01dd3e9>] __make_request+0x2fe/0x348
[ 140.108063] [<c01dc008>] generic_make_request+0x34d/0x37b
[ 140.108063] [<c015f9f1>] mempool_alloc+0x1c/0xba
[ 140.108063] [<c01dd0e4>] submit_bio+0xc6/0xcd
[ 140.108063] [<c019cdff>] bio_alloc_bioset+0x9b/0xf3
[ 140.108063] [<c0199983>] submit_bh+0xcf/0xed
[ 140.108063] [<c019b32e>] __block_write_full_page+0x1fa/0x2da
[ 140.108063] [<c019eb73>] blkdev_get_block+0x0/0x43
[ 140.108063] [<c019b4ef>] block_write_full_page+0xe1/0xea
[ 140.108063] [<c019eb73>] blkdev_get_block+0x0/0x43
[ 140.108063] [<c01626d5>] __writepage+0x8/0x21
[ 140.108063] [<c0162b50>] write_cache_pages+0x16a/0x27b
[ 140.108063] [<c01626cd>] __writepage+0x0/0x21
[ 140.108063] [<c0162c61>] generic_writepages+0x0/0x21
[ 140.108063] [<c0162c7b>] generic_writepages+0x1a/0x21
[ 140.108063] [<c0162ca2>] do_writepages+0x20/0x30
[ 140.108063] [<c0196525>] __writeback_single_inode+0x127/0x251
[ 140.108063] [<c019691c>] sync_sb_inodes+0x17c/0x233
[ 140.108063] [<c0196c93>] writeback_inodes+0x53/0x99
[ 140.108063] [<c01638c1>] pdflush+0x0/0x1cc
[ 140.108063] [<c016357c>] wb_kupdate+0x7b/0xdb
[ 140.108063] [<c01639f0>] pdflush+0x12f/0x1cc
[ 140.108063] [<c0163501>] wb_kupdate+0x0/0xdb
[ 140.108063] [<c0138643>] kthread+0x38/0x5d
[ 140.108063] [<c013860b>] kthread+0x0/0x5d
[ 140.108063] [<c01094f3>] kernel_thread_helper+0x7/0x10
[ 140.108063] =======================
[ 140.108063] Code: 93 4c 01 00 00 52 50 68 42 76 8e e0 eb 4e 8d 83 b0 00 00 00 e8 32 71 96 df 8d 93 4c 01 00 00 52 50 68 7c 76 8e e0 eb 59 8b 1c 24 <8b> 93 58 03 00 00 8b 82 84 00 00 00 8b 1a 8b 70 60 85 f6 74 29
[ 140.108063] EIP: [<e08e2670>] sym_int_sir+0x547/0x118f [sym53c8xx] SS:ESP 0068:de0f3ba0
[ 140.162446] Kernel panic - not syncing: Fatal exception in interrupt

vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU L5420 @ 2.50GHz
stepping        : 6
cpu MHz         : 2500.087
cache size      : 6144 KB
[...]
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca sse4_1 lahf_lm
bogomips        : 5000.23
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

-- 
John Morrissey          _o            /\         ----  __o
j...@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__