I'm encountering a kernel BUG() in guests using SCSI-interfaced disk images.
I've tried with the Debian packaging of KVM 79 and 82; both exhibit the same
behavior (disclaimer: Debian has about a dozen patches in their kvm
packaging, but they all seem to be changes to the build/install process or
security-related).

IDE-interfaced disk images seem fine. Host and guest are up-to-date Debian
lenny (32-bit/i386) running kernel 2.6.26 (Debian linux-image-2.6.26-1-amd64
2.6.26-12).

After a few minutes of disk activity (fsck(8)ing a fairly empty ~20GB
filesystem is a reliable trigger), the kernel BUGs (oops output below).

I was previously using KVM 72, and tried upgrading to 79 because both Debian
lenny and Ubuntu hardy guests were panicking due to sym disconnects/timeouts.
79 makes the lenny guest start BUGging as described above. 82 is not
perceivably different from 79 for the lenny guest.

FWIW, the upgrade to 79 allowed the Ubuntu hardy guest to stay up, although
it emits:

Dec 25 00:28:51 vicar kernel: [106621.553272] sd 2:0:0:0: [sda] Sense Key : No Sense [current] 
Dec 25 00:28:51 vicar kernel: [106621.553279] Info fld=0x0
Dec 25 00:28:51 vicar kernel: [106621.553280] sd 2:0:0:0: [sda] Add. Sense: No additional sense information

at seemingly random intervals. The upgrade to 82 made the hardy guest start
BUGging on soft lockups at random intervals (I can provide the full output
if anyone's interested, but I'm much more interested in the lenny guest
oops at this point).

john


run via libvirt:
/usr/bin/kvm -S -M pc -m 512 -smp 1 -name test -monitor pty \
        -boot c -drive file=image.qcow,if=scsi,index=0,boot=on \
        -net nic,macaddr=00:0c:29:1e:ea:b9,vlan=0,model=e1000 \
        -net tap,fd=17,script=,vlan=0,ifname=vnet2 \
        -net nic,macaddr=00:0c:29:1e:ea:c3,vlan=1,model=e1000 \
        -net tap,fd=18,script=,vlan=1,ifname=vnet3 \
        -serial pty -parallel none -usb -vnc 0.0.0.0:1

[The KVMWiki asks whether the problem is reproducible with -no-kvm-irqchip,
 -no-kvm-pit, or -no-kvm, but when I tried invoking the above command line
 by hand (outside of libvirt), the VNC console was always blank and there
 was no console output on the serial pty. If this would be useful
 information to have in this case, I'd love to know what I'm doing wrong, or
 if there's a way to specify additional command line arguments with
 libvirt.]
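
One possible explanation for the blank console, offered as an assumption rather than a diagnosis: -S tells QEMU/KVM not to start the guest CPU until "c" is typed in the monitor, which libvirt normally does itself. The sketch below builds by-hand variants of the command line above with -S dropped and one of the three flags appended. The helper name by_hand_variants is hypothetical, and the "-net tap,fd=..." arguments are left untouched even though those descriptors are opened by libvirt and would need replacing for a manual run.

```python
# Command line as shown above, one list element per argument.
LIBVIRT_ARGV = [
    "/usr/bin/kvm", "-S", "-M", "pc", "-m", "512", "-smp", "1",
    "-name", "test", "-monitor", "pty", "-boot", "c",
    "-drive", "file=image.qcow,if=scsi,index=0,boot=on",
    "-net", "nic,macaddr=00:0c:29:1e:ea:b9,vlan=0,model=e1000",
    "-net", "tap,fd=17,script=,vlan=0,ifname=vnet2",
    "-net", "nic,macaddr=00:0c:29:1e:ea:c3,vlan=1,model=e1000",
    "-net", "tap,fd=18,script=,vlan=1,ifname=vnet3",
    "-serial", "pty", "-parallel", "none", "-usb", "-vnc", "0.0.0.0:1",
]

# Flags the KVMWiki suggests trying, one at a time.
DEBUG_FLAGS = ["-no-kvm-irqchip", "-no-kvm-pit", "-no-kvm"]


def by_hand_variants(argv, flags=DEBUG_FLAGS):
    """Return one argv per debug flag, with -S removed.

    Assumption: -S (start with the CPU stopped) is what kept the guest
    paused when run outside libvirt. The tap fd=N arguments are NOT
    rewritten here; they refer to descriptors inherited from libvirt.
    """
    base = [arg for arg in argv if arg != "-S"]
    return [base + [flag] for flag in flags]


if __name__ == "__main__":
    for variant in by_hand_variants(LIBVIRT_ARGV):
        print(" ".join(variant))
```

Alternatively, keeping -S and typing "c" at the monitor pty should have the same effect as dropping it.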

oops generated in the guest:
[  140.101828] sym0: unexpected disconnect
[  140.102748] BUG: unable to handle kernel NULL pointer dereference at 00000358
[  140.103818] IP: [<e08e2670>] :sym53c8xx:sym_int_sir+0x547/0x118f
[  140.106449] *pdpt = 000000001f5f9001 *pde = 0000000000000000 
[  140.107356] Oops: 0000 [#1] SMP 
[  140.107864] Modules linked in: loop virtio_balloon psmouse pcspkr serio_raw i2c_piix4 i2c_core button evdev ext3 jbd mbcache sd_mod ide_cd_mod cdrom ata_generic libata dock ide_pci_generic floppy virtio_pci virtio_ring virtio sym53c8xx scsi_transport_spi scsi_mod e1000 uhci_hcd usbcore piix ide_core thermal processor fan thermal_sys
[  140.108062] 
[  140.108062] Pid: 131, comm: pdflush Not tainted (2.6.26-1-686-bigmem #1)
[  140.108062] EIP: 0060:[<e08e2670>] EFLAGS: 00010287 CPU: 0
[  140.108062] EIP is at sym_int_sir+0x547/0x118f [sym53c8xx]
[  140.108062] EAX: 0000000a EBX: 00000000 ECX: 1f98c084 EDX: 00000030
[  140.108062] ESI: df98c084 EDI: df98c000 EBP: df98c000 ESP: de0f3ba0
[  140.108062]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  140.108062] Process pdflush (pid: 131, ti=de0f2000 task=df48e520 task.ti=de0f2000)
[  140.108062] Stack: 00000000 000144d6 7f5a222c c011a853 0021d496 00000000 00000000 00000000 
[  140.108062]        00000000 df98c000 e08e08cd 00000000 00000000 00000001 00000000 df98c000 
[  140.108062]        00000084 e08e3f2f df988c00 00000046 00000000 df544400 00000196 00000000 
[  140.108062] Call Trace:
[  140.108062]  [<c011a853>] pvclock_clocksource_read+0x4b/0xd0
[  140.108062]  [<e08e08cd>] sym_recover_scsi_int+0xb3/0x10d [sym53c8xx]
[  140.108062]  [<e08e3f2f>] sym_interrupt+0x3ee/0x5fd [sym53c8xx]
[  140.108062]  [<e08df3dc>] sym53c8xx_intr+0x35/0x56 [sym53c8xx]
[  140.108062]  [<c0158e4e>] handle_IRQ_event+0x23/0x51
[  140.108062]  [<c0159f4d>] handle_fasteoi_irq+0x71/0xa4
[  140.108062]  [<c010afd2>] do_IRQ+0x4d/0x63
[  140.108062]  [<c01092a7>] common_interrupt+0x23/0x28
[  140.108062]  [<c01300d8>] ptrace_request+0x1ec/0x278
[  140.108062]  [<c012d0c6>] __do_softirq+0x57/0xd3
[  140.108062]  [<c012d187>] do_softirq+0x45/0x53
[  140.108062]  [<c012d43e>] irq_exit+0x35/0x67
[  140.108062]  [<c01152b6>] smp_apic_timer_interrupt+0x6b/0x75
[  140.108062]  [<c0109364>] apic_timer_interrupt+0x28/0x30
[  140.108062]  [<c02c9953>] _spin_unlock_irqrestore+0x7/0x10
[  140.108062]  [<e0865a94>] scsi_dispatch_cmd+0x197/0x205 [scsi_mod]
[  140.108062]  [<e086ab2e>] scsi_request_fn+0x264/0x32a [scsi_mod]
[  140.108063]  [<c01dcbd6>] __generic_unplug_device+0x1a/0x1c
[  140.108063]  [<c01dd3e9>] __make_request+0x2fe/0x348
[  140.108063]  [<c01dc008>] generic_make_request+0x34d/0x37b
[  140.108063]  [<c015f9f1>] mempool_alloc+0x1c/0xba
[  140.108063]  [<c01dd0e4>] submit_bio+0xc6/0xcd
[  140.108063]  [<c019cdff>] bio_alloc_bioset+0x9b/0xf3
[  140.108063]  [<c0199983>] submit_bh+0xcf/0xed
[  140.108063]  [<c019b32e>] __block_write_full_page+0x1fa/0x2da
[  140.108063]  [<c019eb73>] blkdev_get_block+0x0/0x43
[  140.108063]  [<c019b4ef>] block_write_full_page+0xe1/0xea
[  140.108063]  [<c019eb73>] blkdev_get_block+0x0/0x43
[  140.108063]  [<c01626d5>] __writepage+0x8/0x21
[  140.108063]  [<c0162b50>] write_cache_pages+0x16a/0x27b
[  140.108063]  [<c01626cd>] __writepage+0x0/0x21
[  140.108063]  [<c0162c61>] generic_writepages+0x0/0x21
[  140.108063]  [<c0162c7b>] generic_writepages+0x1a/0x21
[  140.108063]  [<c0162ca2>] do_writepages+0x20/0x30
[  140.108063]  [<c0196525>] __writeback_single_inode+0x127/0x251
[  140.108063]  [<c019691c>] sync_sb_inodes+0x17c/0x233
[  140.108063]  [<c0196c93>] writeback_inodes+0x53/0x99
[  140.108063]  [<c01638c1>] pdflush+0x0/0x1cc
[  140.108063]  [<c016357c>] wb_kupdate+0x7b/0xdb
[  140.108063]  [<c01639f0>] pdflush+0x12f/0x1cc
[  140.108063]  [<c0163501>] wb_kupdate+0x0/0xdb
[  140.108063]  [<c0138643>] kthread+0x38/0x5d
[  140.108063]  [<c013860b>] kthread+0x0/0x5d
[  140.108063]  [<c01094f3>] kernel_thread_helper+0x7/0x10
[  140.108063]  =======================
[  140.108063] Code: 93 4c 01 00 00 52 50 68 42 76 8e e0 eb 4e 8d 83 b0 00 00 00 e8 32 71 96 df 8d 93 4c 01 00 00 52 50 68 7c 76 8e e0 eb 59 8b 1c 24 <8b> 93 58 03 00 00 8b 82 84 00 00 00 8b 1a 8b 70 60 85 f6 74 29 
[  140.108063] EIP: [<e08e2670>] sym_int_sir+0x547/0x118f [sym53c8xx] SS:ESP 0068:de0f3ba0
[  140.162446] Kernel panic - not syncing: Fatal exception in interrupt
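
For what it's worth, the faulting address can be cross-checked against the Code: bytes. The byte in angle brackets marks the instruction at EIP: 8b is the x86 "mov r32, r/m32" opcode, and the ModRM byte 93 selects edx as the destination and [ebx + disp32] as the source. With EBX = 00000000 and a displacement of 0x358, that is a read from 0x358, which matches the address in the BUG line. The following is a minimal sketch that decodes only this one instruction form; the function names are made up for illustration, and it is not a general disassembler.

```python
# Code: bytes and EBX value copied from the oops above.
CODE_LINE = (
    "93 4c 01 00 00 52 50 68 42 76 8e e0 eb 4e 8d 83 b0 00 00 "
    "00 e8 32 71 96 df 8d 93 4c 01 00 00 52 50 68 7c 76 8e e0 eb 59 8b 1c 24 <8b> 93 "
    "58 03 00 00 8b 82 84 00 00 00 8b 1a 8b 70 60 85 f6 74 29"
)
EBX = 0x00000000

# 32-bit general-purpose registers in x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_faulting_mov(code_line):
    """Decode the marked instruction, assuming it is `mov r32, [r32 + disp32]`.

    Returns (destination register, base register, displacement).
    Raises ValueError for any other opcode or addressing form.
    """
    tokens = code_line.split()
    # The oops wraps the byte at EIP in angle brackets.
    idx = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    opcode = int(tokens[idx].strip("<>"), 16)
    if opcode != 0x8B:
        raise ValueError("only opcode 8b (mov r32, r/m32) is handled")
    modrm = int(tokens[idx + 1], 16)
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only [reg + disp32] without a SIB byte is handled")
    disp_bytes = bytes(int(tok, 16) for tok in tokens[idx + 2:idx + 6])
    disp = int.from_bytes(disp_bytes, "little", signed=True)
    return REG32[reg], REG32[rm], disp


def fault_address(code_line, base_value):
    """Effective address of the marked load, truncated to 32 bits."""
    _, _, disp = decode_faulting_mov(code_line)
    return (base_value + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    dst, base, disp = decode_faulting_mov(CODE_LINE)
    print(f"mov {dst}, [{base}+{disp:#x}]")
    print(f"fault address: {fault_address(CODE_LINE, EBX):08x}")
```

The three bytes just before the marker (8b 1c 24) decode to "mov ebx, [esp]", and the first word of the stack dump is 00000000, so the NULL value appears to have been loaded from the top of the stack immediately before the faulting read.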

vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           L5420  @ 2.50GHz
stepping        : 6
cpu MHz         : 2500.087
cache size      : 6144 KB
[...]
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca sse4_1 lahf_lm
bogomips        : 5000.23
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

-- 
John Morrissey          _o            /\         ----  __o
j...@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__