Avi Kivity wrote:
> david ahern wrote:
>> I've run a lot more tests:
>>
>> - if I remove the "if (!change) return" optimization from pci_set_irq,
>> the rtl8139 nic worked fine for 16+ hours. I'm not recommending this
>> as a fix, just confirming that the problem goes away.
>>
>
> Interesting. What can cause this to happen?
>
> - some non-pci device shares the same irq (unlikely)
>
> - the pci link sharing is broken. Is the eth0 irq shared?
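To make the failure mode concrete, here is a toy model (hypothetical code, not qemu's actual pci_set_irq) of why the "if (!change) return" check can lose an interrupt: if the sink's latch ever falls out of sync with the source's cached level, only an unconditional update can resynchronize it.

```c
#include <assert.h>

/* Toy model of an irq line: the source caches the level it last
 * reported; the sink (ioapic) holds its own copy of the level. */
struct irq_model {
    int cached_level;   /* level the source last reported */
    int sink_level;     /* level the interrupt sink actually holds */
};

/* Optimized path: skip the update when the level looks unchanged,
 * so the sink is never kicked for a "duplicate" level. */
static void set_irq_optimized(struct irq_model *m, int level)
{
    if (level == m->cached_level)
        return;
    m->cached_level = level;
    m->sink_level = level;
}

/* Unoptimized path: always forward, resynchronizing the sink. */
static void set_irq_always(struct irq_model *m, int level)
{
    m->cached_level = level;
    m->sink_level = level;
}
```

If the sink's latch is somehow cleared while the source still caches level 1, the optimized path never repairs the mismatch, while the unconditional path does on the very next call. That would match the observation that removing the optimization makes the rtl8139 hang go away without explaining the underlying desync.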
The interrupt is not shared.

> Please post /proc/interrupts.

# cat /proc/interrupts
           CPU0       CPU1
  0:      10566      46468    IO-APIC-edge  timer
  1:          5          5    IO-APIC-edge  i8042
  8:          0          1    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 11:     243118       5656   IO-APIC-level  eth0
 12:        180         45    IO-APIC-edge  i8042
 14:       2021      12592    IO-APIC-edge  ide0
 15:         14         10    IO-APIC-edge  ide1
NMI:          0          0
LOC:      56947      56946
ERR:          0
MIS:         31

> - the in-kernel ioapic is buggy and needs the extra kicking the
> optimization prevents. Can be checked by re-adding the optimization to
> kvm_ioapic_set_irq() (keeping it removed in qemu). If it works, the
> problem is in userspace. If it fails, the problem is in the kernel.
>
> Something like
>
>     static int old_level[16];
>
>     if (level == old_level[irq])
>         return;
>     old_level[irq] = level;

I'll give this a shot and let you know.

If you are interested, here's some more info on the -no-kvm-irqchip
option: qemu ends up spinning with 1 thread consuming 100% cpu. Output
from top (literally the top 11 lines) with 'show threads' and individual
cpu stats:

Tasks: 125 total,   2 running, 123 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4046804k total,  4013480k used,    33324k free,    42512k buffers
Swap:  2096472k total,      120k used,  2096352k free,  1159892k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4441 root      20   0 2675m 2.5g 9808 R  100 65.0 499:34.09 qemu-system-x86
 4426 root      20   0 2675m 2.5g 9808 S    1 65.0  16:24.50 qemu-system-x86
...
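For reference while testing, the old_level fragment quoted above can be made self-contained like this (the wrapper function name and return convention are mine, not kvm's; only the filtering logic is from the suggestion):

```c
#include <assert.h>

#define NR_IRQS 16

/* Last level seen per irq line; implicitly zero-initialized, so the
 * first level-0 event on a line is treated as "unchanged". */
static int old_level[NR_IRQS];

/* Level-change filter like the one to re-add in kvm_ioapic_set_irq().
 * Returns 1 if the level changed (the event should reach the ioapic),
 * 0 if it was dropped as a duplicate. */
static int filter_set_irq(int irq, int level)
{
    if (level == old_level[irq])
        return 0;
    old_level[irq] = level;
    return 1;
}
```

With the filter in the kernel path and removed from qemu, a recurrence of the hang would point at the in-kernel ioapic; a clean 16-hour run would point at userspace.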
Hooking up gdb shows it cycling with the following backtrace:

(gdb) bt
#0  0x00002ad97b5ee3e8 in do_sigtimedwait () from /lib64/libc.so.6
#1  0x00002ad97b5ee4ae in sigtimedwait () from /lib64/libc.so.6
#2  0x00000000004fb7df in kvm_eat_signal (env=0x2ade460, timeout=10)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:156
#3  0x00000000004fb9e4 in kvm_eat_signals (env=0x2ade460, timeout=10)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:192
#4  0x00000000004fba49 in kvm_main_loop_wait (env=0x2ade460, timeout=10)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:211
#5  0x00000000004fc278 in kvm_main_loop_cpu (env=0x2ade460)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:299
#6  0x000000000040ff2d in main (argc=<value optimized out>, argv=0x7fff304607b8)
    at /opt/kvm/kvm-61/qemu/vl.c:7856

I have a dump of CPUX86State *env if you want to see it.

david

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel