Avi Kivity wrote:
> david ahern wrote:
>> I've run a lot more tests:
>>
>>
>> - if I remove the "if (!change) return" optimization from pci_set_irq the
>> rtl8139 nic worked fine for 16+ hours. I'm not recommending this as a
>> fix, just confirming that the problem goes away.
>>
>>   
> 
> Interesting.  What can cause this to happen?
> 
> - some non-pci device shares the same irq (unlikely)
> 
> - the pci link sharing is broken.  Is the eth0 irq shared?

The eth0 interrupt is not shared.

> 
> Please post /proc/interrupts.

# cat /proc/interrupts
           CPU0       CPU1
  0:      10566      46468    IO-APIC-edge  timer
  1:          5          5    IO-APIC-edge  i8042
  8:          0          1    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 11:     243118       5656   IO-APIC-level  eth0
 12:        180         45    IO-APIC-edge  i8042
 14:       2021      12592    IO-APIC-edge  ide0
 15:         14         10    IO-APIC-edge  ide1
NMI:          0          0
LOC:      56947      56946
ERR:          0
MIS:         31


> 
> - the in-kernel ioapic is buggy and needs the extra kicking the
> optimization prevents.  Can be checked by re-adding the optimization to
> kvm_ioapic_set_irq() (keeping it removed in qemu).  If it works, the
> problem is in userspace.  If it fails, the problem is in the kernel.
> 
> Something like
> 
>  static int old_level[16];
> 
>  if (level == old_level[irq])
>     return;
>  old_level[irq] = level;
> 
> 
> 

I'll give this a shot and let you know.
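
To make sure I understand the suggestion, here is a standalone sketch of
that filter as I read it. It is not the actual kvm_ioapic_set_irq() code:
filtered_set_irq(), deliver_irq_level(), NR_FILTERED_IRQS and
forwarded_count are hypothetical names I made up for illustration, and the
real function's arguments and locking are not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: sized to match old_level[16] in the suggestion above. */
#define NR_FILTERED_IRQS 16

/* Last level seen on each line; static storage starts out as all zeros. */
static int old_level[NR_FILTERED_IRQS];

/* Counts how many updates got past the filter (illustration only). */
static unsigned long forwarded_count;

/* Hypothetical stand-in for the code that really updates the ioapic. */
static void deliver_irq_level(int irq, int level)
{
    (void)irq;
    (void)level;
    forwarded_count++;
}

/*
 * Forward a level only when it differs from the last one seen on that
 * line. Returns true if the update was forwarded, false if it was dropped.
 */
static bool filtered_set_irq(int irq, int level)
{
    assert(irq >= 0 && irq < NR_FILTERED_IRQS);

    if (level == old_level[irq])
        return false;           /* no change: skip the update */

    old_level[irq] = level;
    deliver_irq_level(irq, level);
    return true;
}
```

With this, raising irq 11 twice in a row forwards only the first call; the
second one is dropped until the line has been lowered again.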

If you are interested, here is some more info on running with the
-no-kvm-irqchip option: qemu ends up spinning, with one thread consuming
100% of a cpu. Output from top (literally the top 11 lines) with
'show threads' and individual cpu stats:

Tasks: 125 total,   2 running, 123 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4046804k total,  4013480k used,    33324k free,    42512k buffers
Swap:  2096472k total,      120k used,  2096352k free,  1159892k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4441 root      20   0 2675m 2.5g 9808 R  100 65.0 499:34.09 qemu-system-x86
 4426 root      20   0 2675m 2.5g 9808 S    1 65.0  16:24.50 qemu-system-x86
...


Hooking up gdb shows it cycling with the following backtrace:

(gdb) bt
#0  0x00002ad97b5ee3e8 in do_sigtimedwait () from /lib64/libc.so.6
#1  0x00002ad97b5ee4ae in sigtimedwait () from /lib64/libc.so.6
#2  0x00000000004fb7df in kvm_eat_signal (env=0x2ade460, timeout=10)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:156
#3  0x00000000004fb9e4 in kvm_eat_signals (env=0x2ade460, timeout=10)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:192
#4  0x00000000004fba49 in kvm_main_loop_wait (env=0x2ade460, timeout=10)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:211
#5  0x00000000004fc278 in kvm_main_loop_cpu (env=0x2ade460)
    at /opt/kvm/kvm-61/qemu/qemu-kvm.c:299
#6  0x000000000040ff2d in main (argc=<value optimized out>, argv=0x7fff304607b8)
    at /opt/kvm/kvm-61/qemu/vl.c:7856

I have a dump of CPUX86State *env if you want to see it.

david
