Re: Bug report: Page allocation failure with virtio-net in kvm guest on 2.6.32-220.4.1

2012-02-03 Thread Benjamin Reiter, Aginion IT-Consulting
To verify whether this is a hardware or configuration problem on my
side, do SL 6.2 guests on SL 6.2 hosts work reliably without virtio-net
hiccups for other people?

Even when they are stressed with a bit of network traffic? (~100 GB/hour)

Any reports are highly appreciated.
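
For scale, the ~100 GB/hour figure can be converted into per-second rates. This is only a rough sketch; it assumes decimal units (1 GB = 10**9 bytes), and the helper name hourly_gb_to_rates is made up for illustration.

```python
# Convert an hourly traffic volume into per-second rates.
# Assumption: GB means 10**9 bytes (decimal units), not GiB.
GB = 10**9


def hourly_gb_to_rates(gb_per_hour):
    """Return (megabytes per second, megabits per second)."""
    bytes_per_second = gb_per_hour * GB / 3600
    mb_per_second = bytes_per_second / 10**6
    mbit_per_second = bytes_per_second * 8 / 10**6
    return mb_per_second, mbit_per_second


mb_s, mbit_s = hourly_gb_to_rates(100)
print(f"{mb_s:.1f} MB/s, {mbit_s:.0f} Mbit/s")  # prints "27.8 MB/s, 222 Mbit/s"
```

So the load in question is a sustained stream of roughly a quarter of a gigabit per second, not a short burst.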



-------- Original Message --------
Subject: Bug report: Page allocation failure with virtio-net in kvm guest on 2.6.32-220.4.1
Date: Thu, 02 Feb 2012 16:21:38 +0100
From: Benjamin Reiter, Aginion IT-Consulting b.rei...@aginion.de
To: scientific-linux-us...@fnal.gov

Page allocation failure with virtio-net in kvm guest on 2.6.32-220.4.1

Reproducibly, after a few minutes to a few hours and 100 MB to 30 GB of
NFS network traffic, the network interface in the guest goes down. The
guest can be shut down from the host via an ACPI event.

This happens only with the virtio-net driver; with e1000 the guest is
stable for days.

Host and guest run 2.6.32-220.4.1.el6.x86_64

Host runs kvm version 0.12.1.2-2.209.el6_2.4.x86_64




Feb  2 13:04:02 host656 kernel: rpciod/0: page allocation failure. order:0, mode:0x20
Feb  2 13:04:02 host656 kernel: Pid: 1081, comm: rpciod/0 Not tainted 2.6.32-220.4.1.el6.x86_64 #1
Feb  2 13:04:02 host656 kernel: Call Trace:
Feb  2 13:04:02 host656 kernel: <IRQ>  [81123daf] ? __alloc_pages_nodemask+0x77f/0x940
Feb  2 13:04:02 host656 kernel: [81158a1a] ? alloc_pages_current+0xaa/0x110
Feb  2 13:04:02 host656 kernel: [a0108d22] ? try_fill_recv+0x262/0x280 [virtio_net]
Feb  2 13:04:02 host656 kernel: [8142df18] ? netif_receive_skb+0x58/0x60
Feb  2 13:04:02 host656 kernel: [a01091fd] ? virtnet_poll+0x42d/0x8d0 [virtio_net]
Feb  2 13:04:02 host656 kernel: [814307c3] ? net_rx_action+0x103/0x2f0
Feb  2 13:04:02 host656 kernel: [81072001] ? __do_softirq+0xc1/0x1d0
Feb  2 13:04:02 host656 kernel: [8100c24c] ? call_softirq+0x1c/0x30
Feb  2 13:04:02 host656 kernel: <EOI>  [8100de85] ? do_softirq+0x65/0xa0
Feb  2 13:04:02 host656 kernel: [81071f0a] ? local_bh_enable+0x9a/0xb0
Feb  2 13:04:02 host656 kernel: [8147a8e7] ? tcp_rcv_established+0x107/0x800
Feb  2 13:04:02 host656 kernel: [81482c13] ? tcp_v4_do_rcv+0x2e3/0x430
Feb  2 13:04:02 host656 kernel: [8147ead6] ? tcp_write_xmit+0x1f6/0x9e0
Feb  2 13:04:02 host656 kernel: [8141cc75] ? release_sock+0x65/0xe0
Feb  2 13:04:02 host656 kernel: [8146fb4c] ? tcp_sendmsg+0x73c/0xa10
Feb  2 13:04:02 host656 kernel: [81419a0a] ? sock_sendmsg+0x11a/0x150
Feb  2 13:04:02 host656 kernel: [81038488] ? pvclock_clocksource_read+0x58/0xd0
Feb  2 13:04:02 host656 kernel: [81090a90] ? autoremove_wake_function+0x0/0x40
Feb  2 13:04:02 host656 kernel: [81061c95] ? enqueue_entity+0x125/0x420
Feb  2 13:04:02 host656 kernel: [81419a81] ? kernel_sendmsg+0x41/0x60
Feb  2 13:04:02 host656 kernel: [a018ab6e] ? xs_send_kvec+0x8e/0xa0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018acf3] ? xs_sendpages+0x173/0x220 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018aedd] ? xs_tcp_send_request+0x5d/0x160 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a0188e63] ? xprt_transmit+0x83/0x2e0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a0185c48] ? call_transmit+0x1d8/0x2c0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018e23e] ? __rpc_execute+0x5e/0x2a0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018e4d0] ? rpc_async_schedule+0x0/0x20 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018e4e5] ? rpc_async_schedule+0x15/0x20 [sunrpc]
Feb  2 13:04:02 host656 kernel: [8108b150] ? worker_thread+0x170/0x2a0
Feb  2 13:04:02 host656 kernel: [81090a90] ? autoremove_wake_function+0x0/0x40
Feb  2 13:04:02 host656 kernel: [8108afe0] ? worker_thread+0x0/0x2a0
Feb  2 13:04:02 host656 kernel: [81090726] ? kthread+0x96/0xa0
Feb  2 13:04:02 host656 kernel: [8100c14a] ? child_rip+0xa/0x20
Feb  2 13:04:02 host656 kernel: [81090690] ? kthread+0x0/0xa0
Feb  2 13:04:02 host656 kernel: [8100c140] ? child_rip+0x0/0x20
...
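
As a reading aid for the first line of the trace above, here is a minimal Python sketch that decodes the order and mode fields. The flag table is an assumption taken from the 2.6.32-era include/linux/gfp.h, covers only a subset of flags, and does not apply to other kernel versions. The helper name decode_failure and the 4096-byte page size (x86_64) are illustrative choices, not part of the original report.

```python
import re

# Assumption: bit values as in 2.6.32-era include/linux/gfp.h (subset only).
GFP_FLAGS = {
    0x10: "__GFP_WAIT",    # allocation may sleep
    0x20: "__GFP_HIGH",    # may dip into emergency reserves
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
}

# \s+ after the full stop also tolerates a mail-wrapped line break.
HEADER = re.compile(
    r"(?P<comm>\S+): page allocation failure\.\s+"
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def decode_failure(line, page_size=4096):
    """Decode a 'page allocation failure' header line, or return None."""
    match = HEADER.search(line)
    if match is None:
        return None
    order = int(match.group("order"))
    mode = int(match.group("mode"), 16)
    return {
        "comm": match.group("comm"),
        "order": order,
        "bytes": page_size << order,  # 2**order contiguous pages
        "flags": [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit],
        "may_sleep": bool(mode & 0x10),
    }


line = ("Feb  2 13:04:02 host656 kernel: rpciod/0: "
        "page allocation failure. order:0, mode:0x20")
print(decode_failure(line))
```

Under these assumptions, order:0 is a request for a single 4096-byte page, and mode:0x20 contains only __GFP_HIGH, which is how GFP_ATOMIC was defined in that kernel series: an allocation that is not allowed to sleep, as expected inside softirq context.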


VM is started with:

qemu  2347 61.7  3.7 537704 281556 ?   Sl   13:09  67:29
/usr/libexec/qemu-kvm -S -M rhel6.2.0 -enable-kvm -m 256 -smp
1,sockets=1,cores=1,threads=1 -name kvm_host656.net31 -uuid
97eae23f-bb13-58da-b4bc-258c6bf275a2 -nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/kvm_host656.net31.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -drive
file=/dev/disk/by-path/ip-10.224.2.20:3260-iscsi-iqn.1986-03.com.sun:02:e9e63ad1-3f29-4d5c-9da9-b10e44a1520f.vmstore12.net31-lun-1,if=none,id=drive-virtio-disk0,format=raw,cache=none
-device
virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=21,id=hostnet0 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:6a:c7:d8,bus=pci.0,addr=0x3
-chardev pty,id

Re: Bug report: Page allocation failure with virtio-net in kvm guest on 2.6.32-220.4.1

2012-02-03 Thread Benjamin Reiter, Aginion IT-Consulting

On 4.2.2012 3:34, Benjamin Reiter, Aginion IT-Consulting wrote:

To verify whether this is a hardware or configuration problem on my
side, do SL 6.2 guests on SL 6.2 hosts work reliably without virtio-net
hiccups for other people?

Even when they are stressed with a bit of network traffic? (~100 GB/hour)

Any reports are highly appreciated.




Bug report: Page allocation failure with virtio-net in kvm guest on 2.6.32-220.4.1

2012-02-02 Thread Benjamin Reiter, Aginion IT-Consulting

Page allocation failure with virtio-net in kvm guest on 2.6.32-220.4.1

Reproducibly, after a few minutes to a few hours and 100 MB to 30 GB of
NFS network traffic, the network interface in the guest goes down. The
guest can be shut down from the host via an ACPI event.

This happens only with the virtio-net driver; with e1000 the guest is
stable for days.


Host and guest run 2.6.32-220.4.1.el6.x86_64

Host runs kvm version 0.12.1.2-2.209.el6_2.4.x86_64





Feb  2 13:04:02 host656 kernel: rpciod/0: page allocation failure. order:0, mode:0x20
Feb  2 13:04:02 host656 kernel: Pid: 1081, comm: rpciod/0 Not tainted 2.6.32-220.4.1.el6.x86_64 #1
Feb  2 13:04:02 host656 kernel: Call Trace:
Feb  2 13:04:02 host656 kernel: <IRQ>  [81123daf] ? __alloc_pages_nodemask+0x77f/0x940
Feb  2 13:04:02 host656 kernel: [81158a1a] ? alloc_pages_current+0xaa/0x110
Feb  2 13:04:02 host656 kernel: [a0108d22] ? try_fill_recv+0x262/0x280 [virtio_net]
Feb  2 13:04:02 host656 kernel: [8142df18] ? netif_receive_skb+0x58/0x60
Feb  2 13:04:02 host656 kernel: [a01091fd] ? virtnet_poll+0x42d/0x8d0 [virtio_net]
Feb  2 13:04:02 host656 kernel: [814307c3] ? net_rx_action+0x103/0x2f0
Feb  2 13:04:02 host656 kernel: [81072001] ? __do_softirq+0xc1/0x1d0
Feb  2 13:04:02 host656 kernel: [8100c24c] ? call_softirq+0x1c/0x30
Feb  2 13:04:02 host656 kernel: <EOI>  [8100de85] ? do_softirq+0x65/0xa0
Feb  2 13:04:02 host656 kernel: [81071f0a] ? local_bh_enable+0x9a/0xb0
Feb  2 13:04:02 host656 kernel: [8147a8e7] ? tcp_rcv_established+0x107/0x800
Feb  2 13:04:02 host656 kernel: [81482c13] ? tcp_v4_do_rcv+0x2e3/0x430
Feb  2 13:04:02 host656 kernel: [8147ead6] ? tcp_write_xmit+0x1f6/0x9e0
Feb  2 13:04:02 host656 kernel: [8141cc75] ? release_sock+0x65/0xe0
Feb  2 13:04:02 host656 kernel: [8146fb4c] ? tcp_sendmsg+0x73c/0xa10
Feb  2 13:04:02 host656 kernel: [81419a0a] ? sock_sendmsg+0x11a/0x150
Feb  2 13:04:02 host656 kernel: [81038488] ? pvclock_clocksource_read+0x58/0xd0
Feb  2 13:04:02 host656 kernel: [81090a90] ? autoremove_wake_function+0x0/0x40
Feb  2 13:04:02 host656 kernel: [81061c95] ? enqueue_entity+0x125/0x420
Feb  2 13:04:02 host656 kernel: [81419a81] ? kernel_sendmsg+0x41/0x60
Feb  2 13:04:02 host656 kernel: [a018ab6e] ? xs_send_kvec+0x8e/0xa0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018acf3] ? xs_sendpages+0x173/0x220 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018aedd] ? xs_tcp_send_request+0x5d/0x160 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a0188e63] ? xprt_transmit+0x83/0x2e0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a0185c48] ? call_transmit+0x1d8/0x2c0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018e23e] ? __rpc_execute+0x5e/0x2a0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018e4d0] ? rpc_async_schedule+0x0/0x20 [sunrpc]
Feb  2 13:04:02 host656 kernel: [a018e4e5] ? rpc_async_schedule+0x15/0x20 [sunrpc]
Feb  2 13:04:02 host656 kernel: [8108b150] ? worker_thread+0x170/0x2a0
Feb  2 13:04:02 host656 kernel: [81090a90] ? autoremove_wake_function+0x0/0x40
Feb  2 13:04:02 host656 kernel: [8108afe0] ? worker_thread+0x0/0x2a0
Feb  2 13:04:02 host656 kernel: [81090726] ? kthread+0x96/0xa0
Feb  2 13:04:02 host656 kernel: [8100c14a] ? child_rip+0xa/0x20
Feb  2 13:04:02 host656 kernel: [81090690] ? kthread+0x0/0xa0
Feb  2 13:04:02 host656 kernel: [8100c140] ? child_rip+0x0/0x20
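
To make the trace above easier to scan, the following Python sketch pulls the function names and owning modules out of such log text. It is only an illustration: the function name frames and the label "kernel" for frames without a module tag are made up here, and the sample is a handful of frames copied from the trace. The pattern tolerates the line wrapping introduced by mail clients.

```python
import re

# One frame looks like: "[a0108d22] ? try_fill_recv+0x262/0x280 [virtio_net]".
# \s+ after the "?" lets the symbol sit on the next line if the mail was wrapped.
FRAME = re.compile(
    r"\?\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def frames(log_text):
    """Return [(function, module)] in the order the frames were printed."""
    return [(func, module or "kernel") for func, module in FRAME.findall(log_text)]


# A few frames copied from the trace, some still wrapped as in the mail.
SAMPLE = """\
Feb  2 13:04:02 host656 kernel: [81158a1a] ?
alloc_pages_current+0xaa/0x110
Feb  2 13:04:02 host656 kernel: [a0108d22] ?
try_fill_recv+0x262/0x280 [virtio_net]
Feb  2 13:04:02 host656 kernel: [a01091fd] ? virtnet_poll+0x42d/0x8d0 [virtio_net]
Feb  2 13:04:02 host656 kernel: [a018ab6e] ? xs_send_kvec+0x8e/0xa0 [sunrpc]
Feb  2 13:04:02 host656 kernel: [81090726] ? kthread+0x96/0xa0
"""

for func, module in frames(SAMPLE):
    print(f"{module:12s} {func}")
```

Read from the bottom up, the full trace shows rpciod sending an NFS request through sunrpc and TCP, a softirq running on return from that path, and virtio_net failing to allocate a receive page in try_fill_recv.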
Feb  2 13:04:02 host656 kernel: rpciod/0: page allocation failure. order:0, mode:0x20
Feb  2 13:04:02 host656 kernel: Pid: 1081, comm: rpciod/0 Not tainted 2.6.32-220.4.1.el6.x86_64 #1
Feb  2 13:04:02 host656 kernel: Call Trace:
Feb  2 13:04:02 host656 kernel: <IRQ>  [81123daf] ? __alloc_pages_nodemask+0x77f/0x940
Feb  2 13:04:02 host656 kernel: [8115dc62] ? kmem_getpages+0x62/0x170
Feb  2 13:04:02 host656 kernel: [8115e87a] ? fallback_alloc+0x1ba/0x270
Feb  2 13:04:02 host656 kernel: [8115e2cf] ? cache_grow+0x2cf/0x320
Feb  2 13:04:02 host656 kernel: [8115e5f9] ? cache_alloc_node+0x99/0x160
Feb  2 13:04:02 host656 kernel: [8142186a] ? __alloc_skb+0x7a/0x180
Feb  2 13:04:02 host656 kernel: [8115f4bf] ? kmem_cache_alloc_node_notrace+0x6f/0x130
Feb  2 13:04:02 host656 kernel: [8115f6fb] ? __kmalloc_node+0x7b/0x100
Feb  2 13:04:02 host656 kernel: [8142186a] ? __alloc_skb+0x7a/0x180
Feb  2 13:04:02 host656 kernel: [814219e6] ? __netdev_alloc_skb+0x36/0x60
Feb  2 13:04:02 host656 kernel: [a0108f82] ? virtnet_poll+0x1b2/0x8d0 [virtio_net]
Feb  2 13:04:02 host656 kernel: [814307c3] ? net_rx_action+0x103/0x2f0
Feb  2 13:04:02 host656 kernel: [81072001] ? __do_softirq+0xc1/0x1d0
Feb  2 13:04:02 host656 kernel: [8100c24c] ? call_softirq+0x1c/0x30
Feb  2