Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
Chris Webb wrote:

> I found that on my laptop, the single change of host kernel config
>
>   -CONFIG_INTEL_IDLE=y
>   +# CONFIG_INTEL_IDLE is not set
>
> is sufficient to turn transfers into guests from slow to full wire speed

I am not deep enough in this code to write a patch, but I wonder if
macvtap_forward in macvtap.c is missing a call to kill_fasync, which I
understand is used to signal to interested processes when data arrives?

Here is the end of macvtap_forward:

    skb_queue_tail(&q->sk.sk_receive_queue, skb);
    wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | POLLRDNORM |
                               POLLRDBAND);
    return NET_RX_SUCCESS;

Compared to this end of tun_net_xmit in tun.c:

    /* Enqueue packet */
    skb_queue_tail(&tun->socket.sk->sk_receive_queue, skb);

    /* Notify and wake up reader process */
    if (tun->flags & TUN_FASYNC)
        kill_fasync(&tun->fasync, SIGIO, POLL_IN);
    wake_up_interruptible_poll(&tun->wq.wait, POLLIN |
                               POLLRDNORM | POLLRDBAND);
    return NETDEV_TX_OK;

Richard.
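For readers unfamiliar with the fasync mechanism: the kill_fasync call above is what delivers SIGIO to any process that asked for asynchronous notification on the fd via F_SETOWN and O_ASYNC. The generic dance can be seen with a plain pipe; this is a minimal Python sketch of that fcntl/SIGIO machinery on Linux, not of macvtap itself:

```python
import fcntl
import os
import signal
import time

received = []
signal.signal(signal.SIGIO, lambda signum, frame: received.append(signum))

r, w = os.pipe()

# Direct SIGIO for this fd at our process, then enable async notification;
# this is the userspace side of the kernel's fasync bookkeeping.
fcntl.fcntl(r, fcntl.F_SETOWN, os.getpid())
fcntl.fcntl(r, fcntl.F_SETFL, fcntl.fcntl(r, fcntl.F_GETFL) | os.O_ASYNC)

# Writing to the pipe makes the kernel call kill_fasync on the readers,
# which raises SIGIO in our process.
os.write(w, b"x")

deadline = time.time() + 2
while not received and time.time() < deadline:
    time.sleep(0.01)  # give the handler a chance to run

print("SIGIO delivered" if received else "no SIGIO delivered")
```

Whether a missing kill_fasync matters depends on whether the reader actually requests SIGIO at all; a reader that sleeps in poll/select, as qemu does, is woken by the wake_up_interruptible_poll call instead.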
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
On Thu, Aug 30, 2012 at 09:20:57AM +0100, Richard Davies wrote:
> Chris Webb wrote:
> > I found that on my laptop, the single change of host kernel config
> >
> >   -CONFIG_INTEL_IDLE=y
> >   +# CONFIG_INTEL_IDLE is not set
> >
> > is sufficient to turn transfers into guests from slow to full wire speed
>
> I am not deep enough in this code to write a patch, but I wonder if
> macvtap_forward in macvtap.c is missing a call to kill_fasync, which I
> understand is used to signal to interested processes when data arrives?

No, only if TUN_FASYNC is set. qemu does not seem to set it.

> Here is the end of macvtap_forward:
>
>     skb_queue_tail(&q->sk.sk_receive_queue, skb);
>     wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | POLLRDNORM |
>                                POLLRDBAND);
>     return NET_RX_SUCCESS;
>
> Compared to this end of tun_net_xmit in tun.c:
>
>     /* Enqueue packet */
>     skb_queue_tail(&tun->socket.sk->sk_receive_queue, skb);
>
>     /* Notify and wake up reader process */
>     if (tun->flags & TUN_FASYNC)
>         kill_fasync(&tun->fasync, SIGIO, POLL_IN);
>     wake_up_interruptible_poll(&tun->wq.wait, POLLIN |
>                                POLLRDNORM | POLLRDBAND);
>     return NETDEV_TX_OK;
>
> Richard.
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
Chris Webb writes:

> I'm experiencing a problem with qemu + macvtap which I can reproduce on a
> variety of hardware, with kernels varying from 3.0.4 (the oldest I tried)
> to 3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.
>
> Large data transfers over TCP into a guest from another machine on the
> network are very slow (often less than 100kB/s) whereas transfers outbound
> from the guest, between two guests on the same host, or between the guest
> and its host run at normal speeds (>= 50MB/s).
>
> The slow inbound data transfer speeds up substantially when a ping flood
> is aimed either at the host or the guest, or when the qemu process is
> straced. Presumably both of these are ways to wake up something that is
> otherwise sleeping too long?

I thought I'd try bisecting from when macvtap was introduced (2.6.34, where
it presumably worked fine), but in preparing to do that, I stumbled upon a
way to change the behaviour from slow to fast with different kernel
.configs. Pinning it down specifically, I found that on my laptop, the
single change of host kernel config

  -CONFIG_INTEL_IDLE=y
  +# CONFIG_INTEL_IDLE is not set

is sufficient to turn transfers into guests from slow to full wire speed.
The .configs of the 'slow' and 'fast' host kernels are respectively at

  http://cdw.me.uk/tmp/goingslow.config
  http://cdw.me.uk/tmp/goingfast.config

Our big servers that show the symptoms are Opteron 6128 boxes, and (perhaps
unsurprisingly) aren't affected by CONFIG_INTEL_IDLE. In fact, turning off
the whole of the CPU idle infrastructure as below didn't have any effect:
transfers into the guest remained slow.
@@ -441,10 +441,8 @@ CONFIG_ACPI=y
 # CONFIG_ACPI_BUTTON is not set
 CONFIG_ACPI_FAN=y
 CONFIG_ACPI_DOCK=y
-CONFIG_ACPI_PROCESSOR=y
+# CONFIG_ACPI_PROCESSOR is not set
 CONFIG_ACPI_IPMI=y
-CONFIG_ACPI_PROCESSOR_AGGREGATOR=y
-CONFIG_ACPI_THERMAL=y
 CONFIG_ACPI_NUMA=y
 # CONFIG_ACPI_CUSTOM_DSDT is not set
 CONFIG_ACPI_BLACKLIST_YEAR=0
@@ -463,16 +461,12 @@ CONFIG_SFI=y
 #
 # CPU Frequency scaling
 #
 # CONFIG_CPU_FREQ is not set
-CONFIG_CPU_IDLE=y
-CONFIG_CPU_IDLE_GOV_LADDER=y
-CONFIG_CPU_IDLE_GOV_MENU=y
-CONFIG_INTEL_IDLE=y
+# CONFIG_CPU_IDLE is not set

 #
 # Memory power savings
 #
-CONFIG_I7300_IDLE_IOAT_CHANNEL=y
-CONFIG_I7300_IDLE=y
+# CONFIG_I7300_IDLE is not set
 #
 # Bus options (PCI etc.)

(From earlier in the thread, the full unmodified kernel config for one of
these boxes is at http://cdw.me.uk/tmp/server-config.txt and I've tested
equivalent configs on kernels from 3.0.x to the head of Linus' git tree.)

Is the CONFIG_INTEL_IDLE thing suggestive of what the problem might be at
all? Out of interest, what is the impact of disabling this on Intel
machines? Presumably in this case we rely on ACPI for processor idle?

Also, is there an equivalent for AMD machines which we could disable on our
large Opteron boxes, given that even getting rid of the whole
CONFIG_CPU_IDLE machinery didn't change anything as CONFIG_INTEL_IDLE did
on the Intel box?

Best wishes,

Chris.
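For what it's worth, the idle states a running host actually exposes can be inspected through the cpuidle sysfs interface without rebuilding the kernel, which may help correlate the slowdown with deep C-states. A small sketch, assuming the standard /sys/devices/system/cpu/.../cpuidle layout; the directories are absent on kernels built without CONFIG_CPU_IDLE:

```python
import glob
import os

def read(path):
    with open(path) as f:
        return f.read().strip()

report = []

# Which cpuidle driver is active (e.g. intel_idle vs acpi_idle), if any.
driver = "/sys/devices/system/cpu/cpuidle/current_driver"
if os.path.exists(driver):
    report.append("cpuidle driver: %s" % read(driver))

# Per-state name and exit latency for cpu0 (other cpus normally match).
base = "/sys/devices/system/cpu/cpu0/cpuidle"
if not os.path.isdir(base):
    report.append("no cpuidle sysfs directory: CONFIG_CPU_IDLE off or unsupported")
else:
    for state in sorted(glob.glob(base + "/state*")):
        report.append("cpuidle %s: %s, exit latency %s us"
                      % (os.path.basename(state), read(state + "/name"),
                         read(state + "/latency")))

if not report:
    report.append("no cpuidle information found")

print("\n".join(report))
```

Deep states with large exit latencies are the ones a too-infrequent wake-up would make visible, which would fit the strace/ping-flood observations earlier in the thread.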
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
On Thu, Aug 16, 2012 at 03:27:57PM +0100, Chris Webb wrote:
> "Michael S. Tsirkin" writes:
>
> > On Thu, Aug 16, 2012 at 10:20:05AM +0100, Chris Webb wrote:
> >
> > > For example, I can run
> > >
> > >     ip addr add 192.168.1.2/24 dev eth0
> > >     ip link set eth0 up
> > >     ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
> > >         type macvtap mode bridge
> > >     ip link set tap0 up
> > >     qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> > >         -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> > >         -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)
> > >
> > > on one physical host which is otherwise completely idle. From a second
> > > physical host on the same network, I then scp a large (say 50MB) file
> > > onto the new guest. On a gigabit LAN, speeds consistently drop to less
> > > than 100kB/s as the transfer progresses, within a second of starting.
>
> > Thanks for the report.
> > I'll try to reproduce this early next week.
> > Meanwhile a question - do you still observe this behaviour if you enable
> > vhost-net?
>
> I haven't tried running with vhost-net before. Is it sufficient to compile
> the host kernel with CONFIG_VHOST_NET=y and boot the guest with
>
>     qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
>         -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
>         -net tap,fd=3,vhost=on,vhostfd=4 \
>         3<>/dev/tap$(< /sys/class/net/tap0/ifindex) 4<>/dev/vhost-net
>
> ? If so, then I'm afraid this doesn't make any difference: it still stalls
> and drops right down in speed.
>
> The reason I'm hesitant about whether the vhost-net is actually working is
> that with both vhost=off and vhost=on, I see an identical virtio feature
> set within the guest:
>
>     # cat /sys/bus/virtio/devices/virtio0/features
>     01111000

Yes that is expected.

> However, without the 4<>/dev/vhost-net or with 4<>/dev/null, it seems to
> fail to start altogether with vhost=on,vhostfd=4, so perhaps it's fine?
>
> Cheers,
>
> Chris.
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
"Michael S. Tsirkin" writes:

> On Thu, Aug 16, 2012 at 10:20:05AM +0100, Chris Webb wrote:
>
> > For example, I can run
> >
> >     ip addr add 192.168.1.2/24 dev eth0
> >     ip link set eth0 up
> >     ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
> >         type macvtap mode bridge
> >     ip link set tap0 up
> >     qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> >         -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> >         -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)
> >
> > on one physical host which is otherwise completely idle. From a second
> > physical host on the same network, I then scp a large (say 50MB) file
> > onto the new guest. On a gigabit LAN, speeds consistently drop to less
> > than 100kB/s as the transfer progresses, within a second of starting.
>
> Thanks for the report.
> I'll try to reproduce this early next week.
> Meanwhile a question - do you still observe this behaviour if you enable
> vhost-net?

I haven't tried running with vhost-net before. Is it sufficient to compile
the host kernel with CONFIG_VHOST_NET=y and boot the guest with

    qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
        -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
        -net tap,fd=3,vhost=on,vhostfd=4 \
        3<>/dev/tap$(< /sys/class/net/tap0/ifindex) 4<>/dev/vhost-net

? If so, then I'm afraid this doesn't make any difference: it still stalls
and drops right down in speed.

The reason I'm hesitant about whether the vhost-net is actually working is
that with both vhost=off and vhost=on, I see an identical virtio feature
set within the guest:

    # cat /sys/bus/virtio/devices/virtio0/features
    01111000

However, without the 4<>/dev/vhost-net or with 4<>/dev/null, it seems to
fail to start altogether with vhost=on,vhostfd=4, so perhaps it's fine?

Cheers,

Chris.
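Rather than relying on the guest-visible virtio feature bits, which look identical either way, one sanity check is to look on the host for the per-device vhost worker kernel thread; on kernels of this vintage it is named vhost-&lt;owner pid&gt; (an assumption about the naming, worth verifying against your kernel). A small sketch scanning /proc:

```python
import glob

found = []
for comm_path in glob.glob("/proc/[0-9]*/comm"):
    try:
        with open(comm_path) as f:
            comm = f.read().strip()
    except OSError:
        continue  # the task may have exited between glob and open
    if comm.startswith("vhost-"):
        found.append((comm_path.split("/")[2], comm))

if found:
    summary = "vhost worker threads: " + ", ".join(
        "pid %s (%s)" % (pid, comm) for pid, comm in found)
else:
    summary = "no vhost worker threads found"

print(summary)
```

If vhost=on is really taking effect, one such thread should appear per vhost-enabled network device while the guest runs.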
Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces
On Thu, Aug 16, 2012 at 10:20:05AM +0100, Chris Webb wrote:
> I'm experiencing a problem with qemu + macvtap which I can reproduce on a
> variety of hardware, with kernels varying from 3.0.4 (the oldest I tried)
> to 3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.
>
> Large data transfers over TCP into a guest from another machine on the
> network are very slow (often less than 100kB/s) whereas transfers outbound
> from the guest, between two guests on the same host, or between the guest
> and its host run at normal speeds (>= 50MB/s).
>
> The slow inbound data transfer speeds up substantially when a ping flood
> is aimed either at the host or the guest, or when the qemu process is
> straced. Presumably both of these are ways to wake up something that is
> otherwise sleeping too long?
>
> For example, I can run
>
>     ip addr add 192.168.1.2/24 dev eth0
>     ip link set eth0 up
>     ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
>         type macvtap mode bridge
>     ip link set tap0 up
>     qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
>         -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
>         -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)
>
> on one physical host which is otherwise completely idle. From a second
> physical host on the same network, I then scp a large (say 50MB) file onto
> the new guest. On a gigabit LAN, speeds consistently drop to less than
> 100kB/s as the transfer progresses, within a second of starting.
>
> The choice of virtio virtual nic in the above isn't significant: the same
> thing happens with e1000 or rtl8139. You can also replace the scp with a
> straight netcat and see the same effect.
>
> Doing the transfer in the other direction (i.e. copying a large file from
> the guest to an external host) achieves 50MB/s or faster as expected.
> Copying between two guests on the same host (i.e. taking advantage of the
> 'mode bridge') is also fast.
>
> If I create a macvlan device attached to eth0 and move the host IP address
> to that, I can communicate between the host itself and the guest because
> of the 'mode bridge'. Again, this case is fast in both directions.
>
> Using a bridge and a standard tap interface, transfers in and out are fast
> too:
>
>     ip tuntap add tap0 mode tap
>     brctl addbr br0
>     brctl addif br0 eth0
>     brctl addif br0 tap0
>     ip link set eth0 up
>     ip link set tap0 up
>     ip link set br0 up
>     ip addr add 192.168.1.2/24 dev br0
>     qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
>         -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
>         -net tap,script=no,downscript=no,ifname=tap0
>
> As mentioned in the summary at the beginning of this report, when I strace
> a guest in the original configuration which is receiving data slowly, the
> data rate improves from less than 100kB/s to around 3.1MB/s. Similarly,
> if I ping flood either the guest or the host it is running on from another
> machine on the network, the transfer rate improves to around 1.1MB/s. This
> seems quite suggestive of a problem with delayed wake-up of the guest.
>
> Two reasonably up-to-date examples of machines I've reproduced this on
> are my laptop with an r8169 gigabit ethernet card, Debian qemu-kvm 1.0 and
> upstream 3.4.8 kernel, whose .config and boot dmesg are at
>
>     http://cdw.me.uk/tmp/laptop-config.txt
>     http://cdw.me.uk/tmp/laptop-dmesg.txt
>
> and one of our large servers with an igb gigabit ethernet card, upstream
> qemu-kvm 1.1.1 and upstream 3.5.1 linux:
>
>     http://cdw.me.uk/tmp/server-config.txt
>     http://cdw.me.uk/tmp/server-dmesg.txt
>
> For completeness, I've put the Debian 6 test image I've been using for
> testing at
>
>     http://cdw.me.uk/tmp/test-debian.img.xz
>
> though I've seen the same problem from a variety of guest operating
> systems. (In fact, I've not yet found any combination of host kernel,
> guest OS and hardware which doesn't show these symptoms, so it seems to
> be very easy to reproduce.)
>
> Cheers,
>
> Chris.

Thanks for the report.
I'll try to reproduce this early next week.
Meanwhile a question - do you still observe this behaviour if you enable
vhost-net?

Thanks,

-- 
MST
[Qemu-devel] Slow inbound traffic on macvtap interfaces
I'm experiencing a problem with qemu + macvtap which I can reproduce on a
variety of hardware, with kernels varying from 3.0.4 (the oldest I tried)
to 3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.

Large data transfers over TCP into a guest from another machine on the
network are very slow (often less than 100kB/s) whereas transfers outbound
from the guest, between two guests on the same host, or between the guest
and its host run at normal speeds (>= 50MB/s).

The slow inbound data transfer speeds up substantially when a ping flood is
aimed either at the host or the guest, or when the qemu process is straced.
Presumably both of these are ways to wake up something that is otherwise
sleeping too long?

For example, I can run

    ip addr add 192.168.1.2/24 dev eth0
    ip link set eth0 up
    ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
        type macvtap mode bridge
    ip link set tap0 up
    qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
        -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
        -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)

on one physical host which is otherwise completely idle. From a second
physical host on the same network, I then scp a large (say 50MB) file onto
the new guest. On a gigabit LAN, speeds consistently drop to less than
100kB/s as the transfer progresses, within a second of starting.

The choice of virtio virtual nic in the above isn't significant: the same
thing happens with e1000 or rtl8139. You can also replace the scp with a
straight netcat and see the same effect.

Doing the transfer in the other direction (i.e. copying a large file from
the guest to an external host) achieves 50MB/s or faster as expected.
Copying between two guests on the same host (i.e. taking advantage of the
'mode bridge') is also fast.

If I create a macvlan device attached to eth0 and move the host IP address
to that, I can communicate between the host itself and the guest because of
the 'mode bridge'. Again, this case is fast in both directions.
Using a bridge and a standard tap interface, transfers in and out are fast
too:

    ip tuntap add tap0 mode tap
    brctl addbr br0
    brctl addif br0 eth0
    brctl addif br0 tap0
    ip link set eth0 up
    ip link set tap0 up
    ip link set br0 up
    ip addr add 192.168.1.2/24 dev br0
    qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
        -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
        -net tap,script=no,downscript=no,ifname=tap0

As mentioned in the summary at the beginning of this report, when I strace
a guest in the original configuration which is receiving data slowly, the
data rate improves from less than 100kB/s to around 3.1MB/s. Similarly, if
I ping flood either the guest or the host it is running on from another
machine on the network, the transfer rate improves to around 1.1MB/s. This
seems quite suggestive of a problem with delayed wake-up of the guest.

Two reasonably up-to-date examples of machines I've reproduced this on are
my laptop with an r8169 gigabit ethernet card, Debian qemu-kvm 1.0 and
upstream 3.4.8 kernel, whose .config and boot dmesg are at

    http://cdw.me.uk/tmp/laptop-config.txt
    http://cdw.me.uk/tmp/laptop-dmesg.txt

and one of our large servers with an igb gigabit ethernet card, upstream
qemu-kvm 1.1.1 and upstream 3.5.1 linux:

    http://cdw.me.uk/tmp/server-config.txt
    http://cdw.me.uk/tmp/server-dmesg.txt

For completeness, I've put the Debian 6 test image I've been using for
testing at

    http://cdw.me.uk/tmp/test-debian.img.xz

though I've seen the same problem from a variety of guest operating
systems. (In fact, I've not yet found any combination of host kernel, guest
OS and hardware which doesn't show these symptoms, so it seems to be very
easy to reproduce.)

Cheers,

Chris.
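As a footnote, the scp/netcat measurement can also be reproduced with a tiny socket program, which takes ssh encryption overhead out of the picture entirely. A self-contained sketch (run against loopback here just so it is runnable as-is; for the real test, run the receiving half inside the guest and point the sender at the guest's address):

```python
import socket
import threading
import time

PAYLOAD = b"x" * 65536
TOTAL = 16 * 1024 * 1024  # 16 MiB is enough to see the throughput trend

def receiver(listener, result):
    # Accept one connection and count everything received until EOF.
    conn, _ = listener.accept()
    got = 0
    while True:
        data = conn.recv(65536)
        if not data:
            break
        got += len(data)
    conn.close()
    result.append(got)

srv = socket.socket()
srv.bind(("127.0.0.1", 0))  # loopback demo; bind the guest's IP for real tests
srv.listen(1)
port = srv.getsockname()[1]

result = []
t = threading.Thread(target=receiver, args=(srv, result))
t.start()

cli = socket.create_connection(("127.0.0.1", port))
start = time.time()
sent = 0
while sent < TOTAL:
    cli.sendall(PAYLOAD)
    sent += len(PAYLOAD)
cli.close()
t.join()

elapsed = time.time() - start
print("received %d bytes at %.1f MB/s" % (result[0], result[0] / elapsed / 1e6))
```

On an affected macvtap setup, the inbound direction of this test should show the same collapse to well under 1 MB/s that scp and netcat do, while the outbound direction stays near wire speed.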