Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-30 Thread Richard Davies
Chris Webb wrote:
> I found that on my laptop, the single change of host kernel config
>
> -CONFIG_INTEL_IDLE=y
> +# CONFIG_INTEL_IDLE is not set
>
> is sufficient to turn transfers into guests from slow to full wire speed

I am not deep enough in this code to write a patch, but I wonder if
macvtap_forward in macvtap.c is missing a call to kill_fasync, which I
understand is used to signal to interested processes when data arrives?


Here is the end of macvtap_forward:

  skb_queue_tail(&q->sk.sk_receive_queue, skb);
  wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | POLLRDNORM |
 POLLRDBAND);
  return NET_RX_SUCCESS;


Compared to this end of tun_net_xmit in tun.c:

  /* Enqueue packet */
  skb_queue_tail(&tun->socket.sk->sk_receive_queue, skb);

  /* Notify and wake up reader process */
  if (tun->flags & TUN_FASYNC)
  kill_fasync(&tun->fasync, SIGIO, POLL_IN);
  wake_up_interruptible_poll(&tun->wq.wait, POLLIN |
 POLLRDNORM | POLLRDBAND);
  return NETDEV_TX_OK;


Richard.



Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-30 Thread Michael S. Tsirkin
On Thu, Aug 30, 2012 at 09:20:57AM +0100, Richard Davies wrote:
> Chris Webb wrote:
> > I found that on my laptop, the single change of host kernel config
> >
> > -CONFIG_INTEL_IDLE=y
> > +# CONFIG_INTEL_IDLE is not set
> >
> > is sufficient to turn transfers into guests from slow to full wire speed
> 
> I am not deep enough in this code to write a patch, but I wonder if
> macvtap_forward in macvtap.c is missing a call to kill_fasync, which I
> understand is used to signal to interested processes when data arrives?
> 

No: kill_fasync is only needed if TUN_FASYNC is set, and qemu does not seem to set it.
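
One way to check that, I think, is to look at the tap fd's flags in /proc: if
qemu had asked for SIGIO it would have set O_ASYNC/FASYNC (the 020000 octal
bit) on the descriptor. Something like the following, where <pid> is the qemu
process and <fd> is the tap descriptor (3 in Chris's example):

  # the "flags:" line is the fd's open flags in octal; FASYNC is 020000
  cat /proc/<pid>/fdinfo/<fd>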

> Here is the end of macvtap_forward:
> 
>   skb_queue_tail(&q->sk.sk_receive_queue, skb);
>   wake_up_interruptible_poll(sk_sleep(&q->sk), POLLIN | POLLRDNORM |
>  POLLRDBAND);
>   return NET_RX_SUCCESS;
> 
> 
> Compared to this end of tun_net_xmit in tun.c:
> 
>   /* Enqueue packet */
>   skb_queue_tail(&tun->socket.sk->sk_receive_queue, skb);
> 
>   /* Notify and wake up reader process */
>   if (tun->flags & TUN_FASYNC)
>   kill_fasync(&tun->fasync, SIGIO, POLL_IN);
>   wake_up_interruptible_poll(&tun->wq.wait, POLLIN |
>  POLLRDNORM | POLLRDBAND);
>   return NETDEV_TX_OK;
> 
> 
> Richard.



Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-29 Thread Chris Webb
Chris Webb  writes:

> I'm experiencing a problem with qemu + macvtap which I can reproduce on a
> variety of hardware, with kernels varying from 3.0.4 (the oldest I tried) to
> 3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.
> 
> Large data transfers over TCP into a guest from another machine on the
> network are very slow (often less than 100kB/s) whereas transfers outbound
> from the guest, between two guests on the same host, or between the guest
> and its host run at normal speeds (>= 50MB/s).
> 
> The slow inbound data transfer speeds up substantially when a ping flood is
> aimed either at the host or the guest, or when the qemu process is straced.
> Presumably both of these are ways to wake up something that is otherwise
> sleeping too long?

I thought I'd try bisecting from when macvtap was introduced (2.6.34 where it
presumably worked fine), but in preparing to do that, I stumbled upon a way to
change the behaviour from slow to fast with different kernel .configs. Pinning
it down specifically, I found that on my laptop, the single change of host
kernel config

-CONFIG_INTEL_IDLE=y
+# CONFIG_INTEL_IDLE is not set

is sufficient to turn transfers into guests from slow to full wire speed.
The .configs of the 'slow' and 'fast' host kernels are respectively at

  http://cdw.me.uk/tmp/goingslow.config
  http://cdw.me.uk/tmp/goingfast.config

Our big servers that show the symptoms are Opteron 6128 boxes, and (perhaps
unsurprisingly) aren't affected by CONFIG_INTEL_IDLE. In fact, turning off the
whole of the CPU idle infrastructure as below didn't have any effect: transfers
into the guest remained slow.

@@ -441,10 +441,8 @@ CONFIG_ACPI=y
 # CONFIG_ACPI_BUTTON is not set
 CONFIG_ACPI_FAN=y
 CONFIG_ACPI_DOCK=y
-CONFIG_ACPI_PROCESSOR=y
+# CONFIG_ACPI_PROCESSOR is not set
 CONFIG_ACPI_IPMI=y
-CONFIG_ACPI_PROCESSOR_AGGREGATOR=y
-CONFIG_ACPI_THERMAL=y
 CONFIG_ACPI_NUMA=y
 # CONFIG_ACPI_CUSTOM_DSDT is not set
 CONFIG_ACPI_BLACKLIST_YEAR=0
@@ -463,16 +461,12 @@ CONFIG_SFI=y
 # CPU Frequency scaling
 #
 # CONFIG_CPU_FREQ is not set
-CONFIG_CPU_IDLE=y
-CONFIG_CPU_IDLE_GOV_LADDER=y
-CONFIG_CPU_IDLE_GOV_MENU=y
-CONFIG_INTEL_IDLE=y
+# CONFIG_CPU_IDLE is not set
 
 #
 # Memory power savings
 #
-CONFIG_I7300_IDLE_IOAT_CHANNEL=y
-CONFIG_I7300_IDLE=y
+# CONFIG_I7300_IDLE is not set
 
 #
 # Bus options (PCI etc.)


(From earlier in the thread, the full unmodified kernel config for one of these
boxes is at

  http://cdw.me.uk/tmp/server-config.txt

and I've tested equivalent configs on kernels from 3.0.x to the head of
Linus' git tree.)

Does the CONFIG_INTEL_IDLE result suggest what the underlying problem might be?
Out of interest, what is the impact of disabling it on Intel machines?
Presumably in that case we fall back to ACPI for processor idle?

Also, is there an equivalent for AMD machines which we could disable on our
large Opteron boxes, given that even removing the whole CONFIG_CPU_IDLE
machinery didn't change anything there in the way that disabling
CONFIG_INTEL_IDLE did on the Intel box?
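
For what it's worth, I believe the active cpuidle driver can be inspected at
runtime without rebuilding, via the cpuidle sysfs interface. I'm assuming the
standard sysfs paths and boot parameters below are the right knobs, so
corrections welcome:

  # which cpuidle driver and governor the running kernel is using
  cat /sys/devices/system/cpu/cpuidle/current_driver
  cat /sys/devices/system/cpu/cpuidle/current_governor_ro

  # the C-states that driver exposes on cpu0
  grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name

  # boot-time alternatives to rebuilding the kernel:
  #   intel_idle.max_cstate=0   disable intel_idle and fall back to acpi_idle
  #   processor.max_cstate=1    cap the ACPI idle driver at C1

If capping C-states at boot on the Opterons were to have the same effect as
the CONFIG_INTEL_IDLE change did on the laptop, that would at least point
towards deep idle-state wakeup latency rather than anything macvtap-specific.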

Best wishes,

Chris.



Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-16 Thread Michael S. Tsirkin
On Thu, Aug 16, 2012 at 03:27:57PM +0100, Chris Webb wrote:
> "Michael S. Tsirkin"  writes:
> 
> > On Thu, Aug 16, 2012 at 10:20:05AM +0100, Chris Webb wrote:
> >
> > > For example, I can run
> > > 
> > >   ip addr add 192.168.1.2/24 dev eth0
> > >   ip link set eth0 up
> > >   ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
> > > type macvtap mode bridge
> > >   ip link set tap0 up
> > >   qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> > > -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> > > -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)
> > > 
> > > on one physical host which is otherwise completely idle. From a second
> > > physical host on the same network, I then scp a large (say 50MB) file onto
> > > the new guest. On a gigabit LAN, speeds consistently drop to less than
> > > 100kB/s as the transfer progresses, within a second of starting.
> 
> > Thanks for the report.
> > I'll try to reproduce this early next week.
> > Meanwhile a question - do you still observe this behaviour if you enable
> > vhost-net?
> 
> I haven't tried running with vhost-net before. Is it sufficient to compile
> the host kernel with CONFIG_VHOST_NET=y and boot the guest with the following?
> 
>   qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> -net tap,fd=3,vhost=on,vhostfd=4 \
> 3<>/dev/tap$(< /sys/class/net/tap0/ifindex) 4<>/dev/vhost-net
> 
> If so, then I'm afraid this doesn't make any difference: it still stalls
> and drops right down in speed.
> 
> The reason I'm unsure whether vhost-net is actually in use is that with both
> vhost=off and vhost=on I see an identical virtio feature set within the
> guest:
> 
>   # cat /sys/bus/virtio/devices/virtio0/features 
>   01111000

Yes that is expected.

> However, without the 4<>/dev/vhost-net redirection, or with 4<>/dev/null in
> its place, qemu fails to start at all when given vhost=on,vhostfd=4, so
> perhaps it is working after all?
> 
> Cheers,
> 
> Chris.



Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-16 Thread Chris Webb
"Michael S. Tsirkin"  writes:

> On Thu, Aug 16, 2012 at 10:20:05AM +0100, Chris Webb wrote:
>
> > For example, I can run
> > 
> >   ip addr add 192.168.1.2/24 dev eth0
> >   ip link set eth0 up
> >   ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
> > type macvtap mode bridge
> >   ip link set tap0 up
> >   qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> > -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> > -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)
> > 
> > on one physical host which is otherwise completely idle. From a second
> > physical host on the same network, I then scp a large (say 50MB) file onto
> > the new guest. On a gigabit LAN, speeds consistently drop to less than
> > 100kB/s as the transfer progresses, within a second of starting.

> Thanks for the report.
> I'll try to reproduce this early next week.
> Meanwhile a question - do you still observe this behaviour if you enable
> vhost-net?

I haven't tried running with vhost-net before. Is it sufficient to compile
the host kernel with CONFIG_VHOST_NET=y and boot the guest with the following?

  qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
-net nic,model=virtio,macaddr=02:02:02:02:02:02 \
-net tap,fd=3,vhost=on,vhostfd=4 \
3<>/dev/tap$(< /sys/class/net/tap0/ifindex) 4<>/dev/vhost-net

If so, then I'm afraid this doesn't make any difference: it still stalls
and drops right down in speed.

The reason I'm unsure whether vhost-net is actually in use is that with both
vhost=off and vhost=on I see an identical virtio feature set within the
guest:

  # cat /sys/bus/virtio/devices/virtio0/features 
  01111000

However, without the 4<>/dev/vhost-net redirection, or with 4<>/dev/null in
its place, qemu fails to start at all when given vhost=on,vhostfd=4, so
perhaps it is working after all?
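
A crude check I can think of, assuming the vhost worker thread naming and the
/proc/<pid>/fd symlinks are reliable indicators here, would be:

  # a kernel thread named vhost-<qemu pid> should appear once vhost=on is active
  ps -e -o pid,comm | grep vhost-

  # and the qemu process should be holding /dev/vhost-net open on fd 4
  ls -l /proc/$(pidof qemu-kvm)/fd | grep vhost-net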

Cheers,

Chris.



Re: [Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-16 Thread Michael S. Tsirkin
On Thu, Aug 16, 2012 at 10:20:05AM +0100, Chris Webb wrote:
> I'm experiencing a problem with qemu + macvtap which I can reproduce on a
> variety of hardware, with kernels varying from 3.0.4 (the oldest I tried) to
> 3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.
> 
> Large data transfers over TCP into a guest from another machine on the
> network are very slow (often less than 100kB/s) whereas transfers outbound
> from the guest, between two guests on the same host, or between the guest
> and its host run at normal speeds (>= 50MB/s).
> 
> The slow inbound data transfer speeds up substantially when a ping flood is
> aimed either at the host or the guest, or when the qemu process is straced.
> Presumably both of these are ways to wake up something that is otherwise
> sleeping too long?
> 
> For example, I can run
> 
>   ip addr add 192.168.1.2/24 dev eth0
>   ip link set eth0 up
>   ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
> type macvtap mode bridge
>   ip link set tap0 up
>   qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)
> 
> on one physical host which is otherwise completely idle. From a second
> physical host on the same network, I then scp a large (say 50MB) file onto
> the new guest. On a gigabit LAN, speeds consistently drop to less than
> 100kB/s as the transfer progresses, within a second of starting.
> 
> The choice of virtio virtual nic in the above isn't significant: the same
> thing happens with e1000 or rtl8139. You can also replace the scp with a
> straight netcat and see the same effect.
> 
> Doing the transfer in the other direction (i.e. copying a large file from the
> guest to an external host) achieves 50MB/s or faster as expected. Copying
> between two guests on the same host (i.e. taking advantage of the 'mode
> bridge') is also fast.
> 
> If I create a macvlan device attached to eth0 and move the host IP address to
> that, I can communicate between the host itself and the guest because of the
> 'mode bridge'. Again, this case is fast in both directions.
> 
> Using a bridge and a standard tap interface, transfers in and out are fast
> too:
> 
>   ip tuntap add tap0 mode tap
>   brctl addbr br0
>   brctl addif br0 eth0
>   brctl addif br0 tap0
>   ip link set eth0 up
>   ip link set tap0 up
>   ip link set br0 up
>   ip addr add 192.168.1.2/24 dev br0
>   qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
> -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
> -net tap,script=no,downscript=no,ifname=tap0
> 
> As mentioned in the summary at the beginning of this report, when I strace a
> guest in the original configuration which is receiving data slowly, the data
> rate improves from less than 100kB/s to around 3.1MB/s. Similarly, if I ping
> flood either the guest or the host it is running on from another machine on
> the network, the transfer rate improves to around 1.1MB/s. This seems quite
> suggestive of a problem with delayed wake-up of the guest.
> 
> Two reasonably up-to-date examples of machines I've reproduced this on are
> my laptop with an r8169 gigabit ethernet card, Debian qemu-kvm 1.0 and
> upstream 3.4.8 kernel whose .config and boot dmesg are at
> 
>   http://cdw.me.uk/tmp/laptop-config.txt
>   http://cdw.me.uk/tmp/laptop-dmesg.txt
> 
> and one of our large servers with an igb gigabit ethernet card, upstream
> qemu-kvm 1.1.1 and upstream 3.5.1 linux:
> 
>   http://cdw.me.uk/tmp/server-config.txt
>   http://cdw.me.uk/tmp/server-dmesg.txt
> 
> For completeness, I've put the Debian 6 image I've been using for testing
> at
> 
>   http://cdw.me.uk/tmp/test-debian.img.xz
> 
> though I've seen the same problem from a variety of guest operating systems.
> (In fact, I've not yet found any combination of host kernel, guest OS and
> hardware which doesn't show these symptoms, so it seems to be very easy to
> reproduce.)
> 
> Cheers,
> 
> Chris.

Thanks for the report.
I'll try to reproduce this early next week.
Meanwhile a question - do you still observe this behaviour if you enable
vhost-net?

Thanks,

-- 
MST



[Qemu-devel] Slow inbound traffic on macvtap interfaces

2012-08-16 Thread Chris Webb
I'm experiencing a problem with qemu + macvtap which I can reproduce on a
variety of hardware, with kernels varying from 3.0.4 (the oldest I tried) to
3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.

Large data transfers over TCP into a guest from another machine on the
network are very slow (often less than 100kB/s) whereas transfers outbound
from the guest, between two guests on the same host, or between the guest
and its host run at normal speeds (>= 50MB/s).

The slow inbound data transfer speeds up substantially when a ping flood is
aimed either at the host or the guest, or when the qemu process is straced.
Presumably both of these are ways to wake up something that is otherwise
sleeping too long?

For example, I can run

  ip addr add 192.168.1.2/24 dev eth0
  ip link set eth0 up
  ip link add link eth0 name tap0 address 02:02:02:02:02:02 \
type macvtap mode bridge
  ip link set tap0 up
  qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
-net nic,model=virtio,macaddr=02:02:02:02:02:02 \
-net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)

on one physical host which is otherwise completely idle. From a second
physical host on the same network, I then scp a large (say 50MB) file onto
the new guest. On a gigabit LAN, speeds consistently drop to less than
100kB/s as the transfer progresses, within a second of starting.

The choice of virtio virtual nic in the above isn't significant: the same thing
happens with e1000 or rtl8139. You can also replace the scp with a straight
netcat and see the same effect.
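
For reference, the netcat equivalent I mean is just something like the
following, where the port and guest address are placeholders and the exact
flags depend on the netcat variant:

  # inside the guest:
  nc -l -p 1234 > /dev/null     # some netcat variants want just "nc -l 1234"

  # on the external host, where <guest-ip> is the guest's address:
  nc <guest-ip> 1234 < large-file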

Doing the transfer in the other direction (i.e. copying a large file from the
guest to an external host) achieves 50MB/s or faster as expected. Copying
between two guests on the same host (i.e. taking advantage of the 'mode
bridge') is also fast.

If I create a macvlan device attached to eth0 and move the host IP address to
that, I can communicate between the host itself and the guest because of the
'mode bridge'. Again, this case is fast in both directions.
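
Roughly, the macvlan setup I mean there is something like the following; the
interface name and MAC address are arbitrary choices for illustration:

  ip link add link eth0 name host0 address 02:01:01:01:01:01 type macvlan \
mode bridge
  ip addr del 192.168.1.2/24 dev eth0
  ip addr add 192.168.1.2/24 dev host0
  ip link set host0 up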

Using a bridge and a standard tap interface, transfers in and out are fast
too:

  ip tuntap add tap0 mode tap
  brctl addbr br0
  brctl addif br0 eth0
  brctl addif br0 tap0
  ip link set eth0 up
  ip link set tap0 up
  ip link set br0 up
  ip addr add 192.168.1.2/24 dev br0
  qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
-net nic,model=virtio,macaddr=02:02:02:02:02:02 \
-net tap,script=no,downscript=no,ifname=tap0

As mentioned in the summary at the beginning of this report, when I strace a
guest in the original configuration which is receiving data slowly, the data
rate improves from less than 100kB/s to around 3.1MB/s. Similarly, if I ping
flood either the guest or the host it is running on from another machine on
the network, the transfer rate improves to around 1.1MB/s. This seems quite
suggestive of a problem with delayed wake-up of the guest.
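
Concretely, the sort of commands I mean are along these lines, with addresses
as in the example above (flood ping needs root):

  # from another machine on the LAN: flood ping the host (or the guest)
  ping -f 192.168.1.2

  # or, on the host itself: attach strace to the running qemu process
  strace -f -p $(pidof qemu-kvm) -o /dev/null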

Two reasonably up-to-date examples of machines I've reproduced this on are
my laptop with an r8169 gigabit ethernet card, Debian qemu-kvm 1.0 and
upstream 3.4.8 kernel whose .config and boot dmesg are at

  http://cdw.me.uk/tmp/laptop-config.txt
  http://cdw.me.uk/tmp/laptop-dmesg.txt

and one of our large servers with an igb gigabit ethernet card, upstream
qemu-kvm 1.1.1 and upstream 3.5.1 linux:

  http://cdw.me.uk/tmp/server-config.txt
  http://cdw.me.uk/tmp/server-dmesg.txt

For completeness, I've put the Debian 6 image I've been using for testing
at

  http://cdw.me.uk/tmp/test-debian.img.xz

though I've seen the same problem from a variety of guest operating systems.
(In fact, I've not yet found any combination of host kernel, guest OS and
hardware which doesn't show these symptoms, so it seems to be very easy to
reproduce.)

Cheers,

Chris.