Re: [Qemu-devel] RFC: handling backend too fast in virtio-net

2013-02-18 Thread Stefan Hajnoczi
On Fri, Feb 15, 2013 at 11:24:29AM +0100, Stefan Hajnoczi wrote:
 On Thu, Feb 14, 2013 at 07:21:57PM +0100, Luigi Rizzo wrote:
 
 CCed Michael Tsirkin
 
  virtio-style network devices (where the producer and consumer chase
  each other through a shared memory block) can enter into a
  bad operating regime when the consumer is too fast.
  
  I am hitting this case right now when virtio is used on top of the
  netmap/VALE backend that I posted a few weeks ago: what happens is that
  the backend is so fast that the io thread keeps re-enabling notifications
  every few packets.  This results, on my test machine, in a throughput of
  250-300Kpps (and extremely unstable, oscillating between 200 and 600Kpps).
  
  I'd like to get some feedback on the following trivial patch to have
  the thread keep spinning for a bounded amount of time when the producer
  is slower than the consumer. This gives a relatively stable throughput
  between 700 and 800 Kpps (we have something similar in our paravirtualized
  e1000 driver, which is slightly faster at 900-1100 Kpps).
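
To make the failure mode concrete, the host-side notify dance looks
roughly like this (a schematic sketch, not the actual hw/virtio-net.c
code; flush_tx_ring(), ring_has_work() and the notification helpers
are illustrative stand-ins):

    /* Drain the tx ring, then re-enable guest notifications and
     * re-check to close the race with the producer.  When the backend
     * drains faster than the guest refills, the enable/re-check/disable
     * cycle repeats every few packets -- the regime described above. */
    int  flush_tx_ring(void);              /* illustrative stand-ins */
    int  ring_has_work(void);
    void enable_notifications(void);
    void disable_notifications(void);

    void tx_drain_loop(int tx_burst)
    {
        for (;;) {
            int sent = flush_tx_ring();
            if (sent >= tx_burst) {
                continue;                  /* full burst: keep draining */
            }
            enable_notifications();        /* ask the guest to kick us */
            if (!ring_has_work()) {
                break;                     /* idle: wait for the next kick */
            }
            disable_notifications();       /* raced with the producer */
        }
    }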
 
 Did you experiment with tx timer instead of bh?  It seems that
 hw/virtio-net.c has two tx mitigation strategies - the bh approach that
 you've tweaked and a true timer.
 
 It seems you don't really want tx batching but you do want to avoid
 guest-host notifications?

One more thing I forgot: virtio-net does not use ioeventfd by default.
ioeventfd changes the cost of guest-host notifications because the
notification becomes an eventfd signal inside the kernel and kvm.ko then
re-enters the guest.

This means a guest-host notification becomes a light-weight exit and we
don't return from ioctl(KVM_RUN).
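
For reference, the plumbing is roughly the following (a minimal sketch
against the KVM ioctl interface; vm_fd, notify_addr and the 16-bit PIO
length are assumptions about the legacy virtio-pci layout, not details
taken from this thread):

    /* Register an eventfd with kvm.ko so that the guest's write to the
     * virtio notify register is completed inside the kernel (signal the
     * eventfd, re-enter the guest) instead of returning from
     * ioctl(KVM_RUN) to QEMU. */
    #include <stdint.h>
    #include <sys/eventfd.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int wire_up_doorbell(int vm_fd, uint64_t notify_addr)
    {
        int kick_fd = eventfd(0, EFD_NONBLOCK);
        struct kvm_ioeventfd kick = {
            .addr  = notify_addr,       /* virtio queue notify register */
            .len   = 2,                 /* legacy notify is a 16-bit write */
            .fd    = kick_fd,
            .flags = KVM_IOEVENTFD_FLAG_PIO,
        };

        if (kick_fd < 0 || ioctl(vm_fd, KVM_IOEVENTFD, &kick) < 0) {
            return -1;
        }
        return kick_fd;                 /* the iothread then polls this fd */
    }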

Perhaps -device virtio-blk-pci,ioeventfd=on will give similar results to
your patch?

Stefan



Re: [Qemu-devel] RFC: handling backend too fast in virtio-net

2013-02-18 Thread Luigi Rizzo
On Mon, Feb 18, 2013 at 1:50 AM, Stefan Hajnoczi stefa...@gmail.com wrote:

 On Fri, Feb 15, 2013 at 11:24:29AM +0100, Stefan Hajnoczi wrote:
  On Thu, Feb 14, 2013 at 07:21:57PM +0100, Luigi Rizzo wrote:
 
  CCed Michael Tsirkin
 
   virtio-style network devices (where the producer and consumer chase
   each other through a shared memory block) can enter into a
   bad operating regime when the consumer is too fast.
  
   I am hitting this case right now when virtio is used on top of the
   netmap/VALE backend that I posted a few weeks ago: what happens is that
   the backend is so fast that the io thread keeps re-enabling
 notifications
   every few packets.  This results, on my test machine, in a throughput
 of
   250-300Kpps (and extremely unstable, oscillating between 200 and
 600Kpps).
  
   I'd like to get some feedback on the following trivial patch to have
   the thread keep spinning for a bounded amount of time when the producer
   is slower than the consumer. This gives a relatively stable throughput
   between 700 and 800 Kpps (we have something similar in our
 paravirtualized
   e1000 driver, which is slightly faster at 900-1100 Kpps).
 
  Did you experiment with tx timer instead of bh?  It seems that
  hw/virtio-net.c has two tx mitigation strategies - the bh approach that
  you've tweaked and a true timer.
 
  It seems you don't really want tx batching but you do want to avoid
  guest-host notifications?

 One more thing I forgot: virtio-net does not use ioeventfd by default.
 ioeventfd changes the cost of guest-host notifications because the
 notification becomes an eventfd signal inside the kernel and kvm.ko then
 re-enters the guest.

 This means a guest-host notification becomes a light-weight exit and we
 don't return from ioctl(KVM_RUN).

 Perhaps -device virtio-blk-pci,ioeventfd=on will give similar results to
 your patch?


Is ioeventfd the mechanism used by vhost-net?
If so, Giuseppe Lettieri (in Cc) has tried that with
a modified netmap backend and experienced the same
problem -- making the io thread (user or kernel)
spin a bit more is of great benefit to the throughput.

cheers
luigi


Re: [Qemu-devel] RFC: handling backend too fast in virtio-net

2013-02-18 Thread Stefan Hajnoczi
On Mon, Feb 18, 2013 at 02:12:13AM -0800, Luigi Rizzo wrote:
 On Mon, Feb 18, 2013 at 1:50 AM, Stefan Hajnoczi stefa...@gmail.com wrote:
 
  On Fri, Feb 15, 2013 at 11:24:29AM +0100, Stefan Hajnoczi wrote:
   On Thu, Feb 14, 2013 at 07:21:57PM +0100, Luigi Rizzo wrote:
  
   CCed Michael Tsirkin
  
virtio-style network devices (where the producer and consumer chase
each other through a shared memory block) can enter into a
bad operating regime when the consumer is too fast.
   
I am hitting this case right now when virtio is used on top of the
netmap/VALE backend that I posted a few weeks ago: what happens is that
the backend is so fast that the io thread keeps re-enabling
  notifications
every few packets.  This results, on my test machine, in a throughput
  of
250-300Kpps (and extremely unstable, oscillating between 200 and
  600Kpps).
   
I'd like to get some feedback on the following trivial patch to have
the thread keep spinning for a bounded amount of time when the producer
is slower than the consumer. This gives a relatively stable throughput
between 700 and 800 Kpps (we have something similar in our
  paravirtualized
e1000 driver, which is slightly faster at 900-1100 Kpps).
  
   Did you experiment with tx timer instead of bh?  It seems that
   hw/virtio-net.c has two tx mitigation strategies - the bh approach that
   you've tweaked and a true timer.
  
   It seems you don't really want tx batching but you do want to avoid
   guest-host notifications?
 
  One more thing I forgot: virtio-net does not use ioeventfd by default.
  ioeventfd changes the cost of guest-host notifications because the
  notification becomes an eventfd signal inside the kernel and kvm.ko then
  re-enters the guest.
 
  This means a guest-host notification becomes a light-weight exit and we
  don't return from ioctl(KVM_RUN).
 
  Perhaps -device virtio-blk-pci,ioeventfd=on will give similar results to
  your patch?
 
 
 Is ioeventfd the mechanism used by vhost-net?
 If so, Giuseppe Lettieri (in Cc) has tried that with
 a modified netmap backend and experienced the same
 problem -- making the io thread (user or kernel)
 spin a bit more is of great benefit to the throughput.

Yes, ioeventfd is how the vhost_net kernel module is notified by the
guest.

It's easy to test: just use -device virtio-net-pci,ioeventfd=on,...
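
For example (illustrative invocation only, with a plain tap backend as
a placeholder; substitute the netmap/VALE netdev and the rest of the
usual options):

    qemu-system-x86_64 -enable-kvm -m 1024 \
        -netdev tap,id=net0,vhost=off \
        -device virtio-net-pci,netdev=net0,ioeventfd=on \
        ...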

Stefan



Re: [Qemu-devel] RFC: handling backend too fast in virtio-net

2013-02-15 Thread Stefan Hajnoczi
On Thu, Feb 14, 2013 at 07:21:57PM +0100, Luigi Rizzo wrote:

CCed Michael Tsirkin

 virtio-style network devices (where the producer and consumer chase
 each other through a shared memory block) can enter into a
 bad operating regime when the consumer is too fast.
 
 I am hitting this case right now when virtio is used on top of the
 netmap/VALE backend that I posted a few weeks ago: what happens is that
 the backend is so fast that the io thread keeps re-enabling notifications
 every few packets.  This results, on my test machine, in a throughput of
 250-300Kpps (and extremely unstable, oscillating between 200 and 600Kpps).
 
 I'd like to get some feedback on the following trivial patch to have
 the thread keep spinning for a bounded amount of time when the producer
 is slower than the consumer. This gives a relatively stable throughput
 between 700 and 800 Kpps (we have something similar in our paravirtualized
 e1000 driver, which is slightly faster at 900-1100 Kpps).

Did you experiment with tx timer instead of bh?  It seems that
hw/virtio-net.c has two tx mitigation strategies - the bh approach that
you've tweaked and a true timer.

It seems you don't really want tx batching but you do want to avoid
guest-host notifications?

 EXISTING LOGIC: reschedule the bh when at least a burst of packets has
been received. Otherwise enable notifications and do another
race-prevention check.
 
 NEW LOGIC: when the bh is scheduled the first time, establish a
budget (currently 20) of reschedule attempts.
Every time virtio_net_flush_tx() returns 0 packets [this could 
be changed to 'less than a full burst'] the counter is increased.
The bh is rescheduled until the counter reaches the budget,
at which point we re-enable notifications as above.
 
 I am not 100% happy with this patch because there is a magic
 constant (the maximum number of retries) which should really be
 adaptive, or perhaps we should even bound the amount of time the
 bh runs, rather than the number of retries.
 In practice, having the thread spin (or sleep) for 10-20 us
 even without new packets is probably preferable to
 re-enabling notifications and requesting a kick.
 
 opinions ?
 
 cheers
 luigi
 
 diff --git a/hw/virtio-net.c b/hw/virtio-net.c
 index 573c669..5389088 100644
 --- a/hw/virtio-net.c
 +++ b/hw/virtio-net.c
 @@ -49,6 +51,7 @@ typedef struct VirtIONet
      NICState *nic;
      uint32_t tx_timeout;
      int32_t tx_burst;
 +    int32_t tx_retries; /* keep spinning a bit on tx */
      uint32_t has_vnet_hdr;
      size_t host_hdr_len;
      size_t guest_hdr_len;
 @@ -1062,7 +1065,9 @@ static void virtio_net_tx_bh(void *opaque)
 
      /* If we flush a full burst of packets, assume there are
       * more coming and immediately reschedule */
 -    if (ret >= n->tx_burst) {
 +    if (ret == 0)
 +        n->tx_retries++;
 +    if (n->tx_retries < 20) {
          qemu_bh_schedule(q->tx_bh);
          q->tx_waiting = 1;
          return;
 @@ -1076,6 +1082,8 @@ static void virtio_net_tx_bh(void *opaque)
          virtio_queue_set_notification(q->tx_vq, 0);
          qemu_bh_schedule(q->tx_bh);
          q->tx_waiting = 1;
 +    } else {
 +        n->tx_retries = 0;
      }
  }
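
A sketch of the time-bounded variant suggested above (bound how long
the bh keeps polling instead of counting empty flushes; flush_tx() and
enable_notifications() are illustrative stand-ins, not the virtio-net
functions, and the budget is taken from the 10-20 us figure quoted
above):

    #include <stdint.h>
    #include <time.h>

    #define TX_BUDGET_NS (20 * 1000)    /* ~20 us */

    int  flush_tx(void);                /* illustrative stand-ins */
    void enable_notifications(void);

    static uint64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
    }

    /* Poll the tx ring for at most TX_BUDGET_NS, then fall back to
     * notifications and wait for the next guest kick. */
    static void tx_poll_bounded(void)
    {
        uint64_t deadline = now_ns() + TX_BUDGET_NS;

        do {
            flush_tx();
        } while (now_ns() < deadline);
        enable_notifications();
    }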
 
 
 



Re: [Qemu-devel] RFC: handling backend too fast in virtio-net

2013-02-15 Thread Luigi Rizzo
On Fri, Feb 15, 2013 at 11:24:29AM +0100, Stefan Hajnoczi wrote:
 On Thu, Feb 14, 2013 at 07:21:57PM +0100, Luigi Rizzo wrote:
 
 CCed Michael Tsirkin
 
  virtio-style network devices (where the producer and consumer chase
  each other through a shared memory block) can enter into a
  bad operating regime when the consumer is too fast.
  
  I am hitting this case right now when virtio is used on top of the
  netmap/VALE backend that I posted a few weeks ago: what happens is that
  the backend is so fast that the io thread keeps re-enabling notifications
  every few packets.  This results, on my test machine, in a throughput of
  250-300Kpps (and extremely unstable, oscillating between 200 and 600Kpps).
  
  I'd like to get some feedback on the following trivial patch to have
  the thread keep spinning for a bounded amount of time when the producer
  is slower than the consumer. This gives a relatively stable throughput
  between 700 and 800 Kpps (we have something similar in our paravirtualized
  e1000 driver, which is slightly faster at 900-1100 Kpps).
 
 Did you experiment with tx timer instead of bh?  It seems that
 hw/virtio-net.c has two tx mitigation strategies - the bh approach that
 you've tweaked and a true timer.
 
 It seems you don't really want tx batching but you do want to avoid
 guest-host notifications?

I have just tried:
  bh (with my tweaks):        700-800 Kpps (large variance)
  timer (150 us, 256 slots):  ~490 Kpps (very stable)

As expected, the throughput in the timer version seems proportional
to ring_size/timeout: e.g. 1 ms and 256 slots give approx 210 Kpps,
and 1 ms and 128 slots give 108 Kpps, but it saturates around 490 Kpps.
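
As a quick sanity check on the proportionality claim (using only the
numbers reported above):

    rate ≈ ring_size / timeout
    256 slots / 1 ms    ≈ 256 Kpps   (measured ~210 Kpps)
    128 slots / 1 ms    ≈ 128 Kpps   (measured ~108 Kpps)
    256 slots / 150 us  ≈ 1.7 Mpps   (measured ~490 Kpps, so here the
                                      timer is no longer the bottleneck)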

cheers
luigi

  EXISTING LOGIC: reschedule the bh when at least a burst of packets has
 been received. Otherwise enable notifications and do another
 race-prevention check.
  
  NEW LOGIC: when the bh is scheduled the first time, establish a
 budget (currently 20) of reschedule attempts.
 Every time virtio_net_flush_tx() returns 0 packets [this could 
 be changed to 'less than a full burst'] the counter is increased.
 The bh is rescheduled until the counter reaches the budget,
 at which point we re-enable notifications as above.
  
  I am not 100% happy with this patch because there is a magic
  constant (the maximum number of retries) which should really be
  adaptive, or perhaps we should even bound the amount of time the
  bh runs, rather than the number of retries.
  In practice, having the thread spin (or sleep) for 10-20 us
  even without new packets is probably preferable to
  re-enabling notifications and requesting a kick.
  
  opinions ?
  
  cheers
  luigi
  
  diff --git a/hw/virtio-net.c b/hw/virtio-net.c
  index 573c669..5389088 100644
  --- a/hw/virtio-net.c
  +++ b/hw/virtio-net.c
  @@ -49,6 +51,7 @@ typedef struct VirtIONet
       NICState *nic;
       uint32_t tx_timeout;
       int32_t tx_burst;
  +    int32_t tx_retries; /* keep spinning a bit on tx */
       uint32_t has_vnet_hdr;
       size_t host_hdr_len;
       size_t guest_hdr_len;
  @@ -1062,7 +1065,9 @@ static void virtio_net_tx_bh(void *opaque)
  
       /* If we flush a full burst of packets, assume there are
        * more coming and immediately reschedule */
  -    if (ret >= n->tx_burst) {
  +    if (ret == 0)
  +        n->tx_retries++;
  +    if (n->tx_retries < 20) {
           qemu_bh_schedule(q->tx_bh);
           q->tx_waiting = 1;
           return;
  @@ -1076,6 +1082,8 @@ static void virtio_net_tx_bh(void *opaque)
           virtio_queue_set_notification(q->tx_vq, 0);
           qemu_bh_schedule(q->tx_bh);
           q->tx_waiting = 1;
  +    } else {
  +        n->tx_retries = 0;
       }
   }