Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-30 Thread Rusty Russell
On Sat, 28 May 2011 23:02:04 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 On Thu, May 26, 2011 at 12:58:23PM +0930, Rusty Russell wrote:
  ie. free two packets for every one we're about to add.  For steady state
  that would work really well.
 
 Sure, with indirect buffers, but if we
 don't use indirect (and we discussed switching indirect off
 dynamically in the past) this becomes harder to
 be sure about. I think I understand why but
 does not a simple capacity check make it more obvious?

...

   Then we hit the case where the ring
  seems full after we do the add: at that point, screw latency, and just
  try to free all the buffers we can.
 
 I see. But the code currently does this:
 
   for(..)
   get_buf
   add_buf
  if (capacity < max_sk_frags+2) {
   if (!enable_cb)
   for(..)
   get_buf
   }
 
 
 In other words, the second get_buf is only called
 in the unlikely case of a race condition.
 
 So we'll need to add *another* call to get_buf.
 Is it just me or is this becoming messy?

Yes, good point.  I really wonder if anyone would be able to measure the
difference between simply freeing 2 every time (with possible extra
stalls for strange cases) and the more complete version.

But it runs against my grain to implement heuristics when one more call
would make it provably reliable.

Please find a way to make that for loop less ugly though!

Thanks,
Rusty.


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-27 Thread Rusty Russell
On Wed, 25 May 2011 09:07:59 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 On Wed, May 25, 2011 at 11:05:04AM +0930, Rusty Russell wrote:
 Hmm I'm not sure I got it, need to think about this.
 I'd like to go back and document how my design was supposed to work.
 This really should have been in commit log or even a comment.
 I thought we need a min, not a max.
 We start with this:
 
 	while ((c = (virtqueue_get_capacity(vq) < 2 + MAX_SKB_FRAGS)) &&
 	       (skb = get_buf))
 		kfree_skb(skb);
 	return !c;
 
 This is clean and simple, right? And it's exactly asking for what we need.

No, I started from the other direction:

for (i = 0; i < 2; i++) {
	skb = get_buf();
	if (!skb)
		break;
	kfree_skb(skb);
}

ie. free two packets for every one we're about to add.  For steady state
that would work really well.  Then we hit the case where the ring seems
full after we do the add: at that point, screw latency, and just try to
free all the buffers we can.
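As a rough sketch (reusing the driver's existing helpers; stats updates
omitted, and the two helper names below are purely illustrative):

/* Fast path: reclaim at most two completed skbs per packet sent. */
static void free_two_old_xmit_skbs(struct virtnet_info *vi)
{
	struct sk_buff *skb;
	unsigned int len;
	int i;

	for (i = 0; i < 2; i++) {
		skb = virtqueue_get_buf(vi->svq, &len);
		if (!skb)
			break;
		dev_kfree_skb_any(skb);
	}
}

/* Slow path: the ring looked full after add_buf, so forget about
 * latency and drain everything the host has finished with. */
static void free_all_old_xmit_skbs(struct virtnet_info *vi)
{
	struct sk_buff *skb;
	unsigned int len;

	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL)
		dev_kfree_skb_any(skb);
}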

 on the normal path min == 2, so we're low latency but we keep ahead on
 average. min == 0 for the "we're out of capacity, we may have to stop
 the queue" case.
 
 Does the above make sense at all?

It makes sense, but I think it's a classic case where incremental
improvements aren't as good as starting from scratch.

Cheers,
Rusty.


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-25 Thread Michael S. Tsirkin
On Wed, May 25, 2011 at 11:05:04AM +0930, Rusty Russell wrote:
 On Mon, 23 May 2011 14:19:00 +0300, Michael S. Tsirkin m...@redhat.com 
 wrote:
  I do understand how it seems a waste to leave direct space
  in the ring while we might in practice have space
  due to indirect. Didn't come up with a nice way to
  solve this yet - but 'no worse than now :)'
 
 Let's just make it bool free_old_xmit_skbs(unsigned int max).  max ==
 2 for the normal xmit path, so we're low latency but we keep ahead on
 average.  max == -1 for the "we're out of capacity, we may have to stop
 the queue" case.
 
 That keeps it simple and probably the right thing...
 
 Thanks,
 Rusty.

Hmm I'm not sure I got it, need to think about this.
I'd like to go back and document how my design was supposed to work.
This really should have been in the commit log or even a comment.
I thought we need a min, not a max.
We start with this:

	while ((c = (virtqueue_get_capacity(vq) < 2 + MAX_SKB_FRAGS)) &&
	       (skb = get_buf))
		kfree_skb(skb);
	return !c;

This is clean and simple, right? And it's exactly asking for what we need.

But this way we always keep a lot of memory in skbs even when the rate of
communication is low.

So we add the min parameter:

	int n = 0;

	while (((c = (virtqueue_get_capacity(vq) < 2 + MAX_SKB_FRAGS)) ||
		n++ < min) && (skb = get_buf))
		kfree_skb(skb);
	return !c;


on the normal path min == 2, so we're low latency but we keep ahead on
average. min == 0 for the "we're out of capacity, we may have to stop
the queue" case.
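
Fleshed out against the driver it would look something like this (only a
sketch: stats updates copied from the existing free_old_xmit_skbs(), and
the bool return means a worst-case skb now fits):

static bool free_old_xmit_skbs(struct virtnet_info *vi, int min)
{
	struct sk_buff *skb;
	unsigned int len;
	bool tight;
	int n = 0;

	/* Free until a worst-case skb fits, but always free at least
	 * 'min' skbs so memory is reclaimed even at low rates. */
	while (((tight = virtqueue_get_capacity(vi->svq) <
			 2 + MAX_SKB_FRAGS) || n++ < min) &&
	       (skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
		vi->dev->stats.tx_bytes += skb->len;
		vi->dev->stats.tx_packets++;
		dev_kfree_skb_any(skb);
	}
	return !tight;
}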

Does the above make sense at all?

-- 
MST


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Krishna Kumar2
Michael S. Tsirkin m...@redhat.com wrote on 05/23/2011 04:49:00 PM:

  To do this properly, we should really be using the actual number of sg
  elements needed, but we'd have to do most of xmit_skb beforehand so we
  know how many.
 
  Cheers,
  Rusty.

 Maybe I'm confused here.  The problem isn't the failing
 add_buf for the given skb IIUC.  What we are trying to do here is stop
 the queue *before xmit_skb fails*. We can't look at the
 number of fragments in the current skb - the next one can be
 much larger.  That's why we check capacity after xmit_skb,
 not before it, right?

Maybe Rusty means it is a simpler model to free the amount
of space that this xmit needs. We will still fail anyway
at some time, but it is unlikely, since the earlier iteration
freed up at least the space that it was going to use. The
code could become much simpler:

start_xmit()
{
	num_sgs = get num_sgs for this skb;

	/* Free enough pending old buffers to enable queueing this one */
	free_old_xmit_skbs(vi, num_sgs * 2); /* ?? */

	if (virtqueue_get_capacity() < num_sgs) {
		netif_stop_queue(dev);
		if (virtqueue_enable_cb_delayed(vi->svq) ||
		    free_old_xmit_skbs(vi, num_sgs)) {
			/* Nothing freed up, or not enough freed up */
			kfree_skb(skb);
			return NETDEV_TX_OK;
		}
		netif_start_queue(dev);
		virtqueue_disable_cb(vi->svq);
	}

	/* xmit_skb cannot fail now, also pass 'num_sgs' */
	xmit_skb(vi, skb, num_sgs);
	virtqueue_kick(vi->svq);

	skb_orphan(skb);
	nf_reset(skb);

	return NETDEV_TX_OK;
}

We could even return TX_BUSY, since that makes the dequeue
code more efficient. See dev_dequeue_skb(): the check for an
already-stopped queue can skip a lot of code (and avoid taking
locks), but that fast path only runs if the earlier iteration
returned TX_BUSY.
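For illustration, the tail of the sketch above would then become
something like this (hypothetical, just the drop replaced by a requeue):

	if (virtqueue_get_capacity() < num_sgs) {
		netif_stop_queue(dev);
		if (virtqueue_enable_cb_delayed(vi->svq) ||
		    free_old_xmit_skbs(vi, num_sgs)) {
			/* Not enough freed up: let the qdisc keep the
			 * skb and retry when the queue is woken, rather
			 * than dropping it here. */
			return NETDEV_TX_BUSY;
		}
		netif_start_queue(dev);
		virtqueue_disable_cb(vi->svq);
	}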

BTW, shouldn't the check in start_xmit be:
if (likely(!free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
...
}

Thanks,

- KK



Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Michael S. Tsirkin
On Tue, May 24, 2011 at 01:24:15PM +0530, Krishna Kumar2 wrote:
 Michael S. Tsirkin m...@redhat.com wrote on 05/23/2011 04:49:00 PM:
 
   To do this properly, we should really be using the actual number of sg
   elements needed, but we'd have to do most of xmit_skb beforehand so we
   know how many.
  
   Cheers,
   Rusty.
 
  Maybe I'm confused here.  The problem isn't the failing
  add_buf for the given skb IIUC.  What we are trying to do here is stop
  the queue *before xmit_skb fails*. We can't look at the
  number of fragments in the current skb - the next one can be
  much larger.  That's why we check capacity after xmit_skb,
  not before it, right?
 
 Maybe Rusty means it is a simpler model to free the amount
 of space that this xmit needs. We will still fail anyway
 at some time but it is unlikely, since earlier iteration
 freed up at least the space that it was going to use.

Not sure I understand.  We can't know space was freed in the previous
iteration, as the buffers might not have been used by then.

 The
 code could become much simpler:
 
 start_xmit()
 {
 {
 num_sgs = get num_sgs for this skb;
 
 /* Free enough pending old buffers to enable queueing this one */
 free_old_xmit_skbs(vi, num_sgs * 2); /* ?? */
 
 if (virtqueue_get_capacity() < num_sgs) {
 netif_stop_queue(dev);
 if (virtqueue_enable_cb_delayed(vi->svq) ||
 free_old_xmit_skbs(vi, num_sgs)) {
 /* Nothing freed up, or not enough freed up */
 kfree_skb(skb);
 return NETDEV_TX_OK;

This packet drop is what we wanted to avoid.


 }
 netif_start_queue(dev);
 virtqueue_disable_cb(vi->svq);
 }
 
 /* xmit_skb cannot fail now, also pass 'num_sgs' */
 xmit_skb(vi, skb, num_sgs);
 virtqueue_kick(vi->svq);
 
 skb_orphan(skb);
 nf_reset(skb);
 
 return NETDEV_TX_OK;
 }
 
 We could even return TX_BUSY since that makes the dequeue
 code more efficient. See dev_dequeue_skb() - you can skip a
 lot of code (and avoid taking locks) to check if the queue
 is already stopped but that code runs only if you return
 TX_BUSY in the earlier iteration.
 
 BTW, shouldn't the check in start_xmit be:
   if (likely(!free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
   ...
   }
 
 Thanks,
 
 - KK

I thought we used to do basically this but other devices moved to a
model where they stop *before* queueing fails, so we did too.

-- 
MST


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Krishna Kumar2
Michael S. Tsirkin m...@redhat.com wrote on 05/24/2011 02:42:55 PM:

To do this properly, we should really be using the actual number of
sg
elements needed, but we'd have to do most of xmit_skb beforehand so
we
know how many.
   
Cheers,
Rusty.
  
   Maybe I'm confused here.  The problem isn't the failing
   add_buf for the given skb IIUC.  What we are trying to do here is
stop
   the queue *before xmit_skb fails*. We can't look at the
   number of fragments in the current skb - the next one can be
   much larger.  That's why we check capacity after xmit_skb,
   not before it, right?
 
  Maybe Rusty means it is a simpler model to free the amount
  of space that this xmit needs. We will still fail anyway
  at some time but it is unlikely, since earlier iteration
  freed up at least the space that it was going to use.

 Not sure I understand.  We can't know space is freed in the previous
 iteration as buffers might not have been used by then.

Yes, the first few iterations may not have freed up space, but
later ones should. The amount of free space should increase
from then on, especially since we try to free double of what
we consume.

  The
  code could become much simpler:
 
  start_xmit()
  {
  {
  num_sgs = get num_sgs for this skb;
 
  /* Free enough pending old buffers to enable queueing this one
*/
  free_old_xmit_skbs(vi, num_sgs * 2); /* ?? */
 
  if (virtqueue_get_capacity() < num_sgs) {
  netif_stop_queue(dev);
  if (virtqueue_enable_cb_delayed(vi->svq) ||
  free_old_xmit_skbs(vi, num_sgs)) {
  /* Nothing freed up, or not enough freed up */
  kfree_skb(skb);
  return NETDEV_TX_OK;

 This packet drop is what we wanted to avoid.

Please see below on returning NETDEV_TX_BUSY.


  }
  netif_start_queue(dev);
  virtqueue_disable_cb(vi->svq);
  }
 
  /* xmit_skb cannot fail now, also pass 'num_sgs' */
  xmit_skb(vi, skb, num_sgs);
  virtqueue_kick(vi->svq);
 
  skb_orphan(skb);
  nf_reset(skb);
 
  return NETDEV_TX_OK;
  }
 
  We could even return TX_BUSY since that makes the dequeue
  code more efficient. See dev_dequeue_skb() - you can skip a
  lot of code (and avoid taking locks) to check if the queue
  is already stopped but that code runs only if you return
  TX_BUSY in the earlier iteration.
 
  BTW, shouldn't the check in start_xmit be:
 if (likely(!free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
...
 }
 
  Thanks,
 
  - KK

 I thought we used to do basically this but other devices moved to a
 model where they stop *before* queueing fails, so we did too.

I am not sure why it was changed, since returning TX_BUSY
seems more efficient IMHO. qdisc_restart() handles requeued
packets much better than a stopped queue, as a significant
part of that code is skipped if gso_skb is present (the qdisc
will eventually start dropping packets when tx_queue_len is
exceeded anyway).

Thanks,

- KK



Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Michael S. Tsirkin
On Tue, May 24, 2011 at 02:57:43PM +0530, Krishna Kumar2 wrote:
 Michael S. Tsirkin m...@redhat.com wrote on 05/24/2011 02:42:55 PM:
 
 To do this properly, we should really be using the actual number of
 sg
 elements needed, but we'd have to do most of xmit_skb beforehand so
 we
 know how many.

 Cheers,
 Rusty.
   
Maybe I'm confused here.  The problem isn't the failing
add_buf for the given skb IIUC.  What we are trying to do here is
 stop
the queue *before xmit_skb fails*. We can't look at the
number of fragments in the current skb - the next one can be
much larger.  That's why we check capacity after xmit_skb,
not before it, right?
  
   Maybe Rusty means it is a simpler model to free the amount
   of space that this xmit needs. We will still fail anyway
   at some time but it is unlikely, since earlier iteration
   freed up at least the space that it was going to use.
 
  Not sure I understand.  We can't know space is freed in the previous
  iteration as buffers might not have been used by then.
 
 Yes, the first few iterations may not have freed up space, but
 later ones should. The amount of free space should increase
 from then on, especially since we try to free double of what
 we consume.

Hmm. This is only an upper limit on the # of entries in the queue.
Assume that vq size is 4 and we transmit 4 entries without
getting anything in the used ring. The next transmit will fail.

So I don't really see why it's unlikely that we reach the packet
drop code with your patch.

   The
   code could become much simpler:
  
   start_xmit()
   {
   {
   num_sgs = get num_sgs for this skb;
  
   /* Free enough pending old buffers to enable queueing this one
 */
   free_old_xmit_skbs(vi, num_sgs * 2); /* ?? */
  
   if (virtqueue_get_capacity() < num_sgs) {
   netif_stop_queue(dev);
   if (virtqueue_enable_cb_delayed(vi->svq) ||
   free_old_xmit_skbs(vi, num_sgs)) {
   /* Nothing freed up, or not enough freed up */
   kfree_skb(skb);
   return NETDEV_TX_OK;
 
  This packet drop is what we wanted to avoid.
 
 Please see below on returning NETDEV_TX_BUSY.
 
 
   }
   netif_start_queue(dev);
   virtqueue_disable_cb(vi->svq);
   }
  
   /* xmit_skb cannot fail now, also pass 'num_sgs' */
   xmit_skb(vi, skb, num_sgs);
   virtqueue_kick(vi->svq);
  
   skb_orphan(skb);
   nf_reset(skb);
  
   return NETDEV_TX_OK;
   }
  
   We could even return TX_BUSY since that makes the dequeue
   code more efficient. See dev_dequeue_skb() - you can skip a
   lot of code (and avoid taking locks) to check if the queue
   is already stopped but that code runs only if you return
   TX_BUSY in the earlier iteration.
  
   BTW, shouldn't the check in start_xmit be:
  if (likely(!free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
 ...
  }
  
   Thanks,
  
   - KK
 
  I thought we used to do basically this but other devices moved to a
  model where they stop *before* queueing fails, so we did too.
 
 I am not sure of why it was changed, since returning TX_BUSY
 seems more efficient IMHO.
 qdisc_restart() handles requeue'd
 packets much better than a stopped queue, as a significant
 part of this code is skipped if gso_skb is present

I think this is the argument:
http://www.mail-archive.com/virtualization@lists.linux-foundation.org/msg06364.html


 (qdisc
 will eventually start dropping packets when tx_queue_len is
 exceeded anyway).
 
 Thanks,
 
 - KK

tx_queue_len is a pretty large buffer, so maybe not.
I think the packet drops from the scheduler queue can also be
done intelligently (e.g. with CHOKe), which should
work better than dropping a random packet?

-- 
MST


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Krishna Kumar2
Michael S. Tsirkin m...@redhat.com wrote on 05/24/2011 04:59:39 PM:

Maybe Rusty means it is a simpler model to free the amount
of space that this xmit needs. We will still fail anyway
at some time but it is unlikely, since earlier iteration
freed up at least the space that it was going to use.
  
   Not sure I understand.  We can't know space is freed in the previous
   iteration as buffers might not have been used by then.
 
  Yes, the first few iterations may not have freed up space, but
  later ones should. The amount of free space should increase
  from then on, especially since we try to free double of what
  we consume.

 Hmm. This is only an upper limit on the # of entries in the queue.
 Assume that vq size is 4 and we transmit 4 entries without
 getting anything in the used ring. The next transmit will fail.

 So I don't really see why it's unlikely that we reach the packet
 drop code with your patch.

I was assuming 256 entries :) I will try to get some
numbers to see how often it is true tomorrow.

  I am not sure of why it was changed, since returning TX_BUSY
  seems more efficient IMHO.
  qdisc_restart() handles requeue'd
  packets much better than a stopped queue, as a significant
  part of this code is skipped if gso_skb is present

 I think this is the argument:
 http://www.mail-archive.com/virtualization@lists.linux-
 foundation.org/msg06364.html

Thanks for digging up that thread! Yes, that one skb would get
sent first, ahead of possibly higher-priority skbs. However,
from a performance point of view, the TX_BUSY path skips a lot
of checks and code for all subsequent packets until the device is
restarted. I can test performance with both cases and report
what I find (the requeue code has gone from horribly complex to
very simple and clean, thanks to Herbert and Dave).

  (qdisc
  will eventually start dropping packets when tx_queue_len is

 tx_queue_len is a pretty large buffer so maybe no.

I remember seeing tons of drops (pfifo_fast_enqueue) when
xmit returns TX_BUSY.

 I think the packet drops from the scheduler queue can also be
 done intelligently (e.g. with CHOKe) which should
 work better than dropping a random packet?

I am not sure of that - choke_enqueue() checks the current skb
against a random one to decide on dropping, and only under
congestion. But for my sample driver xmit, returning TX_BUSY
could still allow it to be used with CHOKe.

thanks,

- KK



Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Michael S. Tsirkin
On Tue, May 24, 2011 at 06:20:35PM +0530, Krishna Kumar2 wrote:
 Michael S. Tsirkin m...@redhat.com wrote on 05/24/2011 04:59:39 PM:
 
 Maybe Rusty means it is a simpler model to free the amount
 of space that this xmit needs. We will still fail anyway
 at some time but it is unlikely, since earlier iteration
 freed up at least the space that it was going to use.
   
 Not sure I understand.  We can't know space is freed in the previous
iteration as buffers might not have been used by then.
  
   Yes, the first few iterations may not have freed up space, but
   later ones should. The amount of free space should increase
   from then on, especially since we try to free double of what
   we consume.
 
  Hmm. This is only an upper limit on the # of entries in the queue.
  Assume that vq size is 4 and we transmit 4 entries without
  getting anything in the used ring. The next transmit will fail.
 
  So I don't really see why it's unlikely that we reach the packet
  drop code with your patch.
 
 I was assuming 256 entries :) I will try to get some
 numbers to see how often it is true tomorrow.

That would depend on how fast the hypervisor is.
Try doing something to make the hypervisor slower than the guest.  I don't
think we need measurements to realize that with the host being slower
than the guest this would happen a lot, though.

   I am not sure of why it was changed, since returning TX_BUSY
   seems more efficient IMHO.
   qdisc_restart() handles requeue'd
   packets much better than a stopped queue, as a significant
   part of this code is skipped if gso_skb is present
 
  I think this is the argument:
  http://www.mail-archive.com/virtualization@lists.linux-
  foundation.org/msg06364.html
 
 Thanks for digging up that thread! Yes, that one skb would get
 sent first ahead of possibly higher priority skbs. However,
 from a performance point, TX_BUSY code skips a lot of checks
 and code for all subsequent packets till the device is
 restarted. I can test performance with both cases and report
 what I find (the requeue code has become very simple and clean
 from horribly complex, thanks to Herbert and Dave).

Cc Herbert, and try to convince him :)

   (qdisc
   will eventually start dropping packets when tx_queue_len is
 
  tx_queue_len is a pretty large buffer so maybe no.
 
 I remember seeing tons of drops (pfifo_fast_enqueue) when
 xmit returns TX_BUSY.
 
  I think the packet drops from the scheduler queue can also be
  done intelligently (e.g. with CHOKe) which should
  work better than dropping a random packet?
 
 I am not sure of that - choke_enqueue checks against a random
 skb to drop current skb, and also during congestion. But for
 my sample driver xmit, returning TX_BUSY could still allow
 to be used with CHOKe.
 
 thanks,
 
 - KK


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Rusty Russell
On Mon, 23 May 2011 14:19:00 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 On Mon, May 23, 2011 at 11:37:15AM +0930, Rusty Russell wrote:
  Can we hit problems with OOM?  Sure, but no worse than now...
  The problem is that this virtqueue_get_capacity() returns the worst
  case, not the normal case.  So using it is deceptive.
  
 
 Maybe just document this?

Yes, but also by renaming virtqueue_get_capacity().  Takes it from a 3
to a 6 on the API hard-to-misuse scale.

How about, virtqueue_min_capacity()?  Makes the reader realize something
weird is going on.
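
Something like this, say (only the declaration sketched; the body would
stay whatever virtqueue_get_capacity() does today):

/**
 * virtqueue_min_capacity - guaranteed free descriptor slots
 * @vq: the virtqueue in question
 *
 * Returns a worst-case figure: with indirect descriptors the ring may in
 * practice accept far more buffers than this, so treat the result only
 * as a lower bound.
 */
unsigned int virtqueue_min_capacity(struct virtqueue *vq);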

 I still believe capacity really needs to be decided
 at the virtqueue level, not in the driver.
 E.g. with indirect each skb uses a single entry: freeing
 1 small skb is always enough to have space for a large one.
 
 I do understand how it seems a waste to leave direct space
 in the ring while we might in practice have space
 due to indirect. Didn't come up with a nice way to
 solve this yet - but 'no worse than now :)'

Agreed.

   I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
   sure we have enough space in the buffer. Another way to do
   that is with a define :).
  
  To do this properly, we should really be using the actual number of sg
  elements needed, but we'd have to do most of xmit_skb beforehand so we
  know how many.
  
  Cheers,
  Rusty.
 
 Maybe I'm confused here.  The problem isn't the failing
 add_buf for the given skb IIUC.  What we are trying to do here is stop
 the queue *before xmit_skb fails*. We can't look at the
 number of fragments in the current skb - the next one can be
 much larger.  That's why we check capacity after xmit_skb,
 not before it, right?

No, I was confused...  More coffee!

Thanks,
Rusty.


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Rusty Russell
On Mon, 23 May 2011 14:19:00 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 I do understand how it seems a waste to leave direct space
 in the ring while we might in practice have space
 due to indirect. Didn't come up with a nice way to
 solve this yet - but 'no worse than now :)'

Let's just make it bool free_old_xmit_skbs(unsigned int max).  max ==
2 for the normal xmit path, so we're low latency but we keep ahead on
average.  max == -1 for the "we're out of capacity, we may have to stop
the queue" case.

That keeps it simple and probably the right thing...
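
Something like (a sketch only; stats handling as in the current
function, and the bool return here just reports whether anything was
freed, which is one way to read it):

static bool free_old_xmit_skbs(struct virtnet_info *vi, unsigned int max)
{
	struct sk_buff *skb;
	unsigned int len;
	bool freed = false;

	/* max == -1 wraps to UINT_MAX, i.e. "free everything you can". */
	while (max-- && (skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
		vi->dev->stats.tx_bytes += skb->len;
		vi->dev->stats.tx_packets++;
		dev_kfree_skb_any(skb);
		freed = true;
	}
	return freed;
}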

Thanks,
Rusty.


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-24 Thread Michael S. Tsirkin
On Wed, May 25, 2011 at 10:58:26AM +0930, Rusty Russell wrote:
 On Mon, 23 May 2011 14:19:00 +0300, Michael S. Tsirkin m...@redhat.com 
 wrote:
  On Mon, May 23, 2011 at 11:37:15AM +0930, Rusty Russell wrote:
   Can we hit problems with OOM?  Sure, but no worse than now...
   The problem is that this virtqueue_get_capacity() returns the worst
   case, not the normal case.  So using it is deceptive.
   
  
  Maybe just document this?
 
 Yes, but also by renaming virtqueue_get_capacity().  Takes it from a 3
 to a 6 on the API hard-to-misuse scale.
 
 How about, virtqueue_min_capacity()?  Makes the reader realize something
 weird is going on.

Absolutely. Great idea.

  I still believe capacity really needs to be decided
  at the virtqueue level, not in the driver.
  E.g. with indirect each skb uses a single entry: freeing
  1 small skb is always enough to have space for a large one.
  
  I do understand how it seems a waste to leave direct space
  in the ring while we might in practice have space
  due to indirect. Didn't come up with a nice way to
  solve this yet - but 'no worse than now :)'
 
 Agreed.
 
I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
sure we have enough space in the buffer. Another way to do
that is with a define :).
   
   To do this properly, we should really be using the actual number of sg
   elements needed, but we'd have to do most of xmit_skb beforehand so we
   know how many.
   
   Cheers,
   Rusty.
  
  Maybe I'm confused here.  The problem isn't the failing
  add_buf for the given skb IIUC.  What we are trying to do here is stop
  the queue *before xmit_skb fails*. We can't look at the
  number of fragments in the current skb - the next one can be
  much larger.  That's why we check capacity after xmit_skb,
  not before it, right?
 
 No, I was confused...  More coffee!
 
 Thanks,
 Rusty.


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-23 Thread Michael S. Tsirkin
On Mon, May 23, 2011 at 11:37:15AM +0930, Rusty Russell wrote:
 On Sun, 22 May 2011 15:10:08 +0300, Michael S. Tsirkin m...@redhat.com 
 wrote:
  On Sat, May 21, 2011 at 11:49:59AM +0930, Rusty Russell wrote:
   On Fri, 20 May 2011 02:11:56 +0300, Michael S. Tsirkin 
   m...@redhat.com wrote:
Current code might introduce a lot of latency variation
if there are many pending bufs at the time we
attempt to transmit a new one. This is bad for
real-time applications and can't be good for TCP either.
   
   Do we have more than speculation to back that up, BTW?
  
  Need to dig this up: I thought we saw some reports of this on the list?
 
 I think so too, but a reference needs to be here too.
 
 It helps to have exact benchmarks on what's being tested, otherwise we
 risk unexpected interaction with the other optimization patches.
 
 	struct sk_buff *skb;
 	unsigned int len;
-
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	bool c;
+	int n;
+
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	for (n = 0;
+	     ((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
+	     ((skb = virtqueue_get_buf(vi->svq, &len)));
+	     ++n) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return !c;
   
   This is for() abuse :)
   
   Why is the capacity check in there at all?  Surely it's simpler to try
   to free 2 skbs each time around?
  
  This is in case we can't use indirect: we want to free up
  enough buffers for the following add_buf to succeed.
 
 Sure, or we could just count the frags of the skb we're taking out,
 which would be accurate for both cases and far more intuitive.
 
 ie. always try to free up twice as much as we're about to put in.
 
 Can we hit problems with OOM?  Sure, but no worse than now...
 The problem is that this virtqueue_get_capacity() returns the worst
 case, not the normal case.  So using it is deceptive.
 

Maybe just document this?

I still believe capacity really needs to be decided
at the virtqueue level, not in the driver.
E.g. with indirect each skb uses a single entry: freeing
1 small skb is always enough to have space for a large one.

I do understand how it seems a waste to leave direct space
in the ring while we might in practice have space
due to indirect. Didn't come up with a nice way to
solve this yet - but 'no worse than now :)'

  I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
  sure we have enough space in the buffer. Another way to do
  that is with a define :).
 
 To do this properly, we should really be using the actual number of sg
 elements needed, but we'd have to do most of xmit_skb beforehand so we
 know how many.
 
 Cheers,
 Rusty.

Maybe I'm confused here.  The problem isn't the failing
add_buf for the given skb IIUC.  What we are trying to do here is stop
the queue *before xmit_skb fails*. We can't look at the
number of fragments in the current skb - the next one can be
much larger.  That's why we check capacity after xmit_skb,
not before it, right?

-- 
MST


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-22 Thread Michael S. Tsirkin
On Sat, May 21, 2011 at 11:49:59AM +0930, Rusty Russell wrote:
 On Fri, 20 May 2011 02:11:56 +0300, Michael S. Tsirkin m...@redhat.com 
 wrote:
  Current code might introduce a lot of latency variation
  if there are many pending bufs at the time we
  attempt to transmit a new one. This is bad for
  real-time applications and can't be good for TCP either.
 
 Do we have more than speculation to back that up, BTW?

Need to dig this up: I thought we saw some reports of this on the list?

 This patch is pretty sloppy; the previous ones were better polished.
 
  -static void free_old_xmit_skbs(struct virtnet_info *vi)
  +static bool free_old_xmit_skbs(struct virtnet_info *vi, int capacity)
   {
 
 A comment here indicating it returns true if it frees something?

Agree.

 	struct sk_buff *skb;
 	unsigned int len;
-
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	bool c;
+	int n;
+
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	for (n = 0;
+	     ((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
+	     ((skb = virtqueue_get_buf(vi->svq, &len)));
+	     ++n) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return !c;
 
 This is for() abuse :)
 
 Why is the capacity check in there at all?  Surely it's simpler to try
 to free 2 skbs each time around?

This is in case we can't use indirect: we want to free up
enough buffers for the following add_buf to succeed.


for (n = 0; n < 2; n++) {
	skb = virtqueue_get_buf(vi->svq, &len);
	if (!skb)
		break;
	pr_debug("Sent skb %p\n", skb);
	vi->dev->stats.tx_bytes += skb->len;
	vi->dev->stats.tx_packets++;
	dev_kfree_skb_any(skb);
}
 
   static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
  @@ -574,8 +582,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, 
  struct net_device *dev)
  struct virtnet_info *vi = netdev_priv(dev);
  int capacity;
   
  -   /* Free up any pending old buffers before queueing new ones. */
  -   free_old_xmit_skbs(vi);
  +   /* Free enough pending old buffers to enable queueing new ones. */
  +   free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS);
   
  /* Try to transmit */
  capacity = xmit_skb(vi, skb);
  @@ -609,9 +617,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, 
  struct net_device *dev)
  netif_stop_queue(dev);
 	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 		/* More just got used, free them then recheck. */
-		free_old_xmit_skbs(vi);
-		capacity = virtqueue_get_capacity(vi->svq);
-		if (capacity >= 2+MAX_SKB_FRAGS) {
  +   if (!likely(free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
 
 This extra argument to free_old_xmit_skbs seems odd, unless you have
 future plans?
 
 Thanks,
 Rusty.

I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
sure we have enough space in the buffer. Another way to do
that is with a define :).

-- 
MST


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-22 Thread Rusty Russell
On Sun, 22 May 2011 15:10:08 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 On Sat, May 21, 2011 at 11:49:59AM +0930, Rusty Russell wrote:
  On Fri, 20 May 2011 02:11:56 +0300, Michael S. Tsirkin m...@redhat.com 
  wrote:
   Current code might introduce a lot of latency variation
   if there are many pending bufs at the time we
   attempt to transmit a new one. This is bad for
   real-time applications and can't be good for TCP either.
  
  Do we have more than speculation to back that up, BTW?
 
 Need to dig this up: I thought we saw some reports of this on the list?

I think so too, but a reference needs to be here too.

It helps to have exact benchmarks on what's being tested, otherwise we
risk unexpected interaction with the other optimization patches.

 	struct sk_buff *skb;
 	unsigned int len;
-
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	bool c;
+	int n;
+
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	for (n = 0;
+	     ((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
+	     ((skb = virtqueue_get_buf(vi->svq, &len)));
+	     ++n) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return !c;
  
  This is for() abuse :)
  
  Why is the capacity check in there at all?  Surely it's simpler to try
  to free 2 skbs each time around?
 
 This is in case we can't use indirect: we want to free up
 enough buffers for the following add_buf to succeed.

Sure, or we could just count the frags of the skb we're taking out,
which would be accurate for both cases and far more intuitive.

ie. always try to free up twice as much as we're about to put in.
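
As a rough sketch (stats updates left out; the 2 + nr_frags count
matches the header, linear data and fragments an skb occupies with
direct descriptors, and merely overcounts when indirect is in use):

static void free_old_xmit_skbs(struct virtnet_info *vi, unsigned int needed)
{
	struct sk_buff *skb;
	unsigned int len, reclaimed = 0;

	/* Reclaim roughly twice the descriptors the next packet needs. */
	while (reclaimed < 2 * needed &&
	       (skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
		reclaimed += 2 + skb_shinfo(skb)->nr_frags;
		dev_kfree_skb_any(skb);
	}
}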

Can we hit problems with OOM?  Sure, but no worse than now...

The problem is that this virtqueue_get_capacity() returns the worst
case, not the normal case.  So using it is deceptive.

 I just wanted to localize the 2+MAX_SKB_FRAGS logic that tries to make
 sure we have enough space in the buffer. Another way to do
 that is with a define :).

To do this properly, we should really be using the actual number of sg
elements needed, but we'd have to do most of xmit_skb beforehand so we
know how many.

Cheers,
Rusty.


Re: [PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-20 Thread Rusty Russell
On Fri, 20 May 2011 02:11:56 +0300, Michael S. Tsirkin m...@redhat.com 
wrote:
 Current code might introduce a lot of latency variation
 if there are many pending bufs at the time we
 attempt to transmit a new one. This is bad for
 real-time applications and can't be good for TCP either.

Do we have more than speculation to back that up, BTW?

This patch is pretty sloppy; the previous ones were better polished.

 -static void free_old_xmit_skbs(struct virtnet_info *vi)
 +static bool free_old_xmit_skbs(struct virtnet_info *vi, int capacity)
  {

A comment here indicating it returns true if it frees something?

 	struct sk_buff *skb;
 	unsigned int len;
-
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	bool c;
+	int n;
+
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	for (n = 0;
+	     ((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
+	     ((skb = virtqueue_get_buf(vi->svq, &len)));
+	     ++n) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return !c;

This is for() abuse :)

Why is the capacity check in there at all?  Surely it's simpler to try
to free 2 skbs each time around?

	for (n = 0; n < 2; n++) {
		skb = virtqueue_get_buf(vi->svq, &len);
		if (!skb)
			break;
		pr_debug("Sent skb %p\n", skb);
		vi->dev->stats.tx_bytes += skb->len;
		vi->dev->stats.tx_packets++;
		dev_kfree_skb_any(skb);
	}

  static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
 @@ -574,8 +582,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct 
 net_device *dev)
   struct virtnet_info *vi = netdev_priv(dev);
   int capacity;
  
 - /* Free up any pending old buffers before queueing new ones. */
 - free_old_xmit_skbs(vi);
 + /* Free enough pending old buffers to enable queueing new ones. */
 + free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS);
  
   /* Try to transmit */
   capacity = xmit_skb(vi, skb);
 @@ -609,9 +617,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct 
 net_device *dev)
   netif_stop_queue(dev);
 	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 		/* More just got used, free them then recheck. */
-		free_old_xmit_skbs(vi);
-		capacity = virtqueue_get_capacity(vi->svq);
-		if (capacity >= 2+MAX_SKB_FRAGS) {
 + if (!likely(free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {

This extra argument to free_old_xmit_skbs seems odd, unless you have
future plans?

Thanks,
Rusty.


[PATCHv2 10/14] virtio_net: limit xmit polling

2011-05-19 Thread Michael S. Tsirkin
Current code might introduce a lot of latency variation
if there are many pending bufs at the time we
attempt to transmit a new one. This is bad for
real-time applications and can't be good for TCP either.

Free up just enough to both clean up all buffers
eventually and to be able to xmit the next packet.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
---
 drivers/net/virtio_net.c |   22 ++
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f33c92b..42935cb 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -509,17 +509,25 @@ again:
return received;
 }
 
-static void free_old_xmit_skbs(struct virtnet_info *vi)
+static bool free_old_xmit_skbs(struct virtnet_info *vi, int capacity)
 {
 	struct sk_buff *skb;
 	unsigned int len;
-
-	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
+	bool c;
+	int n;
+
+	/* We try to free up at least 2 skbs per one sent, so that we'll get
+	 * all of the memory back if they are used fast enough. */
+	for (n = 0;
+	     ((c = virtqueue_get_capacity(vi->svq) < capacity) || n < 2) &&
+	     ((skb = virtqueue_get_buf(vi->svq, &len)));
+	     ++n) {
 		pr_debug("Sent skb %p\n", skb);
 		vi->dev->stats.tx_bytes += skb->len;
 		vi->dev->stats.tx_packets++;
 		dev_kfree_skb_any(skb);
 	}
+	return !c;
 }
 
 static int xmit_skb(struct virtnet_info *vi, struct sk_buff *skb)
@@ -574,8 +582,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct 
net_device *dev)
struct virtnet_info *vi = netdev_priv(dev);
int capacity;
 
-   /* Free up any pending old buffers before queueing new ones. */
-   free_old_xmit_skbs(vi);
+   /* Free enough pending old buffers to enable queueing new ones. */
+   free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS);
 
/* Try to transmit */
capacity = xmit_skb(vi, skb);
@@ -609,9 +617,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct 
net_device *dev)
netif_stop_queue(dev);
 	if (unlikely(!virtqueue_enable_cb_delayed(vi->svq))) {
 		/* More just got used, free them then recheck. */
-		free_old_xmit_skbs(vi);
-		capacity = virtqueue_get_capacity(vi->svq);
-		if (capacity >= 2+MAX_SKB_FRAGS) {
+		if (!likely(free_old_xmit_skbs(vi, 2+MAX_SKB_FRAGS))) {
 			netif_start_queue(dev);
 			virtqueue_disable_cb(vi->svq);
}
-- 
1.7.5.53.gc233e
