Hi,

On 08/08/2016 04:37 PM, Maxime Coquelin wrote:
This patch tries to address a packet drop issue on the Rx path,
which may happen when the VM's vCPU is not 1:1 mapped to a physical CPU,
or when the VMs overcommit memory.

In these configurations, the VM is paused for several milliseconds,
resulting in the Rx virtqueue not being emptied often enough.

This problem can be seen even at low packet rates. For example, at 0.3Mpps
with a virtqueue of 256, if the VM is paused for 4ms every second,
around 550 packets get dropped per second.
With the virtqueue size increased to 1024, no more drops are seen.

The first option to solve this issue is to enlarge the virtqueue ring
size, which is being done in Qemu [0]. However, the virtqueue size
cannot exceed the 1024 limit imposed by S/G lists in the kernel.

This patch introduces SW rings before Vhost Rx virtqueues.
A new "sw_ring_size" option allows setting the ring size at runtime
(0 disables it; the RTE ring size has to be a power of 2).

For example, to add a SW ring of 1024 elements:
$> set Interface vhost-user1 options:sw_ring_size=1024
It can be disabled at runtime with:
$> set Interface vhost-user1 options:sw_ring_size=0
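
To illustrate the size constraint, here is a minimal sketch of how an
option value could be checked. It is not taken from the patch: the helper
names sw_ring_size_is_valid() and sw_ring_slot() are hypothetical, and the
second helper only shows why a power-of-2 size is convenient for a ring.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not from the patch.  Returns true if 'size' is an
 * acceptable sw_ring_size value: 0 (SW ring disabled) or a power of two.
 * The bit trick also yields true for 0, which is what we want here. */
static bool
sw_ring_size_is_valid(unsigned int size)
{
    return (size & (size - 1)) == 0;
}

/* Hypothetical helper, not from the patch.  With a power-of-two size, a
 * free-running position wraps to a slot index with a mask rather than a
 * modulo. */
static unsigned int
sw_ring_slot(unsigned int pos, unsigned int size)
{
    assert(size != 0 && sw_ring_size_is_valid(size));
    return pos & (size - 1);
}
```

With this check, 0 and 1024 would be accepted, while a value such as 1000
would be rejected.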

If no SW ring size is provided, the behaviour is the same as before this patch.
If a SW ring size is set, there are two paths when sending packets:
 1 - The send function has the ownership of the packets (may_steal)
  In this case, we can enqueue the packets and possibly send them later.
  First we enqueue the packets in the SW ring; if the SW ring is full,
packets get dropped. Then we loop to empty the SW ring into the virtqueue.
  Packets are freed as soon as they are inserted into the virtqueue.
 2 - The send function doesn't have the ownership of the packets (!may_steal)
  In this case, we cannot enqueue the packets, as we won't have their
ownership as soon as we exit the send function.
  First, we try to empty the SW ring into the virtqueue. If the SW ring has
successfully been emptied, we try to send the packets directly into the
virtqueue.
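
The two paths above can be sketched with a small stand-alone model. This is
not the code from the patch: packets are plain ints, the virtqueue is a
counter of free slots, and every name (sw_ring, virtqueue, vhost_send, the
two size macros) is hypothetical. The description does not say what happens
in the !may_steal case when the SW ring could not be fully emptied; the
sketch assumes the burst is dropped so that ordering is preserved.

```c
#include <stdbool.h>

#define SW_RING_SIZE 8u                 /* Hypothetical; power of two. */
#define SW_RING_MASK (SW_RING_SIZE - 1)

struct sw_ring {
    int slots[SW_RING_SIZE];
    unsigned int head;                  /* Next position to dequeue. */
    unsigned int tail;                  /* Next position to enqueue. */
};

struct virtqueue {                      /* Stand-in for the guest Rx ring. */
    unsigned int free_slots;
    unsigned int received;
};

static unsigned int
ring_count(const struct sw_ring *r)
{
    return r->tail - r->head;
}

static bool
ring_enqueue(struct sw_ring *r, int pkt)
{
    if (ring_count(r) == SW_RING_SIZE) {
        return false;                   /* SW ring full. */
    }
    r->slots[r->tail++ & SW_RING_MASK] = pkt;
    return true;
}

static bool
vq_enqueue(struct virtqueue *vq, int pkt)
{
    (void) pkt;                         /* A real queue would copy it. */
    if (vq->free_slots == 0) {
        return false;
    }
    vq->free_slots--;
    vq->received++;
    return true;
}

/* Moves packets from the SW ring to the virtqueue until one of them runs
 * out.  Returns true if the SW ring ended up empty. */
static bool
ring_flush(struct sw_ring *r, struct virtqueue *vq)
{
    while (ring_count(r) > 0
           && vq_enqueue(vq, r->slots[r->head & SW_RING_MASK])) {
        r->head++;
    }
    return ring_count(r) == 0;
}

/* Returns the number of packets dropped from the burst. */
static unsigned int
vhost_send(struct sw_ring *r, struct virtqueue *vq,
           const int *pkts, unsigned int n, bool may_steal)
{
    unsigned int dropped = 0;

    if (may_steal) {
        /* Path 1: buffer the burst, then drain as much as possible. */
        for (unsigned int i = 0; i < n; i++) {
            dropped += !ring_enqueue(r, pkts[i]);
        }
        ring_flush(r, vq);
    } else if (!ring_flush(r, vq)) {
        /* Path 2, backlog remains: assumed to drop the whole burst. */
        dropped = n;
    } else {
        /* Path 2, backlog drained: send directly to the virtqueue. */
        for (unsigned int i = 0; i < n; i++) {
            dropped += !vq_enqueue(vq, pkts[i]);
        }
    }
    return dropped;
}
```

In this model, a burst of 6 packets sent with may_steal to a virtqueue that
has 4 free slots leaves 2 packets in the SW ring and drops nothing; they are
delivered on a later call once the guest has freed some slots.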

Note that this patch has not been heavily tested; it is more of a PoC to
gather your feedback on the feature, and to see if better ideas arise.

[0]: https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg0073dd0.html

Signed-off-by: Maxime Coquelin <maxime.coque...@redhat.com>
---
 lib/netdev-dpdk.c | 171 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 148 insertions(+), 23 deletions(-)


This is a gentle reminder.
The patch should be much simpler thanks to the rework done by Ilya
Maximets, but I would appreciate your feedback before rebasing it.

Thanks,
Maxime
