Nothing has significantly changed since the previous versions.
This patch set still effectively breaks the following cases if
multi-segment mbufs are enabled:

  1. Host <-> VM communication is broken.

     With this patch applied, VMs have offloading enabled by default.
     As a result, all packets going out of the VM carry incorrect
     checksums (besides possibly being multi-segmented), so any packet
     that reaches a host interface is dropped by the host kernel as
     corrupted. This is easily reproduced by trying to ssh from the
     host to a VM, which is the common case for typical OpenStack
     installations.

     Even if the checksums were fixed, segmented packets would still
     be dropped in netdev-linux/bsd/dummy.

  2. VM to VM communication is broken if offloading is not negotiated
     in one of the VMs.

     If one of the VMs does not negotiate offloading, all packets are
     dropped for the same reason as in the Host to VM case. This is a
     particularly bad case because the virtio driver can change the
     feature set at runtime, and OVS has to handle that (see the
     sketch after this list).

     Note that the Linux kernel virtio driver always has rx checksum
     offloading enabled, i.e. it doesn't check/recalculate the
     checksums in software when needed. So, you need the DPDK virtio
     driver or a FreeBSD machine to reproduce the checksumming issue.

  3. Broken support for all the virtual and HW NICs that don't support
     TSO.

     That is actually a large number of NICs: more than half of the
     PMDs in DPDK do not support TSO. All the packets are simply
     dropped by netdev-dpdk before sending.
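
For reference, the kind of runtime check that would be needed before
marking packets for TSO looks roughly like the following. This is only
a sketch, not code from the series: 'vid', 'port_id' and the helper
names are hypothetical, and DPDK 18.11 APIs are assumed.

    #include <stdbool.h>
    #include <stdint.h>
    #include <linux/virtio_net.h>
    #include <rte_ethdev.h>
    #include <rte_vhost.h>

    /* Did the guest negotiate receiving checksum/TSO-offloaded packets?
     * This can change at runtime, so it has to be re-checked on
     * renegotiation, not only at vhost device setup. */
    static bool
    guest_accepts_tso(int vid)
    {
        uint64_t features = 0;

        if (rte_vhost_get_negotiated_features(vid, &features)) {
            return false;
        }
        return (features & (1ULL << VIRTIO_NET_F_GUEST_CSUM))
               && (features & (1ULL << VIRTIO_NET_F_GUEST_TSO4));
    }

    /* Does the physical port's PMD advertise TCP TSO at all? */
    static bool
    port_supports_tso(uint16_t port_id)
    {
        struct rte_eth_dev_info dev_info;

        rte_eth_dev_info_get(port_id, &dev_info);
        return (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO) != 0;
    }

When either check fails, the packet has to be fixed up in software or
dropped; the series currently drops it.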

This is most probably not a full list of issues. To fix most of them,
a fallback to a full software TSO implementation is required.
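
As a minimal sketch of what the checksum half of such a fallback could
look like (DPDK 18.11 struct and function names assumed; the helper
name and where it would be called from in netdev-dpdk are hypothetical):

    #include <rte_ip.h>
    #include <rte_mbuf.h>
    #include <rte_tcp.h>

    /* Recompute IPv4 and TCP checksums in software for a packet whose
     * sender requested checksum offload.  Assumes l2_len/l3_len are set
     * and that the headers and TCP payload are contiguous in the first
     * mbuf segment; a multi-segment mbuf would have to be handled (or
     * linearized) separately. */
    static void
    sw_fix_ipv4_tcp_csum(struct rte_mbuf *m)
    {
        struct ipv4_hdr *ip = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *,
                                                      m->l2_len);
        struct tcp_hdr *tcp = rte_pktmbuf_mtod_offset(m, struct tcp_hdr *,
                                                      m->l2_len + m->l3_len);

        ip->hdr_checksum = 0;
        ip->hdr_checksum = rte_ipv4_cksum(ip);

        tcp->cksum = 0;
        tcp->cksum = rte_ipv4_udptcp_cksum(ip, tcp); /* Incl. pseudo-header. */
    }

The segmentation half could in principle reuse DPDK's GSO library
(librte_gso, shipped with 18.11) instead of being written from scratch,
roughly as below. Again only a sketch under assumptions: the helper
name, mempool handling and output array size are illustrative, and note
that librte_gso does not recalculate checksums of the output segments,
so a per-segment checksum fix-up is still needed on top of this.

    #include <rte_ethdev.h> /* DEV_TX_OFFLOAD_TCP_TSO */
    #include <rte_gso.h>
    #include <rte_mbuf.h>

    #define SW_TSO_MAX_SEGS 64

    /* Split one oversized, TSO-marked mbuf into MTU-sized frames in
     * software.  Requires l2_len/l3_len/l4_len to be set on 'pkt'. */
    static int
    sw_tso_segment(struct rte_mbuf *pkt, struct rte_mempool *mp,
                   struct rte_mbuf **segs_out, uint16_t max_frame_len)
    {
        struct rte_gso_ctx ctx = {
            .direct_pool = mp,    /* Headers of the output segments. */
            .indirect_pool = mp,  /* Indirect mbufs pointing at the payload. */
            .gso_types = DEV_TX_OFFLOAD_TCP_TSO,
            .gso_size = max_frame_len, /* Max output segment size, headers
                                        * included. */
            .flag = 0,
        };

        /* Packets not marked for TSO are passed through untouched. */
        if (!(pkt->ol_flags & PKT_TX_TCP_SEG)) {
            segs_out[0] = pkt;
            return 1;
        }

        /* NB: ownership of the input mbuf after a successful call differs
         * between DPDK releases; check the rte_gso_segment() docs for the
         * version in use. */
        return rte_gso_segment(pkt, &ctx, segs_out, SW_TSO_MAX_SEGS);
    }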

Until this is done, I prefer not to merge the feature because it
breaks basic functionality. I understand that you're going to mark it
'experimental', but, IMHO, it isn't worth even that. The patches are
fairly small and there is actually nothing we need to test before the
full support lands; the main part is not implemented yet.

NACK for the series. Sorry.

----

One more thing I wanted to mention: I was surprised not to see a
performance comparison of TSO with plain jumbo frames in your slides
at the recent OVS Conference [1]. You had a comparison slide with
1500 MTU, TSO and the kernel vhost, but not jumbo frames. In my
testing I get more than 22 Gbps of VM<->VM jumbo-frame throughput
with current OVS and iperf, which is almost equal to the
vhost-kernel-with-TSO performance on my system (my CPU is probably a
bit less powerful, so the results are a bit lower than in your
slides). Also, in real deployments there will be a significant
performance drop because of the fallback to software TSO in many
cases. Sorry, but I don't really see the benefits of using
multi-segment mbufs and TSO.
All of this work is positioned as a feature to close the gap between
the kernel and userspace datapaths, but there is no such gap from the
performance point of view, and overcomplicating OVS with
multi-segmenting and TSO is really not necessary.

[1] http://www.openvswitch.org/support/ovscon2018/5/0935-lam.pptx


On 10.01.2019 19:58, Tiago Lam wrote:
> Enabling TSO offload allows a host stack to delegate the segmentation of
> oversized TCP packets to the underlying physical NIC, if supported. In the
> case of a VM this means that the segmentation of the packets is not
> performed by the guest kernel, but by the host NIC itself. In turn, since
> the TSO calculations and checksums are being performed in hardware, this
> alleviates the CPU load on the host system. In inter VM communication this
> might account to significant savings, and higher throughput, even more so
> if the VMs are running on the same host.
> 
> Thus, although inter VM communication is already possible as is, there's a
> sacrifice in terms of CPU, which may affect the overall throughput.
> 
> This series adds support for TSO in OvS-DPDK, by making use of the TSO
> offloading feature already supported by DPDK vhost backend, having the
> following scenarios in mind:
> - Inter VM communication on the same host;
> - Inter VM communication on different hosts;
> - The same two use cases above, but on a VLAN network.
> 
> The work is based on [1]; It has been rebased to run on top of the
> multi-segment mbufs work (v13) [2] and re-worked to use DPDK v18.11.
> 
> [1] https://patchwork.ozlabs.org/patch/749564/
> [2] https://mail.openvswitch.org/pipermail/ovs-dev/2019-January/354950.html
> 
> Considerations:
> - As mentioned above, this series depends on the multi-segment mbuf series
>   (v13) and can't be applied on master as is;
> - The `rte_eth_tx_prepare()` API in DPDK is marked experimental, and although
>   I'm not getting any errors / warnings while compiling, do shout if you get into
>   trouble while testing;
> - I'm due to send v3 in the next few days, but sending v2 now to enable early
>   testing;
> 
> Tiago Lam (3):
>   netdev-dpdk: Validate packets burst before Tx.
>   netdev-dpdk: Consider packets marked for TSO.
>   netdev-dpdk: Enable TSO when using multi-seg mbufs
> 
>  Documentation/automake.mk           |   1 +
>  Documentation/topics/dpdk/index.rst |   1 +
>  Documentation/topics/dpdk/tso.rst   | 111 ++++++++++++++++++++
>  NEWS                                |   1 +
>  lib/dp-packet.h                     |  14 +++
>  lib/netdev-bsd.c                    |  11 +-
>  lib/netdev-dpdk.c                   | 203 ++++++++++++++++++++++++++++++------
>  lib/netdev-dummy.c                  |  11 +-
>  lib/netdev-linux.c                  |  15 +++
>  9 files changed, 332 insertions(+), 36 deletions(-)
>  create mode 100644 Documentation/topics/dpdk/tso.rst
> 