On Fri, Jan 11, 2019 at 8:22 AM Ilya Maximets <i.maxim...@samsung.com> wrote:
> Nothing significantly changed since the previous versions.
> This patch set effectively breaks the following cases if multi-segment
> mbufs are enabled:
>
> 1. Host <-> VM communication is broken.
>
>    With this patch applied, VMs have offloading enabled by default.
>    This makes all the packets that go out from a VM carry incorrect
>    checksums (besides possibly being multi-segmented). As a result,
>    any packet that reaches a host interface is dropped by the host
>    kernel as corrupted. This is easily reproduced by trying to ssh
>    from the host to a VM, which is the common case for usual
>    OpenStack installations.
>
>    Even if the checksums were fixed, segmented packets would still
>    be dropped in netdev-linux/bsd/dummy.
>
> 2. VM to VM communication is broken if offloading is not negotiated
>    in one of the VMs.
>
>    If one of the VMs does not negotiate offloading, all the packets
>    will be dropped for the same reason as in the Host to VM case.
>    This is a really bad case, because the virtio driver can change
>    the feature set at runtime and this should be handled in OVS.
>
>    Note that the Linux kernel virtio driver always has rx checksum
>    offloading enabled, i.e. it doesn't check/recalculate the
>    checksums in software when needed. So, you need the dpdk driver
>    or a FreeBSD machine to reproduce the checksumming issue.
>
> 3. Broken support for all the virtual and HW NICs that don't
>    support TSO.
>
>    That is actually a large number of NICs: more than half of the
>    PMDs in DPDK do not support TSO. All their packets are simply
>    dropped by netdev-dpdk before send.
>
> This is most probably not a full list of issues. To fix most of
> them, a fallback to a full software TSO implementation is required.
>
> Until that is done I prefer not to merge the feature, because it
> breaks basic functionality. I understand that you're going to mark
> it 'experimental', but, IMHO, it isn't worth even that. The patches
> are fairly small and there is nothing we actually need to test
> before the full support exists. The main part is not implemented
> yet.
>
> NACK for the series. Sorry.
>
> ----
>
> One more thing I wanted to mention: I was surprised not to see a
> performance comparison of TSO against plain jumbo frames in your
> slides at the recent OVS Conference [1]. You had a comparison slide
> with 1500 MTU, TSO and kernel vhost, but not jumbo frames. In my
> testing I get more than 22 Gbps of VM<->VM jumbo-frame performance
> with current OVS and iperf, which is almost equal to the vhost
> kernel with TSO performance on my system (I probably have a somewhat
> less powerful CPU, so my results are a bit lower than in your
> slides). Also, in real deployments there will be a significant
> performance drop because of the fallback to software TSO in many
> cases. Sorry, but I don't really see the benefits of using
> multi-segment mbufs and TSO.

Tiago, I am also interested in the performance benefit question.
Could you answer it?

Thanks
Darrell

> All of this work is positioned as a feature to cover the gap between
> the kernel and userspace datapaths, but there is no gap from the
> performance point of view,

Tiago, is this true?

Thanks
Darrell

> and overcomplicating OVS with multi-segmenting and TSO is really
> not necessary.
>
>
> [1] http://www.openvswitch.org/support/ovscon2018/5/0935-lam.pptx
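For context on the "full software TSO" fallback mentioned above: a
minimal sketch of such a fallback using DPDK 18.11's GSO library could
look like the code below. The function name sw_tso_fallback and the
elided gso_ctx/mempool setup are illustrative only, not taken from the
patches under review.

    #include <rte_ethdev.h>
    #include <rte_gso.h>
    #include <rte_mbuf.h>

    #define MAX_GSO_SEGS 64

    /* Hypothetical fallback: if the egress port cannot do TSO in
     * hardware, segment the oversized TCP packet in software with
     * librte_gso.  The caller is expected to have set PKT_TX_TCP_SEG
     * and tso_segsz on the mbuf (as vhost does for offloaded guest
     * packets), and to have filled gso_ctx with direct/indirect
     * mempools, gso_types and gso_size. */
    static int
    sw_tso_fallback(uint16_t port_id, struct rte_mbuf *pkt,
                    struct rte_gso_ctx *gso_ctx,
                    struct rte_mbuf **segs_out)
    {
        struct rte_eth_dev_info dev_info;

        rte_eth_dev_info_get(port_id, &dev_info);

        if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_TCP_TSO) {
            /* NIC segments in hardware; pass the packet through. */
            segs_out[0] = pkt;
            return 1;
        }

        /* Software segmentation: returns the number of segments
         * placed in segs_out, or a negative errno on failure (in
         * which case the caller should drop the packet). */
        return rte_gso_segment(pkt, gso_ctx, segs_out, MAX_GSO_SEGS);
    }

Note that librte_gso does not recalculate checksums for the produced
segments, so a software checksum pass would still be needed when the
NIC lacks checksum offload, which matches the checksum concerns raised
in points 1 and 2 above.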
> On 10.01.2019 19:58, Tiago Lam wrote:
> > Enabling TSO offload allows a host stack to delegate the
> > segmentation of oversized TCP packets to the underlying physical
> > NIC, if supported. In the case of a VM this means that the
> > segmentation of the packets is not performed by the guest kernel,
> > but by the host NIC itself. In turn, since the TSO calculations and
> > checksums are performed in hardware, this alleviates the CPU load
> > on the host system. In inter-VM communication this may amount to
> > significant savings, and higher throughput, even more so if the
> > VMs are running on the same host.
> >
> > Thus, although inter-VM communication is already possible as is,
> > there's a sacrifice in terms of CPU, which may affect the overall
> > throughput.
> >
> > This series adds support for TSO in OvS-DPDK, by making use of the
> > TSO offloading feature already supported by the DPDK vhost
> > backend, with the following scenarios in mind:
> > - Inter-VM communication on the same host;
> > - Inter-VM communication on different hosts;
> > - The same two use cases above, but on a VLAN network.
> >
> > The work is based on [1]; it has been rebased to run on top of the
> > multi-segment mbufs work (v13) [2] and re-worked to use DPDK
> > v18.11.
> >
> > [1] https://patchwork.ozlabs.org/patch/749564/
> > [2] https://mail.openvswitch.org/pipermail/ovs-dev/2019-January/354950.html
> >
> > Considerations:
> > - As mentioned above, this series depends on the multi-segment
> >   mbuf series (v13) and can't be applied on master as is;
> > - The `rte_eth_tx_prepare()` API in DPDK is marked experimental,
> >   and although I'm not getting any errors / warnings while
> >   compiling, do shout if you get into trouble while testing;
> > - I'm due to send v3 in the next few days, but I'm sending v2 now
> >   to enable early testing.
> >
> > Tiago Lam (3):
> >   netdev-dpdk: Validate packets burst before Tx.
> >   netdev-dpdk: Consider packets marked for TSO.
> >   netdev-dpdk: Enable TSO when using multi-seg mbufs
> >
> >  Documentation/automake.mk           |   1 +
> >  Documentation/topics/dpdk/index.rst |   1 +
> >  Documentation/topics/dpdk/tso.rst   | 111 ++++++++++++++++++++
> >  NEWS                                |   1 +
> >  lib/dp-packet.h                     |  14 +++
> >  lib/netdev-bsd.c                    |  11 +-
> >  lib/netdev-dpdk.c                   | 203 ++++++++++++++++++++++++++++++------
> >  lib/netdev-dummy.c                  |  11 +-
> >  lib/netdev-linux.c                  |  15 +++
> >  9 files changed, 332 insertions(+), 36 deletions(-)
> >  create mode 100644 Documentation/topics/dpdk/tso.rst
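A side note on the cover letter's remark that `rte_eth_tx_prepare()`
is experimental: for readers unfamiliar with it, the usual call
pattern in DPDK 18.11 is to validate and fix up a burst's offload
metadata just before transmission. A simplified, hypothetical sketch
(not OVS code; the function name tx_burst_prepared is illustrative):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* rte_eth_tx_prepare() checks a burst's offload metadata against
     * what the PMD supports and performs any needed fix-ups (e.g. the
     * pseudo-header checksum that TSO requires), stopping at the
     * first packet it cannot handle. */
    static uint16_t
    tx_burst_prepared(uint16_t port_id, uint16_t queue_id,
                      struct rte_mbuf **pkts, uint16_t nb_pkts)
    {
        uint16_t nb_ok = rte_eth_tx_prepare(port_id, queue_id,
                                            pkts, nb_pkts);
        uint16_t i;

        /* Packets from nb_ok onwards would be rejected by the PMD
         * (rte_errno says why); here they are simply dropped. */
        for (i = nb_ok; i < nb_pkts; i++) {
            rte_pktmbuf_free(pkts[i]);
        }

        return rte_eth_tx_burst(port_id, queue_id, pkts, nb_ok);
    }

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev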