On 2021/02/09 4:13, William Tu wrote:
On Mon, Feb 8, 2021 at 8:57 AM Ilya Maximets <i.maxim...@ovn.org> wrote:

On 2/6/21 5:15 PM, William Tu wrote:
On Mon, Feb 1, 2021 at 5:48 PM Yi Yang (杨燚)-云服务集团 <yangy...@inspur.com> wrote:

Thanks Ilya. The net_tap PMD handles the tap device on the host side, so it
could leverage the vnet header to do TSO/GSO; maybe the net_tap PMD authors
didn't know how to do this. From the source code, the tap fd doesn't have the
vnet header or TSO enabled.
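
For reference, enabling the vnet header and TSO on a tap fd only takes a few
ioctls. A minimal sketch in C (not the actual net_tap PMD code; the interface
name "tap0" and the error handling are only illustrative):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <linux/virtio_net.h>

int open_tap_with_vnet_hdr(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) {
        return -1;
    }

    memset(&ifr, 0, sizeof ifr);
    /* IFF_VNET_HDR makes the kernel prepend a struct virtio_net_hdr
     * carrying the GSO/csum metadata to every frame. */
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_VNET_HDR;
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    int hdr_len = sizeof(struct virtio_net_hdr);
    unsigned int offload = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6;

    if (ioctl(fd, TUNSETIFF, &ifr) < 0
        || ioctl(fd, TUNSETVNETHDRSZ, &hdr_len) < 0
        || ioctl(fd, TUNSETOFFLOAD, offload) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Usage: int fd = open_tap_with_vnet_hdr("tap0"); */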

Thanks, I learned a lot from these discussions.

I looked at the DPDK net_tap PMD and indeed it doesn't support the virtio net header.
Do you guys think it makes sense to add TSO support to DPDK net_tap?
Or is simply using OVS's current userspace-tso-enable on tap/veth
good enough?
(Using type=system, not the dpdk port type, on tap/veth.)

Regards,
William


I didn't benchmark all types of interfaces, but I'd say that if you
need a more or less high-performance solution for userspace<->kernel
communication, you should probably take a look at virtio-user
ports with the vhost kernel backend:
   https://doc.dpdk.org/guides/howto/virtio_user_as_exceptional_path.html
This should be the fastest and also the most feature-rich solution.
Thanks! I will give it a try.
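From that howto, the exceptional path boils down to attaching a virtio_user
vdev backed by /dev/vhost-net. A minimal sketch of the EAL side in C (the
vdev parameters are taken from the howto; the program name and queue settings
are just placeholders):

#include <stdio.h>
#include <rte_eal.h>
#include <rte_ethdev.h>

int main(void)
{
    /* virtio_user0 is backed by the kernel vhost-net device; the kernel
     * side shows up as a tap netdev, the DPDK side as a normal ethdev. */
    char *argv[] = {
        "virtio-user-demo",
        "--no-pci",
        "--vdev=virtio_user0,path=/dev/vhost-net,queues=1,queue_size=1024",
    };
    int argc = sizeof(argv) / sizeof(argv[0]);

    if (rte_eal_init(argc, argv) < 0) {
        fprintf(stderr, "rte_eal_init failed\n");
        return 1;
    }

    printf("ethdevs available: %u\n", rte_eth_dev_count_avail());
    return 0;
}

(With OVS, I assume the same vdev string would go into options:dpdk-devargs
on a dpdk-type port, though I haven't tried that yet.)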

Tap devices are not designed for high performance in general,
so I wouldn't suggest any of them for highly loaded ports.
If it's only for some small management traffic, it should be fine
to just use the netdev-linux implementation.

That's what I thought until Flavio enabled the vnet header.

netdev-afxdp with pmd or non-pmd modes on veth devices is another
(potentially high-performance) solution.

When testing intra-host container-to-container performance,
the tap device becomes much faster than netdev-afxdp, especially with iperf TCP,
mostly due to the vnet header's TSO and csum offload features.
It's a big limitation of XDP frames that they can't carry a large buffer or
the partial-csum information.
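
For context, the metadata in question is just the struct virtio_net_hdr that
the kernel prepends when IFF_VNET_HDR is set. A short sketch of what a tap
read hands you (the buffer size and the field dump are only illustrative):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <linux/virtio_net.h>

void dump_vnet_hdr(int tap_fd)
{
    /* Large buffer: with TSO a single read can return a ~64K super-frame
     * preceded by the vnet header. */
    unsigned char buf[sizeof(struct virtio_net_hdr) + 65536];
    ssize_t n = read(tap_fd, buf, sizeof buf);
    if (n < (ssize_t) sizeof(struct virtio_net_hdr)) {
        return;
    }

    struct virtio_net_hdr h;
    memcpy(&h, buf, sizeof h);
    /* gso_type/gso_size describe the TSO super-frame; csum_start and
     * csum_offset describe the partial checksum still to be filled in.
     * An XDP frame has no slot for any of this today. */
    printf("gso_type=%u gso_size=%u csum_start=%u csum_offset=%u\n",
           (unsigned) h.gso_type, (unsigned) h.gso_size,
           (unsigned) h.csum_start, (unsigned) h.csum_offset);
}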

I reached the conclusion that for intra-host container-to-container
TCP performance, the configurations rank from fastest to slowest as follows
(ns: namespace):
0) dpdk vhostuser in ns0 -> vhostuser - OVS userspace
   (but requires TCP in userspace and application modification)
1) veth0 in ns0 -> veth with TSO - OVS kernel module - veth with TSO
   -> veth1 in ns1
2) tap0 in ns0 -> virtio_user - OVS userspace - virtio_user -> tap1 in ns1
3) tap0 in ns0 -> recv_tap - OVS with userspace-tso - tap_batch_send
   -> tap1 in ns1
4) veth0 in ns0 -> af_packet sock - OVS with userspace-tso -
   af_packet sock -> veth1 in ns1
5) veth0 in ns0 -> netdev-afxdp - OVS - netdev-afxdp -> veth1 in ns1

I also tested Toshiaki's XDP offload patch:
https://www.mail-archive.com/ovs-dev@openvswitch.org/msg45930.html
I would guess it falls somewhere between 2) and 4).

Note that veth native XDP is fast when all of the packet processing is done in
the XDP world.
That means packets generated in containers are not fast even with native veth
XDP.
The fast case is when a phy device XDP_REDIRECTs frames to a veth, the peer
veth in the ns does something and XDP_TXes the frame, and it is then
XDP_REDIRECTed to another veth pair:

phy --XDP_REDIRECT--> veth-host0 --> veth-ns0 --XDP_TX--> veth-host0
    --XDP_REDIRECT--> veth-host1 --> veth-ns1
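
To make the picture concrete, the two per-hop XDP programs in that path would
look roughly like this (a minimal sketch only; the devmap name and the
assumption that a userspace loader fills it with the egress ifindex are
illustrative):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_DEVMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u32);
} redirect_map SEC(".maps");

/* Attached to the peer veth inside the namespace: bounce the frame
 * straight back out (the XDP_TX hop in the diagram above). */
SEC("xdp")
int xdp_bounce(struct xdp_md *ctx)
{
    return XDP_TX;
}

/* Attached to the host-side veth: forward to the next veth pair via the
 * devmap (the XDP_REDIRECT hops in the diagram above). */
SEC("xdp")
int xdp_forward(struct xdp_md *ctx)
{
    return bpf_redirect_map(&redirect_map, 0, 0);
}

char LICENSE[] SEC("license") = "GPL";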

Having said that, missing TSO is indeed a big limitation.
BTW, there is some progress on TSO in the XDP world...
https://patchwork.kernel.org/project/netdevbpf/cover/cover.1611086134.git.lore...@kernel.org/

Toshiaki Makita