On 19/04/2024 15:06, David Marchand wrote:
> In a typical setup like:
> guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> 
> TSO packets from guest A are segmented against the OVS A physical port
> mtu adjusted by the vxlan tunnel header size, regardless of guest A
> interface mtu.
> 
> As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> OVS A and OVS B physical ports mtu are set to 1600 bytes.
> Guest A will request TCP segmentation for 1448 bytes segments.
> On the other hand, OVS A will request 1498 bytes segments to the HW.
> This results in OVS B dropping packets because decapsulated packets
> are larger than the vhost-user port (serving guest B) mtu.
> 
> 2024-04-17T14:13:01.239Z|00002|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
>       Too big size 1564 max_packet_len 1518
> 
> vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> Use it as a hint.
> 
> This may result in segments (on the wire) slightly shorter than the
> optimal size.
> 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> Signed-off-by: David Marchand <david.march...@redhat.com>
> ---
> Note:
> As we trust the guest with this change, should we put a lower limit on
> mbuf->tso_segsz?
> 

There are some checks I looked at (e.g [0]), but it could be checked
here for an earlier drop i suppose...additional comment below

[0]
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

> ---
>  lib/netdev-dpdk.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 661269e4b6..1dad2ef833 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
> struct rte_mbuf *mbuf)
>  
>      if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
>          struct tcp_header *th = dp_packet_l4(pkt);
> +        uint16_t link_tso_segsz;
>          int hdr_len;
>  
>          if (tunnel_type) {
> -            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> -                              mbuf->l4_len - mbuf->outer_l3_len;
> +            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> +                             mbuf->l4_len - mbuf->outer_l3_len;
>          } else {
>              mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> -            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> +            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> +        }
> +
> +        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {

It seems like something is not right if the flag is set but tso_segsz is
0. It is set by vhost lib from gso_size, but I don't see a validation
there either.

So it could be checked and/or adjusted here for obviously incorrect
values. The only question would be is it the right layer to check these,
or is it more appropriate for it to be done in the drivers.

> +            mbuf->tso_segsz = link_tso_segsz;
>          }
>  
>          hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to