On Wed, May 15, 2024 at 2:09 PM Kevin Traynor <ktray...@redhat.com> wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > In a typical setup like:
> > guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> >
> > TSO packets from guest A are segmented against the OVS A physical port
> > mtu adjusted by the vxlan tunnel header size, regardless of guest A
> > interface mtu.
> >
> > As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> > OVS A and OVS B physical ports mtu are set to 1600 bytes.
> > Guest A will request TCP segmentation for 1448 bytes segments.
> > On the other hand, OVS A will request 1498 bytes segments to the HW.
> > This results in OVS B dropping packets because decapsulated packets
> > are larger than the vhost-user port (serving guest B) mtu.
> >
> > 2024-04-17T14:13:01.239Z|00002|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
> > Too big size 1564 max_packet_len 1518
> >
> > vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> > Use it as a hint.
> >
> > This may result in segments (on the wire) slightly shorter than the
> > optimal size.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> > Signed-off-by: David Marchand <david.march...@redhat.com>
> > ---
> > Note:
> > As we trust the guest with this change, should we put a lower limit on
> > mbuf->tso_segsz?
> >
>
> There are some checks I looked at (e.g. [0]), but it could be checked
> here for an earlier drop I suppose... additional comment below
>
> [0]
> https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754
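To make the numbers in the commit log above easier to follow, here is one way
they add up, assuming IPv4 inside and outside the tunnel, VXLAN over UDP, and
TCP timestamps enabled (the 12 bytes of TCP options and the exact header split
are assumptions on my side, they are not spelled out in the report):

/* Back-of-the-envelope arithmetic behind 1448/1498/1564 above, under the
 * assumptions stated in the text (IPv4 + TCP timestamps, VXLAN over UDP). */
#include <stdio.h>

int
main(void)
{
    int guest_mtu = 1500, phy_mtu = 1600;
    int eth = 14, ipv4 = 20, tcp = 20 + 12;   /* TCP header + timestamp option. */
    int udp = 8, vxlan = 8;

    /* What the guest asks for: its mtu minus inner IP and TCP headers. */
    printf("guest gso_size:   %d\n", guest_mtu - ipv4 - tcp);          /* 1448 */

    /* What OVS A asks the HW for on the vxlan path: physical port mtu minus
     * l2_len (udp + vxlan + inner eth), l3_len, l4_len and outer_l3_len. */
    printf("link tso_segsz:   %d\n",
           phy_mtu - (udp + vxlan + eth) - ipv4 - tcp - ipv4);         /* 1498 */

    /* Decapsulated frame reaching the OVS B vhost-user port, to compare with
     * its max_packet_len of 1518 (1500 + ethernet header + crc). */
    printf("decap frame size: %d\n", 1498 + eth + ipv4 + tcp);         /* 1564 */

    return 0;
}

With the patch, the vhost-provided tso_segsz (1448) wins over the 1498 link
value, so the decapsulated frames fit under the 1518 limit again.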
I guess you meant
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3818
And same in v23.11, there are checks at the prepare stage:
https://git.dpdk.org/dpdk-stable/tree/drivers/net/ice/ice_rxtx.c?h=23.11#n3681

I had forgotten about those checks.
There is no limit exposed per driver from DPDK, so the simplest approach
for OVS is to trust them.

>
> > ---
> >  lib/netdev-dpdk.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 661269e4b6..1dad2ef833 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk
> > *dev, struct rte_mbuf *mbuf)
> >
> >      if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
> >          struct tcp_header *th = dp_packet_l4(pkt);
> > +        uint16_t link_tso_segsz;
> >          int hdr_len;
> >
> >          if (tunnel_type) {
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > -                              mbuf->l4_len - mbuf->outer_l3_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > +                             mbuf->l4_len - mbuf->outer_l3_len;
> >          } else {
> >              mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +        }
> > +
> > +        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
>
> It seems like something is not right if the flag is set but tso_segsz is
> 0. It is set by vhost lib from gso_size, but I don't see a validation
> there either.

At the time I added a check on the 0 value, I thought there was a case
where RTE_MBUF_F_TX_TCP_SEG could be set with no segsz value.
But as you mention, all setters of this flag (either in vhost or in OVS)
set a segsz too.

So with segsz always set, combined with the drivers' checks, OVS probably
does not need any check on tso_segsz.
I intend to remove this check in the next revision.


--
David Marchand
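For reference, the prepare-stage validation pointed at above is roughly of the
following shape. This is a sketch from my reading of the ice driver, not a copy
of it: the MIN/MAX bound values and the helper name are assumptions and should
be checked against the linked sources.

/* Sketch of a driver-side tx_prepare() TSO check, modelled on the ice links
 * above. The bounds and the helper name are assumptions, not the real code. */
#include <stdbool.h>
#include <rte_mbuf.h>

#define DRV_MIN_TSO_MSS 64      /* Assumed lower bound. */
#define DRV_MAX_TSO_MSS 9728    /* Assumed upper bound. */

static bool
drv_tso_segsz_is_valid(const struct rte_mbuf *m)
{
    if (!(m->ol_flags & RTE_MBUF_F_TX_TCP_SEG)) {
        return true;            /* Not a TSO packet, nothing to check. */
    }
    /* A zero or otherwise out-of-range MSS is rejected at the prepare stage,
     * before being programmed into a descriptor. */
    return m->tso_segsz >= DRV_MIN_TSO_MSS &&
           m->tso_segsz <= DRV_MAX_TSO_MSS;
}

In other words, a too small (or zero) tso_segsz coming from the guest would be
caught there, which is why an extra lower-bound check in
netdev_dpdk_prep_hwol_packet() would only duplicate the drop.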