On Wed, May 15, 2024 at 2:09 PM Kevin Traynor <ktray...@redhat.com> wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > In a typical setup like:
> > guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> >
> > TSO packets from guest A are segmented according to the OVS A
> > physical port mtu, adjusted by the vxlan tunnel header size,
> > regardless of the guest A interface mtu.
> >
> > As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> > OVS A and OVS B physical ports mtu are set to 1600 bytes.
> > Guest A will request TCP segmentation into 1448-byte segments.
> > On the other hand, OVS A will request 1498-byte segments from the HW.
> > This results in OVS B dropping packets, because the decapsulated
> > packets are larger than the mtu of the vhost-user port serving
> > guest B:
> >
> > 2024-04-17T14:13:01.239Z|00002|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
> >       Too big size 1564 max_packet_len 1518
> >
> > vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> > Use it as a hint.
> >
> > This may result in segments (on the wire) slightly shorter than the
> > optimal size.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> > Signed-off-by: David Marchand <david.march...@redhat.com>
> > ---
> > Note:
> > As we trust the guest with this change, should we put a lower limit on
> > mbuf->tso_segsz?
> >
>
> There are some checks I looked at (e.g. [0]), but it could be checked
> here for an earlier drop, I suppose... additional comment below.
>
> [0]
> https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

I guess you meant
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3818
And the same exists in v23.11; there are checks at the prepare stage:
https://git.dpdk.org/dpdk-stable/tree/drivers/net/ice/ice_rxtx.c?h=23.11#n3681

I had forgotten about those checks.
DPDK exposes no per-driver limit, so the simplest thing for OVS is to
trust the drivers.
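
For the archives: that prepare-stage validation is what would catch a
bogus guest segsz. From memory, the ice check looks roughly like the
sketch below (constants and exact shape may differ per release, so
treat this as an illustration, not a quote of the tree):

    /* Sketch of the MSS validation in drivers/net/ice/ice_rxtx.c,
     * ice_prep_pkts(); bounds quoted from memory. */
    if ((m->ol_flags & RTE_MBUF_F_TX_TCP_SEG) &&
        (m->tso_segsz < ICE_MIN_TSO_MSS ||    /* 64 */
         m->tso_segsz > ICE_MAX_TSO_MSS)) {   /* 9728 */
        /* The burst is cut short at this mbuf and rte_errno
         * tells the caller why. */
        rte_errno = EINVAL;
        return i;
    }

So an out-of-range segsz is rejected in rte_eth_tx_prepare(), before
the HW ever sees the packet.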


>
> > ---
> >  lib/netdev-dpdk.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 661269e4b6..1dad2ef833 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >
> >      if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
> >          struct tcp_header *th = dp_packet_l4(pkt);
> > +        uint16_t link_tso_segsz;
> >          int hdr_len;
> >
> >          if (tunnel_type) {
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > -                              mbuf->l4_len - mbuf->outer_l3_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > +                             mbuf->l4_len - mbuf->outer_l3_len;
> >          } else {
> >              mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +        }
> > +
> > +        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
>
> It seems like something is not right if the flag is set but tso_segsz
> is 0. It is set by the vhost lib from gso_size, but I don't see a
> validation there either.

When I added the check on the 0 value, I thought there was a case
where RTE_MBUF_F_TX_TCP_SEG could be set with no segsz value.
But as you mention, all setters of this flag (whether in vhost or in
OVS) set a segsz too.
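
For reference, the vhost side is vhost_dequeue_offload() in
lib/vhost/virtio_net.c; paraphrasing from memory, the guest-provided
gso_size is copied as-is:

    /* Paraphrased sketch of vhost_dequeue_offload(),
     * lib/vhost/virtio_net.c: no validation of gso_size here. */
    switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
    case VIRTIO_NET_HDR_GSO_TCPV4:
    case VIRTIO_NET_HDR_GSO_TCPV6:
        m->ol_flags |= RTE_MBUF_F_TX_TCP_SEG;
        m->tso_segsz = hdr->gso_size;
        break;
    }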

So with segsz always set, combined with the drivers' prepare-stage
checks, OVS probably does not need any check on tso_segsz.
I intend to remove this check in the next revision.
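
With the 0 check gone, the hunk above would boil down to something
like this (hypothetical shape of the next revision, untested):

    /* Cap the guest-provided segsz to what the link mtu allows;
     * the drivers' tx_prepare checks handle the rest. */
    if (mbuf->tso_segsz > link_tso_segsz) {
        mbuf->tso_segsz = link_tso_segsz;
    }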


-- 
David Marchand
