On Fri, Jul 29, 2022 at 10:38:36PM +0200, Ilya Maximets wrote: > OVS kernel datapath and TC are parsing IPv6 fragments differently. > For IPv6 later fragments, according to the original design [1], OVS > always sets nw_proto=44 (IPPROTO_FRAMENT), regardless of the type > of the L4 protocol. > > This leads to situation where flow for nw_proto=44 gets installed > to TC, but packets can not match on it, causing all IPv6 later > fragments to go to userspace significantly degrading performance. > > Disabling offload for such packets, so the flow can be installed > to the OVS kernel datapath instead. Disabling for all IPv6 fragments > including the first one, because it doesn't make a lot of sense to > handle them separately. It may also cause potential problems with > conntrack trying to re-assemble a packet from fragments handled by > different datapaths (first in HW, later in OVS kernel). > > Checking both 'nw_proto' and 'nw_frag' as classifier might decide > to match only on one of them and also nw_proto will not be 44 for > the first fragment. > > The issue was hidden for some time due to incorrect behavior of the > OVS kernel datapath that was recently fixed in kernel commit: > > 12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments") > > To allow offloading in the future either flow dissector in TC > should be changed to parse packets in the same way as OVS does, > or parsing in OVS kernel and userspace should be made configurable, > so users can opt-in to the behavior change. Silent change of the > behavior (change by default) is not an option, because existing > OpenFlow pipelines may depend on a certain behavior. > > [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments > > Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation") > Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
Reviewed-by: Marcelo Ricardo Leitner <mleit...@redhat.com> > --- > lib/netdev-offload-tc.c | 22 +++++++++++++++++++++- > 1 file changed, 21 insertions(+), 1 deletion(-) > > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c > index 29d0e63eb..9d37881a1 100644 > --- a/lib/netdev-offload-tc.c > +++ b/lib/netdev-offload-tc.c > @@ -1459,6 +1459,21 @@ parse_put_flow_set_action(struct tc_flower *flower, > struct tc_action *action, > return 0; > } > > +static bool > +is_ipv6_fragment_and_masked(const struct flow *key, const struct flow *mask) > +{ > + if (key->dl_type != htons(ETH_P_IPV6)) { > + return false; > + } > + if (mask->nw_proto && key->nw_proto == IPPROTO_FRAGMENT) { > + return true; > + } > + if (key->nw_frag & (mask->nw_frag & FLOW_NW_FRAG_ANY)) { > + return true; > + } > + return false; > +} > + > static int > test_key_and_mask(struct match *match) > { > @@ -1541,6 +1556,11 @@ test_key_and_mask(struct match *match) > return EOPNOTSUPP; > } > > + if (is_ipv6_fragment_and_masked(key, mask)) { > + VLOG_DBG_RL(&rl, "offloading of IPv6 fragments isn't supported"); > + return EOPNOTSUPP; > + } > + > if (!is_all_zeros(mask, sizeof *mask)) { > VLOG_DBG_RL(&rl, "offloading isn't supported, unknown attribute"); > return EOPNOTSUPP; > @@ -2099,7 +2119,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match > *match, > memset(&mask->arp_tha, 0, sizeof mask->arp_tha); > } > > - if (is_ip_any(key)) { > + if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) { > flower.key.ip_proto = key->nw_proto; > flower.mask.ip_proto = mask->nw_proto; > mask->nw_proto = 0; > -- > 2.34.3 > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev