On Fri, Jul 29, 2022 at 10:38:36PM +0200, Ilya Maximets wrote:
> OVS kernel datapath and TC are parsing IPv6 fragments differently.
> For IPv6 later fragments, according to the original design [1], OVS
> always sets nw_proto=44 (IPPROTO_FRAMENT), regardless of the type
> of the L4 protocol.
>
> This leads to situation where flow for nw_proto=44 gets installed
> to TC, but packets can not match on it, causing all IPv6 later
> fragments to go to userspace significantly degrading performance.
>
> Disabling offload for such packets, so the flow can be installed
> to the OVS kernel datapath instead.  Disabling for all IPv6 fragments
> including the first one, because it doesn't make a lot of sense to
> handle them separately.  It may also cause potential problems with
> conntrack trying to re-assemble a packet from fragments handled by
> different datapaths (first in HW, later in OVS kernel).
>
> Checking both 'nw_proto' and 'nw_frag' as classifier might decide
> to match only on one of them and also nw_proto will not be 44 for
> the first fragment.
>
> The issue was hidden for some time due to incorrect behavior of the
> OVS kernel datapath that was recently fixed in kernel commit:
>
>  12378a5a75e3 ("net: openvswitch: fix parsing of nw_proto for IPv6 fragments")
>
> To allow offloading in the future either flow dissector in TC
> should be changed to parse packets in the same way as OVS does,
> or parsing in OVS kernel and userspace should be made configurable,
> so users can opt-in to the behavior change.  Silent change of the
> behavior (change by default) is not an option, because existing
> OpenFlow pipelines may depend on a certain behavior.
>
> [1] https://docs.openvswitch.org/en/latest/topics/design/#fragments
>
> Fixes: 83e866067ea6 ("netdev-tc-offloads: Add support for IP fragmentation")
> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>

Reviewed-by: Marcelo Ricardo Leitner <mleit...@redhat.com>

> ---
>  lib/netdev-offload-tc.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 29d0e63eb..9d37881a1 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1459,6 +1459,21 @@ parse_put_flow_set_action(struct tc_flower *flower, 
> struct tc_action *action,
>      return 0;
>  }
>
> +static bool
> +is_ipv6_fragment_and_masked(const struct flow *key, const struct flow *mask)
> +{
> +  if (key->dl_type != htons(ETH_P_IPV6)) {
> +      return false;
> +  }
> +  if (mask->nw_proto && key->nw_proto == IPPROTO_FRAGMENT) {
> +      return true;
> +  }
> +  if (key->nw_frag & (mask->nw_frag & FLOW_NW_FRAG_ANY)) {
> +      return true;
> +  }
> +  return false;
> +}
> +
>  static int
>  test_key_and_mask(struct match *match)
>  {
> @@ -1541,6 +1556,11 @@ test_key_and_mask(struct match *match)
>          return EOPNOTSUPP;
>      }
>
> +    if (is_ipv6_fragment_and_masked(key, mask)) {
> +        VLOG_DBG_RL(&rl, "offloading of IPv6 fragments isn't supported");
> +        return EOPNOTSUPP;
> +    }
> +
>      if (!is_all_zeros(mask, sizeof *mask)) {
>          VLOG_DBG_RL(&rl, "offloading isn't supported, unknown attribute");
>          return EOPNOTSUPP;
> @@ -2099,7 +2119,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct match 
> *match,
>              memset(&mask->arp_tha, 0, sizeof mask->arp_tha);
>      }
>
> -    if (is_ip_any(key)) {
> +    if (is_ip_any(key) && !is_ipv6_fragment_and_masked(key, mask)) {
>          flower.key.ip_proto = key->nw_proto;
>          flower.mask.ip_proto = mask->nw_proto;
>          mask->nw_proto = 0;
> --
> 2.34.3
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to