On 8/5/22 13:00, Roi Dayan wrote: > > > On 2022-08-04 6:20 PM, Ilya Maximets wrote: >> On 8/2/22 09:13, Roi Dayan wrote: >>> >>> >>> On 2022-08-02 9:53 AM, Roi Dayan wrote: >>>> >>>> >>>> On 2022-08-01 9:31 AM, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2022-07-29 5:53 PM, Ilya Maximets wrote: >>>>>> Current offloading code supports only limited number of tunnel keys >>>>>> and silently ignores everything it doesn't understand. This is >>>>>> causing, for example, offloaded ERSPAN tunnels to not work, because >>>>>> flow is offloaded, but ERSPAN options are not provided to TC. >>>>>> >>>>>> There is a number of tunnel keys, which are supported by the userspace, >>>>>> but silently ignored during offloading: >>>>>> >>>>>> OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT >>>>>> OVS_TUNNEL_KEY_ATTR_OAM >>>>>> OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS >>>>>> OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS >>>>>> >>>>>> OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions >>>>>> and for some reason is set from the tunnel port instead of the >>>>>> provided action, and not currently supported for the tunnel key in >>>>>> the match. >>>>>> >>>>>> Addig a default case to fail offloading of unknown attributes. For >>>>>> now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag, >>>>>> otherwise we'll break all tunnel offloading by default. VXLAN and >>>>>> ERSPAN options has to fail offloading, because the tunnel will not >>>>>> work otherwise. OAM is not a default configurations, so failing it >>>>>> as well. The missing DONT_FRAGMENT flag though should, probably, >>>>>> cause frequent flow revalidation, but that is not new with this patch. >>>>>> >>>>>> Same for the 'match' key, only clearing masks that was actually >>>>>> consumed, except for the DONT_FRAGMENT and CSUM flags, which are >>>>>> explicitly allowed and highlighted as broken. >>>>>> >>>>>> Also, destination port as well as CSUM configuration for unknown >>>>>> reason was not taken from the actions list and were passed via HW >>>>>> offload info instead of being consumed from the set() action. >>>>>> >>>>>> Reported-at: >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2022-July%2F395522.html&data=05%7C01%7Croid%40nvidia.com%7C0b0fe4ddff284d603b3008da762cd0e0%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637952232106012230%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=uhL1ll5%2FnI1e%2F9OrcSGVfECBHL4r%2B%2FRg1VIcoyy3zas%3D&reserved=0 >>>>>> Reported-by: Eelco Chaudron <echau...@redhat.com> >>>>>> Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put >>>>>> using tc interface") >>>>>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org> >>>>>> --- >>>>>> lib/dpif-netlink.c | 14 +--------- >>>>>> lib/netdev-offload-tc.c | 61 ++++++++++++++++++++++++++++++++++++----- >>>>>> lib/netdev-offload.h | 3 -- >>>>>> 3 files changed, 55 insertions(+), 23 deletions(-) >>>>>> >>>>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c >>>>>> index a498d5667..d9243a735 100644 >>>>>> --- a/lib/dpif-netlink.c >>>>>> +++ b/lib/dpif-netlink.c >>>>>> @@ -2237,8 +2237,6 @@ parse_flow_put(struct dpif_netlink *dpif, struct >>>>>> dpif_flow_put *put) >>>>>> size_t left; >>>>>> struct netdev *dev; >>>>>> struct offload_info info; >>>>>> - ovs_be16 dst_port = 0; >>>>>> - uint8_t csum_on = false; >>>>>> int err; >>>>>> info.tc_modify_flow_deleted = false; >>>>>> @@ -2258,10 +2256,9 @@ parse_flow_put(struct dpif_netlink *dpif, struct >>>>>> dpif_flow_put *put) >>>>>> return EOPNOTSUPP; >>>>>> } >>>>>> - /* Get tunnel dst port */ >>>>>> + /* Check the output port for a tunnel. */ >>>>>> NL_ATTR_FOR_EACH(nla, left, put->actions, put->actions_len) { >>>>>> if (nl_attr_type(nla) == OVS_ACTION_ATTR_OUTPUT) { >>>>>> - const struct netdev_tunnel_config *tnl_cfg; >>>>>> struct netdev *outdev; >>>>>> odp_port_t out_port; >>>>>> @@ -2271,19 +2268,10 @@ parse_flow_put(struct dpif_netlink *dpif, struct >>>>>> dpif_flow_put *put) >>>>>> err = EOPNOTSUPP; >>>>>> goto out; >>>>>> } >>>>>> - tnl_cfg = netdev_get_tunnel_config(outdev); >>>>>> - if (tnl_cfg && tnl_cfg->dst_port != 0) { >>>>>> - dst_port = tnl_cfg->dst_port; >>>>>> - } >>>>>> - if (tnl_cfg) { >>>>>> - csum_on = tnl_cfg->csum; >>>>>> - } >>>>>> netdev_close(outdev); >>>>>> } >>>>>> } >>>>>> - info.tp_dst_port = dst_port; >>>>>> - info.tunnel_csum_on = csum_on; >>>>>> info.recirc_id_shared_with_tc = (dpif->user_features >>>>>> & OVS_DP_F_TC_RECIRC_SHARING); >>>>>> err = netdev_flow_put(dev, &match, >>>>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c >>>>>> index a3eee8df3..57385597a 100644 >>>>>> --- a/lib/netdev-offload-tc.c >>>>>> +++ b/lib/netdev-offload-tc.c >>>>>> @@ -1389,9 +1389,11 @@ static int >>>>>> parse_put_flow_set_action(struct tc_flower *flower, struct tc_action >>>>>> *action, >>>>>> const struct nlattr *set, size_t set_len) >>>>>> { >>>>>> + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20); >>>>>> const struct nlattr *tunnel; >>>>>> const struct nlattr *tun_attr; >>>>>> size_t tun_left, tunnel_len; >>>>>> + bool attr_csum_present; >>>>>> if (nl_attr_type(set) == OVS_KEY_ATTR_MPLS) { >>>>>> return parse_mpls_set_action(flower, action, set); >>>>>> @@ -1405,6 +1407,7 @@ parse_put_flow_set_action(struct tc_flower >>>>>> *flower, struct tc_action *action, >>>>>> tunnel = nl_attr_get(set); >>>>>> tunnel_len = nl_attr_get_size(set); >>>>>> + attr_csum_present = false; >>>>>> action->type = TC_ACT_ENCAP; >>>>>> action->encap.id_present = false; >>>>>> flower->action_count++; >>>>>> @@ -1431,6 +1434,19 @@ parse_put_flow_set_action(struct tc_flower >>>>>> *flower, struct tc_action *action, >>>>>> action->encap.ttl = nl_attr_get_u8(tun_attr); >>>>>> } >>>>>> break; >>>>>> + case OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT: { >>>>>> + /* XXX: This is wrong! We're ignoring the DF flag >>>>>> configuration >>>>>> + * requested by the user. However, TC for now has no way >>>>>> to pass >>>>>> + * that flag and it is set by default, meaning tunnel >>>>>> offloading >>>>>> + * will not work if 'options:df_default=false' is not set. >>>>>> + * Keeping incorrect behavior for now. */ >>>>>> + } >>>>>> + break; >>>>>> + case OVS_TUNNEL_KEY_ATTR_CSUM: { >>>>>> + attr_csum_present = true; >>>>>> + action->encap.no_csum = !nl_attr_get_u8(tun_attr); >>>>>> + } >>>>>> + break; >>>>>> case OVS_TUNNEL_KEY_ATTR_IPV6_SRC: { >>>>>> action->encap.ipv6.ipv6_src = >>>>>> nl_attr_get_in6_addr(tun_attr); >>>>>> @@ -1455,9 +1471,17 @@ parse_put_flow_set_action(struct tc_flower >>>>>> *flower, struct tc_action *action, >>>>>> action->encap.data.present.len = >>>>>> nl_attr_get_size(tun_attr); >>>>>> } >>>>>> break; >>>>>> + default: >>>>>> + VLOG_DBG_RL(&rl, "unsupported tunnel key attribute %d", >>>>>> + nl_attr_type(tun_attr)); >>>>>> + return EOPNOTSUPP; >>>>>> } >>>>>> } >>>>>> + if (!attr_csum_present) { >>>>>> + action->encap.no_csum = 1; >>>>>> + } >>>>>> + >>>>>> return 0; >>>>>> } >>>>>> @@ -1553,7 +1577,7 @@ test_key_and_mask(struct match *match) >>>>>> static void >>>>>> flower_match_to_tun_opt(struct tc_flower *flower, const struct >>>>>> flow_tnl *tnl, >>>>>> - const struct flow_tnl *tnl_mask) >>>>>> + struct flow_tnl *tnl_mask) >>>>>> { >>>>>> struct geneve_opt *opt, *opt_mask; >>>>>> int len, cnt = 0; >>>>>> @@ -1579,6 +1603,10 @@ flower_match_to_tun_opt(struct tc_flower *flower, >>>>>> const struct flow_tnl *tnl, >>>>>> /* Copying from the key and not from the mask, since in the >>>>>> 'flower' >>>>>> * the length for a mask is not a mask, but the actual length. */ >>>>>> flower->mask.tunnel.metadata.present.len = >>>>>> tnl->metadata.present.len; >>>>>> + >>>>>> + memset(tnl_mask->metadata.opts.gnv, 0, tnl->metadata.present.len); >>>>>> + memset(&tnl_mask->metadata.present.len, 0, >>>>>> + sizeof tnl_mask->metadata.present.len); >>>>>> } >>>>>> static void >>>>>> @@ -1867,10 +1895,6 @@ netdev_tc_parse_nl_actions(struct netdev *netdev, >>>>>> struct tc_flower *flower, >>>>>> if (err) { >>>>>> return err; >>>>>> } >>>>>> - if (action->type == TC_ACT_ENCAP) { >>>>>> - action->encap.tp_dst = info->tp_dst_port; >>>>>> - action->encap.no_csum = !info->tunnel_csum_on; >>>>>> - } >>>>>> } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SET_MASKED) { >>>>>> const struct nlattr *set = nl_attr_get(nla); >>>>>> const size_t set_len = nl_attr_get_size(nla); >>>>>> @@ -1946,7 +1970,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct >>>>>> match *match, >>>>>> const struct flow *key = &match->flow; >>>>>> struct flow *mask = &match->wc.masks; >>>>>> const struct flow_tnl *tnl = &match->flow.tunnel; >>>>>> - const struct flow_tnl *tnl_mask = &mask->tunnel; >>>>>> + struct flow_tnl *tnl_mask = &mask->tunnel; >>>>>> bool recirc_act = false; >>>>>> uint32_t block_id = 0; >>>>>> struct tcf_id id; >>>>>> @@ -1984,6 +2008,7 @@ netdev_tc_flow_put(struct netdev *netdev, struct >>>>>> match *match, >>>>>> flower.key.tunnel.ttl = tnl->ip_ttl; >>>>>> flower.key.tunnel.tp_src = tnl->tp_src; >>>>>> flower.key.tunnel.tp_dst = tnl->tp_dst; >>>>>> + >>>>>> flower.mask.tunnel.ipv4.ipv4_src = tnl_mask->ip_src; >>>>>> flower.mask.tunnel.ipv4.ipv4_dst = tnl_mask->ip_dst; >>>>>> flower.mask.tunnel.ipv6.ipv6_src = tnl_mask->ipv6_src; >>>>>> @@ -1996,10 +2021,32 @@ netdev_tc_flow_put(struct netdev *netdev, struct >>>>>> match *match, >>>>>> * Degrading the flow down to exact match for now as a >>>>>> workaround. */ >>>>>> flower.mask.tunnel.tp_dst = OVS_BE16_MAX; >>>>>> flower.mask.tunnel.id = (tnl->flags & FLOW_TNL_F_KEY) ? >>>>>> tnl_mask->tun_id : 0; >>>>>> + >>>>>> + memset(&tnl_mask->ip_src, 0, sizeof tnl_mask->ip_src); >>>>>> + memset(&tnl_mask->ip_dst, 0, sizeof tnl_mask->ip_dst); >>>>>> + memset(&tnl_mask->ipv6_src, 0, sizeof tnl_mask->ipv6_src); >>>>>> + memset(&tnl_mask->ipv6_dst, 0, sizeof tnl_mask->ipv6_dst); >>>>>> + memset(&tnl_mask->ip_tos, 0, sizeof tnl_mask->ip_tos); >>>>>> + memset(&tnl_mask->ip_ttl, 0, sizeof tnl_mask->ip_ttl); >>>>>> + memset(&tnl_mask->tp_dst, 0, sizeof tnl_mask->tp_dst); >>>>>> + >>>>>> + memset(&tnl_mask->tun_id, 0, sizeof tnl_mask->tun_id); >>>>>> + tnl_mask->flags &= ~FLOW_TNL_F_KEY; >>>>>> + >>>>>> + /* XXX: This is wrong! We're ignoring DF and CSUM flags >>>>>> configuration >>>>>> + * requested by the user. However, TC for now has no way to >>>>>> pass >>>>>> + * these flags in a flower key and their masks are set by >>>>>> default, >>>>>> + * meaning tunnel offloading will not work at all if not >>>>>> cleared. >>>>>> + * Keeping incorrect behavior for now. */ >>>>>> + tnl_mask->flags &= ~(FLOW_TNL_F_DONT_FRAGMENT | >>>>>> FLOW_TNL_F_CSUM); >>>>>> + >>>>>> flower_match_to_tun_opt(&flower, tnl, tnl_mask); >>>>>> flower.tunnel = true; >>>>>> + } else { >>>>>> + /* There is no tunnel metadata to match on, but there could be >>>>>> some >>>>>> + * mask bits set due to flow translation artifacts. Clear >>>>>> them. */ >>>>>> + memset(&mask->tunnel, 0, sizeof mask->tunnel); >>>>>> } >>>>>> - memset(&mask->tunnel, 0, sizeof mask->tunnel); >>>>>> flower.key.eth_type = key->dl_type; >>>>>> flower.mask.eth_type = mask->dl_type; >>>>>> diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h >>>>>> index 249a3102a..180d3f95f 100644 >>>>>> --- a/lib/netdev-offload.h >>>>>> +++ b/lib/netdev-offload.h >>>>>> @@ -66,9 +66,6 @@ struct netdev_flow_dump { >>>>>> /* Flow offloading. */ >>>>>> struct offload_info { >>>>>> - ovs_be16 tp_dst_port; /* Destination port for tunnel in SET action >>>>>> */ >>>>>> - uint8_t tunnel_csum_on; /* Tunnel header with checksum */ >>>>>> - >>>>>> bool recirc_id_shared_with_tc; /* Indicates whever tc chains >>>>>> will be in >>>>>> * sync with datapath recirc ids. >>>>>> */ >>>>> >>>>> Reviewed-by: Roi Dayan <r...@nvidia.com> >>>> >>>> Hi, >>>> >>>> Maybe replied too soon. I tested manually vxlan but also added >>>> the patches for our verification team to run their regression >>>> at night and they failed using geneve tunnel for mask not being >>>> zero. >>>> >>>> Checking it now. >>>> >>>> Thanks, >>>> Roi >>> >>> ok found the issue. >>> the patch is missing removing the FLOW_TNL_F_UDPIF flag when parsing >>> tunnel options. >>> this fixes the issue. >>> our verification team didnt encounter any other issue. >>> >>> >>> @@ -1621,6 +1633,7 @@ flower_match_to_tun_opt(struct tc_flower *flower, >>> const struct flow_tnl *tnl, >>> memset(tnl_mask->metadata.opts.gnv, 0, tnl->metadata.present.len); >>> memset(&tnl_mask->metadata.present.len, 0, >>> sizeof tnl_mask->metadata.present.len); >>> + tnl_mask->flags &= ~FLOW_TNL_F_UDPIF; >>> } >>> >>> >> >> Good point. It looks like we also need to reject offload if >> that flag is not set, right? This way we will actually consume >> it instead of clearing blindly. >> >> Something like this: >> >> static void >> flower_match_to_tun_opt(struct tc_flower *flower, const struct flow_tnl *tnl, >> struct flow_tnl *tnl_mask) >> { >> struct geneve_opt *opt, *opt_mask; >> int len, cnt = 0; >> >> if (tnl->flags & (tnl_mask->flags & FLOW_TNL_F_UDPIF)) { >> tnl_mask->flags &= ~FLOW_TNL_F_UDPIF; >> } else { >> return; >> } >> >> ... >> } >> >> What do you think? >> >> Best regards, Ilya Maximets. > > hmm to catch if flag was mistakenly set but no options? > it looks strange. I see flower_tun_opt_to_match() set it when > parsing flower back to ovs match and being called > if flower->mask.tunnel.metadata.present.len is not 0. > can we do the same here? >
Thanks for pointing that out. There are couple of more issues here though regarding a match on zero-length options. I don't think these are handled correctly in the current code. Please, take a look at v3, I tried to re-work that part a little. Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev