>-----Original Message----- >From: Finn, Emma <emma.f...@intel.com> >Sent: Thursday, August 13, 2020 11:47 AM >To: Eli Britstein <el...@nvidia.com>; Ilya Maximets <i.maxim...@ovn.org>; Xing, >Beilei <beilei.x...@intel.com>; Stokes, Ian <ian.sto...@intel.com>; Eli >Britstein ><el...@mellanox.com>; d...@openvswitch.org; Guo, Jia <jia....@intel.com> >Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching >HWOL for XL710 NIC > >> -----Original Message----- >> From: Eli Britstein <el...@nvidia.com> >> Sent: Tuesday 11 August 2020 16:49 >> To: Ilya Maximets <i.maxim...@ovn.org>; Finn, Emma >> <emma.f...@intel.com>; Xing, Beilei <beilei.x...@intel.com>; Stokes, >> Ian <ian.sto...@intel.com>; Eli Britstein <el...@mellanox.com>; >> d...@openvswitch.org; Guo, Jia <jia....@intel.com> >> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet >> matching HWOL for XL710 NIC >> >> >> >> >-----Original Message----- >> >From: Ilya Maximets <i.maxim...@ovn.org> >> >Sent: Tuesday, August 11, 2020 4:19 PM >> >To: Finn, Emma <emma.f...@intel.com>; Ilya Maximets >> ><i.maxim...@ovn.org>; Xing, Beilei <beilei.x...@intel.com>; Stokes, >> >Ian <ian.sto...@intel.com>; Eli Britstein <el...@mellanox.com>; >> >d...@openvswitch.org; Guo, Jia <jia....@intel.com> >> >Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet >> >matching HWOL for XL710 NIC >> > >> >On 8/11/20 3:12 PM, Finn, Emma wrote: >> >> >> >> >> >>> -----Original Message----- >> >>> From: Ilya Maximets <i.maxim...@ovn.org> >> >>> Sent: Tuesday 11 August 2020 11:02 >> >>> To: Xing, Beilei <beilei.x...@intel.com>; Stokes, Ian >> >>> <ian.sto...@intel.com>; Eli Britstein <el...@mellanox.com>; Finn, >> >>> Emma <emma.f...@intel.com>; d...@openvswitch.org; Guo, Jia >> >>> <jia....@intel.com> >> >>> Cc: i.maxim...@ovn.org >> >>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet >> >>> matching HWOL for XL710 NIC >> >>> >> >>> On 8/11/20 5:35 AM, Xing, Beilei wrote: >> >>>> >> >>>> >> >>>>> -----Original Message----- >> >>>>> From: Stokes, Ian <ian.sto...@intel.com> >> >>>>> Sent: Tuesday, August 11, 2020 4:52 AM >> >>>>> To: Eli Britstein <el...@mellanox.com>; Finn, Emma >> >>>>> <emma.f...@intel.com>; d...@openvswitch.org; Xing, Beilei >> >>>>> <beilei.x...@intel.com>; Guo, Jia <jia....@intel.com> >> >>>>> Cc: i.maxim...@ovn.org >> >>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken >> >>>>> ethernet matching HWOL for XL710 NIC >> >>>>> >> >>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote: >> >>>>>>> >> >>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote: >> >>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote: >> >>>>>>>>>> The following 2 commits introduced changes which caused a >> >>>>>>>>>> regression for XL710 devices and functionality ceases for >> >>>>>>>>>> partial offload as a result. >> >>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching >> >>>>>>>>>> for type >> >>>>>>>>>> only.") >> >>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of >> >>>>>> patterns >> >>>>>>>>>> function.") >> >>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in >> >>>>>>>>> intel PMD in dpdk, not in OVS. >> >>>>>>>> Hi Eli, >> >>>>>>>> >> >>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe >> >>>>>>>> was already in place and working for this usecase. As such >> >>>>>>>> it's removal introduced a regression from an OVS point of >> >>>>>>>> view between >> >>> the releases. >> >>>>>>>> >> >>>>>>>> We have had examples in the past where workarounds are >> >>> permissible >> >>>>>>>> if there is a clear path to fixing this in the future on the >> >>>>>>>> DPDK side (which is what I suggest here) (for example scatter >> >>>>>>>> gather support for MTUs in the past raised similar issue >> >>>>>>>> where we hand to handle specific NIC until the next DPDK LTS >release). >> >>>>>>>> >> >>>>>>>> So my suggestion is to re-instate the original workaround and >> >>>>>>>> remove its when fixed in the next DPDK LTS which supports the >> >>>>>>>> change for i40e at the PMD layer or if it's backported to the >> >>>>>>>> next >> >>>>>>>> 19.11 stable release which would be validated for use with OVS. >> >>>>>>> >> >>>>>>> Hi, >> >>>>>>> >> >>>>>>> There was a bug with this WA. >> >>>>>>> >> >>>>>>> Please see >> >>>>>>> >> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25 >> >>>>>>> 2 >> >>>>>>> >> >Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026 >> >>>>>>> 6- >> >>&data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08 >> >>>>>>> >> >>d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C63732 >> 7 >> >487 >> >>>>>>> >> >>263661244&sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69% >> 2 >> >B9kb >> >>>>>>> NM%3D&reserved=0 >> >>>>>> 28632-1-git-send-email-jackerd...@gmail.com/. >> >>> >> >>> I'm worried about this bug. Current version of a workaround is >> >>> not correct from the OVS point of view since it wildcards dl_type >> >>> during offloading that is not expected by higher layers, causing >> >>> HW flow being much more wide than requested. In case we going to >> >>> have this workaround for xl710, we should have another workaround >> >>> for this >> issue too. >> >>> >> >>> >> >>>>>>> >> >>>>>>> Is it possible to address it in DPDK instead of reverting in >> >>>>>>> OVS and later re-reverting? >> >>>>>>> >> >>>>>>> Thanks, >> >>>>>>> >> >>>>>>> Eli >> >>>>>>> >> >>>>>>>> >> >>>>>>>> I've included the i40e DPDK maintainers here for their >> >>>>>>>> thoughts >> also. >> >>>>>>>> >> >>>>>>>> @Beilei/Jia Is this something we can look at to introduce in >> >>>>>>>> the i40e PMD? >> >>>>>> >> >>>>>> Hello, >> >>>>>> >> >>>>>> Please take a look at: >> >>>>>> >> >>>>>> >> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F >> >>>>>> m >> >>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020- >> >May%2F166272.html&data=0 >> >>>>>> >> >>2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9% >> 7 >> >C4 >> >>>>>> >> >>3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244 >& >> a >> >mp;s >> >>>>>> >> >data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&reserv >> ed= >> >0 >> >>>>>> >> >>>>>> For MLX it was only an optimization. For i40e something similar >> >>>>>> may be a workaround for this issue. >> >>>>> >> >>>>> Thanks for this Eli, let me sync with Beilei on this. >> >>>>> >> >>>>> If it's something we can resolve in the PMD then I think we can >> >>>>> add an errata or known issue for the 2.14 release (and possibly >> >>>>> the >> >>>>> 2.13 release as I think the same issue is present there). >> >>>>> >> >>>>> If it was fixed in the future we could then remove the issue >> >>>>> notice.> >> >>>> >> >>>> Hi, >> >>>> >> >>>> According to the OVS patch and mlx DPDK patch, is the requirement >> >>>> to support rte flow pattern like 'pattern IPv4 / UDP src is 32 / >> >>>> end', no need to use ''pattern ETH / IPv4 / UDP src is 32 / end '? >> >>>> please correct me if I'm wrong. >> >>>> >> >>>> If yes, could you please tell me how OVS adds a flow which >> >>>> doesn’t include ETH item? I'm not very familiar with OVS, I run >> >>>> some simple commands before, and find eth will always exist. E.g: >> >>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224 >> >>>> skb_priority(0),skb_mark(0),\ >> >>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0 >> >>>> ), >> >>>> c >> >>>> t_ >> >>>> label(0),recirc_id(0),\ >> >>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba: >> >>>> 80 >> >>>> ,dst=ff:ff:ff:ff:ff:ff), >> >>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,to >> >>>> s= >> >>>> 0 >> >>>> x1 >> >>>> 0,ttl=128,frag=no), >> >>>> udp(src=68,dst=67), actions:drop >> >>> >> >>> Emma, Ian, could you please provide the pattern that fails to >> >>> offload on current OVS master so it will be easier for everyone to >> >>> understand the >> >issue. >> >>> It's not obvious how to construct the bad pattern by only looking >> >>> at the i40e pmd code. Also, please, enable debug logs and provide >> >>> the testpmd-style arguments constructed by OVS. >> >>> >> >>> It might be also good to have all this information in the commit message. >> >>> >> >> >> >> Following flow pattern is failing from logs: >> >> >> >> Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte >> >> flow eth pattern: >> >> Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800 >> >> Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff >> >> rte flow ipv4 pattern: >> >> Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0 >> >> Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0 >> >> rte flow count action: >> >> Count: shared=0, id=0 >> >> rte flow port-id action: >> >> Port-id: original=0, id=1 >> >> 2020-08- >> >11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR >> N| >> >dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.). >> Hi >> >> Maybe another workaround until fixed in i40e PMD is to mask out the >> ether_type match if there is IPv4 or IPv6. >> It means that patterns of: >> eth type is 0x0800 / ipv4 / ... >> will be >> eth / ipv4 / ... >> >> For IPv6 the same, but for other ether types no removal of the match. >> eth type is 0x1234 / >> kept the same. >> >> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an >> optimization, for XL710 it can be used as a workaround. >> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails >> .dpdk.org%2Farchives%2Fdev%2F2020- >May%2F166272.html&data=02%7C01%7 >> >Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d15 >72734 >> >0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&sdata=D2H >XS8Wo9Si >> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&reserved=0 >> >> A similar masking out of protocol type is done today for >> TCP/UDP/SCTP/ICMP. For example: >> /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */ >> if (next_proto_mask) { >> *next_proto_mask = 0; >> } >> Please take a look at >> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc >> >hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718.3 >&am >> >p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f6 >57fdd >> >%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016 >934&s >> >data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&reserved >=0 >> 8633-3-sriharsha.basavapa...@broadcom.com/ >> That posted commit suggest not to mask out the protocol type and let >> the PMDs do it if they wish. >> >> The proposed workaround (not tested): >> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c >> index de6101e4d..1af7bebcd 100644 >> --- a/lib/netdev-offload-dpdk.c >> +++ b/lib/netdev-offload-dpdk.c >> @@ -676,6 +676,7 @@ static int >> parse_flow_match(struct flow_patterns *patterns, >> struct match *match) { >> + uint8_t *next_eth_type_mask = NULL; >> uint8_t *next_proto_mask = NULL; >> struct flow *consumed_masks; >> uint8_t proto = 0; >> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns *patterns, >> consumed_masks->dl_type = 0; >> >> add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, >> mask); >> + next_eth_type_mask = &mask->type; >> } >> >> /* VLAN */ >> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns *patterns, >> if (match->flow.dl_type == htons(ETH_TYPE_IP)) { >> struct rte_flow_item_ipv4 *spec, *mask; >> >> + /* IPv4. No need to match ether type. */ >> + if (next_eth_type_mask) { >> + *next_eth_type_mask = 0; >> + } >> + >> spec = xzalloc(sizeof *spec); >> mask = xzalloc(sizeof *mask); >> >> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns *patterns, >> if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) { >> struct rte_flow_item_ipv6 *spec, *mask; >> >> + /* IPv6. No need to match ether type. */ >> + if (next_eth_type_mask) { >> + *next_eth_type_mask = 0; >> + } >> + >> spec = xzalloc(sizeof *spec); >> mask = xzalloc(sizeof *mask); >> >> >> >> >> >> Or from dump-flows : >> >> >> >> flow-dump from pmd on cpu core: 23 >> >> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0, >> >> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark( >> >> 0/ >> >> 0 >> >> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_typ >> >> e( >> >> n >> >> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00: >> >> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0. >> >> 0 >> >> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/0 >> >> ), packets:42575487, bytes:2554529220, used:0.000s, >> >> offloaded:partial, dp:ovs, actions:dpdk1, >> >> dp-extra-info:miniflow_bits(4,2) >> >> >> >> >> >> with regards to " provide the testpmd-style arguments constructed >> >> by >> OVS." >> >> Can you confirm what you mean? >> > >> >We changed the way of logging in a following commit: >> >d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow >> >create/destroy.") >> > >> >It's on master and branch-2.14. >> > >> >> > >Tested both of these patches and neither fixed the issue. > >The problem is that parse_flow_match() is setting an ethernet pattern that the >i40e PMD doesn't support. >Working Flow: >rte flow eth pattern: >Spec = null >Mask = null >rte flow ipv4 pattern: >Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0 >Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0 > >Broken Flow: >flow eth pattern: >Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800 >Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte flow ipv4 >pattern: >Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0 >Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0 > >I will continue to try rework the patches to fix this. > >Thanks, >Emma > Hi Emma,
My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I have prepared another one according to your inputs (also not tested). Please give it a try, and if works, finalize it with comments etc. diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index de6101e4d..f4b04a880 100644 --- a/lib/netdev-offload-dpdk.c +++ b/lib/netdev-offload-dpdk.c @@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions) actions->cnt = 0; } +/* write a comment this is a workaround... */ +#define NULL_ETH_MASK_IF_ZEROS \ + if (eth_mask && is_all_zeros(ð_mask->src, sizeof eth_mask->src) && \ + is_all_zeros(ð_mask->dst, sizeof eth_mask->dst)) { \ + patterns->items[0].mask = NULL; \ + free(eth_mask); \ + } + static int parse_flow_match(struct flow_patterns *patterns, struct match *match) { + struct rte_flow_item_eth *eth_mask = NULL; uint8_t *next_proto_mask = NULL; struct flow *consumed_masks; uint8_t proto = 0; @@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns *patterns, if (match->wc.masks.dl_type || !eth_addr_is_zero(match->wc.masks.dl_src) || !eth_addr_is_zero(match->wc.masks.dl_dst)) { - struct rte_flow_item_eth *spec, *mask; + struct rte_flow_item_eth *spec; spec = xzalloc(sizeof *spec); - mask = xzalloc(sizeof *mask); + eth_mask = xzalloc(sizeof *eth_mask); memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst); memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src); spec->type = match->flow.dl_type; - memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst); - memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src); - mask->type = match->wc.masks.dl_type; + memcpy(ð_mask->dst, &match->wc.masks.dl_dst, sizeof eth_mask->dst); + memcpy(ð_mask->src, &match->wc.masks.dl_src, sizeof eth_mask->src); + eth_mask->type = match->wc.masks.dl_type; memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks->dl_dst); memset(&consumed_masks->dl_src, 0, sizeof consumed_masks->dl_src); consumed_masks->dl_type = 0; - add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask); + add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, eth_mask); } /* VLAN */ @@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns, if (match->flow.dl_type == htons(ETH_TYPE_IP)) { struct rte_flow_item_ipv4 *spec, *mask; + NULL_ETH_MASK_IF_ZEROS; + spec = xzalloc(sizeof *spec); mask = xzalloc(sizeof *mask); @@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns, if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) { struct rte_flow_item_ipv6 *spec, *mask; + NULL_ETH_MASK_IF_ZEROS; + spec = xzalloc(sizeof *spec); mask = xzalloc(sizeof *mask); Thanks, Eli >> >> Thanks, >> >> Emma >> >> >> >>> Thanks. >> >>> >> >>> Best regards, Ilya Maximets. >> >>> >> >>>> >> >>>> Thanks, >> >>>> Beilei >> >>>> >> >>>>> >> >>>>> Regards >> >>>>> Ian >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Eli >> >>>>>> >> >>>>>>>> >> >>>>>>>> Regards >> >>>>>>>> Ian >> >>>>>>>> >> >>>>>>>>>> Fixed by partial reversion of these changes. >> >>>>>>>>>> >> >>>>>>>>>> Signed-off-by: Emma Finn<emma.f...@intel.com> >> >>>> >> >> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev