On 8/19/25 11:01 AM, Reuven Plevinsky wrote: > On 06/08/2025 14:26, Ilya Maximets wrote: >> External email: Use caution opening links or attachments >> >> >> On 8/4/25 4:29 PM, Reuven Plevinsky via dev wrote: >>> For example, consider the following OpenFlow rule: >>> - IPv6 -> IPv4 tunnel translation: >>> add-flow br-int 'in_port=vxlan_br-int_0, \ >>> actions=set_field:8.8.8.7->tun_src, \ >>> set_field:8.8.8.8->tun_dst,set_field:43->tun_id,geneve_br-int_1' >>> >>> As any set_field action, flow wildcards are set to the >>> tun_src/dst. The offload layer fails if not all matches are >>> handled ("consumed"). Clear the irrelevant wildcards. >>> >>> Signed-off-by: Reuven Plevinsky <rplevin...@nvidia.com> >>> --- >>> ofproto/ofproto-dpif-xlate.c | 17 +++++++++++++ >>> tests/tunnel-push-pop.at | 48 ++++++++++++++++++++++++++++++++++++ >>> 2 files changed, 65 insertions(+) >>> >>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c >>> index 2c8197fb7307..e2ba7e6b0250 100644 >>> --- a/ofproto/ofproto-dpif-xlate.c >>> +++ b/ofproto/ofproto-dpif-xlate.c >>> @@ -8078,6 +8078,20 @@ xlate_wc_init(struct xlate_ctx *ctx) >>> tnl_wc_init(&ctx->xin->flow, ctx->wc); >>> } >>> >>> +static void >>> +clear_tunnel_wc(const struct flow_tnl *spec, struct flow_tnl *mask) >>> +{ >>> + if (!spec->ip_src && !spec->ip_dst) { >>> + mask->ip_src = 0; >>> + mask->ip_dst = 0; >>> + } >>> + if (!ipv6_addr_is_set(&spec->ipv6_src) && >>> + !ipv6_addr_is_set(&spec->ipv6_dst)) { >>> + mask->ipv6_src = in6addr_any; >>> + mask->ipv6_dst = in6addr_any; >>> + } >>> +} >>> + >>> static void >>> xlate_wc_finish(struct xlate_ctx *ctx) >>> { >>> @@ -8152,6 +8166,9 @@ xlate_wc_finish(struct xlate_ctx *ctx) >>> if (!flow_tnl_dst_is_set(&ctx->xin->upcall_flow->tunnel)) { >>> memset(&ctx->wc->masks.tunnel, 0, sizeof ctx->wc->masks.tunnel); >>> } >>> + /* Both IPv4/IPv6 tunnel wc may be set because of set_fields action. >>> + * Clear the irrelevant one. */ >>> + clear_tunnel_wc(&ctx->xin->upcall_flow->tunnel, >>> &ctx->wc->masks.tunnel); >>> } >>> >>> /* This will tweak the odp actions generated. For now, it will: >>> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at >>> index 64c41460bbb2..9b46d249fe9f 100644 >>> --- a/tests/tunnel-push-pop.at >>> +++ b/tests/tunnel-push-pop.at >>> @@ -1440,3 +1440,51 @@ AT_CHECK([tail -1 stdout], [0], >>> >>> OVS_VSWITCHD_STOP >>> AT_CLEANUP >>> + >>> +AT_SETUP([tunnel - decap and re-encap with outer IP version conversion]) >>> +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy]) >>> +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br >>> datapath_type=dummy], [0]) >>> +AT_CHECK([ovs-vsctl add-port int-br t1 -- set Interface t1 type=vxlan \ >>> + options:key=flow options:remote_ip=flow \ >>> + -- add-port int-br t2 -- set Interface t2 type=geneve \ >>> + options:key=flow options:remote_ip=flow \ >>> + ], [0]) >>> + >>> +AT_CHECK([ovs-ofctl add-flow br0 action=normal]) >>> +dnl Setup dummy interface IP addresses. >>> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.1.1/24], [0], [OK >>> +]) >>> +AT_CHECK([ovs-appctl netdev-dummy/ip6addr br0 2001:cafe::1/24], [0], [OK >>> +]) >>> +dnl Checking that a local routes for added IPs were successfully installed. >>> +AT_CHECK([ovs-appctl ovs/route/show | grep Cached | sort], [0], [dnl >>> +Cached: 1.1.1.0/24 dev br0 SRC 1.1.1.1 local >>> +Cached: 2001:ca00::/24 dev br0 SRC 2001:cafe::1 local >>> +]) >>> +dnl Add entries >>> +AT_CHECK([ovs-appctl tnl/neigh/set br0 1.1.1.2 aa:bb:cc:00:00:01], [0], [OK >>> +]) >>> +AT_CHECK([ovs-appctl tnl/neigh/set br0 2001:cafe::2 aa:bb:cc:00:00:01], >>> [0], [OK >>> +]) >>> +AT_CHECK([ovs-appctl tnl/neigh/show | grep br0 | sort], [0], [dnl >>> +1.1.1.2 aa:bb:cc:00:00:01 br0 >>> +2001:cafe::2 aa:bb:cc:00:00:01 br0 >>> +]) >>> + >>> +AT_CHECK([ovs-ofctl add-flow int-br >>> 'in_port=t1,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_id=0x7a,actions=set_field:1.1.1.1->tun_src,set_field:1.1.1.2->tun_dst,set_field:0x7b->tun_id,t2']) >>> +AT_CHECK([ovs-ofctl add-flow int-br >>> 'in_port=t2,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_id=0x7b,actions=set_field:2001:cafe::1->tun_ipv6_src,set_field:2001:cafe::2->tun_ipv6_dst,set_field:0x7a->tun_id,set_field:0.0.0.0->tun_src,set_field:0.0.0.0->tun_dst,t1']) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy >>> 'tunnel(tun_id=0x7a,ipv6_src=2001:cafe::1,ipv6_dst=2001:cafe::2,ttl=64),in_port(4789)'], >>> [0], [stdout]) >>> +AT_CHECK([tail -2 stdout], [0], >>> + [Megaflow: >>> recirc_id=0,eth,tun_id=0x7a,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_tos=0,tun_flags=-df-csum+key,in_port=t1,dl_type=0x05ff >>> +Datapath actions: >>> tnl_push(tnl_port(genev_sys_6081),header(size=50,type=5,eth(dst=aa:bb:cc:00:00:01,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.1.1,dst=1.1.1.2,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=6081,csum=0x0),geneve(vni=0x7b)),out_port(br0)),p0 >>> +]) >>> + >>> +AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy >>> 'tunnel(tun_id=0x7b,src=1.1.1.1,dst=1.1.1.2,ttl=64),in_port(6081)'], [0], >>> [stdout]) >>> +AT_CHECK([tail -2 stdout], [0], >>> + [Megaflow: >>> recirc_id=0,eth,tun_id=0x7b,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_flags=-df-csum+key,in_port=t2,dl_type=0x05ff >>> +Datapath actions: >>> tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,eth(dst=aa:bb:cc:00:00:01,src=aa:55:aa:55:00:00,dl_type=0x86dd),ipv6(src=2001:cafe::1,dst=2001:cafe::2,label=0,proto=17,tclass=0x0,hlimit=64),udp(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x7a)),out_port(br0)),p0 >>> +]) >>> + >>> +OVS_VSWITCHD_STOP >>> +AT_CLEANUP >> Hi, Reuven. Thanks for the update and the test. I see what you're >> trying to achive now. >> >> However, I don't think this is a right approach as we can't actually >> clear tunnel metadata at xlate_wc_finish() stage. This is because >> we must have prerequisites unwildcarded in order to clear the masks >> that we think are unnecessary. And tunnel metadata doesn't have >> any prerequisites, so there is no way to tell if it is necessary to >> match on or not. The main problem is a match on empty metadata that >> the user can add into OpenFlow rules. We must never clear those as >> it may be necessary to distinguish tunneled and non-tunneled packets. >> E.g. match on all-zero tun_dst signifies that the packet is not coming >> from an ipv4 tunnel. Note that input port match is not enough as a >> prerequisite as controller can inject packets from any port. >> >> For example, if you add the following change to your test: >> >> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at >> index 9b46d249f..1d19e7e5b 100644 >> --- a/tests/tunnel-push-pop.at >> +++ b/tests/tunnel-push-pop.at >> @@ -1473,6 +1473,13 @@ AT_CHECK([ovs-appctl tnl/neigh/show | grep br0 | >> sort], [0], [dnl >> >> AT_CHECK([ovs-ofctl add-flow int-br >> 'in_port=t1,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_id=0x7a,actions=set_field:1.1.1.1->tun_src,set_field:1.1.1.2->tun_dst,set_field:0x7b->tun_id,t2']) >> AT_CHECK([ovs-ofctl add-flow int-br >> 'in_port=t2,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_id=0x7b,actions=set_field:2001:cafe::1->tun_ipv6_src,set_field:2001:cafe::2->tun_ipv6_dst,set_field:0x7a->tun_id,set_field:0.0.0.0->tun_src,set_field:0.0.0.0->tun_dst,t1']) >> +AT_CHECK([ovs-ofctl add-flow int-br >> 'tun_dst=0.0.0.0,tun_ipv6_src=::,actions=drop']) >> + >> +AT_CHECK([ovs-appctl ofproto/trace --names int-br in_port=t1], [0], >> [stdout]) >> +AT_CHECK([tail -2 stdout], [0], >> + [Megaflow: >> recirc_id=0,eth,tun_dst=0.0.0.0,tun_ipv6_src=::,in_port=t1,dl_type=0x0000 >> +Datapath actions: drop >> +]) >> >> AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy >> 'tunnel(tun_id=0x7a,ipv6_src=2001:cafe::1,ipv6_dst=2001:cafe::2,ttl=64),in_port(4789)'], >> [0], [stdout]) >> AT_CHECK([tail -2 stdout], [0], >> --- >> >> The test will fail and the actual datapath flow will look like this: >> >> Megaflow: recirc_id=0,eth,in_port=t1,dl_type=0x0000 >> >> Such a flow will drop all the traffic and not just non-tunneled one >> as it was intended by the OpenFlow rules. >> >> This issue actually already exist for a while and need to also revert >> the previous commit: >> d046453b56a9 ("ofproto-dpif-xlate: Clear tunnel wc bits if original >> packet is non-tunnel.") >> Though, I think, we need a better solution for this before reverting >> as conditions where it would be a problem are kind of unlikely. >> >> A proper solution would be to avoid unwildcarding these fields while >> setting the metadata through varois OpenFlow actions (set_field, load, >> stack pop, etc.) but preserve the masks coming from the classifier. >> The reason why we can actually avoid unwildcarding while setting the >> fileds is that unlike other fileds, tunnel metadata is not a subject >> for the optimization where we skip datapath actions if the value is >> already the same. So, there is no real reason to unwildcard them. >> >> Do you want to try and work on this kind of change? >> >> BTW, I didn't test, but it seems like IPV4/IPv6 mix is a problem not >> only for the hardware offloading code, but also for the kernel datapath >> in general as it will reject to install the flow with mixed metadata >> types. Userspace works fine in that case though. So, technicaly, >> we also need to probe the datapath implementation for support and >> slowpath such traffic, if it doesn't, i.e. set the SLOW_MATCH flag. >> >> Best regards, Ilya Maximets. > > Hi Ilya, > > Thanks a lot for your review and comments. Since this issue is specific > to offload, and is not supported by the current upstream DPDK > implementation (while it is supported by our DOCA implementation),
It should be supported by TC offload implementation, AFAIU. It also may be a problem for the kernel datapath without offload as I mentioned above. > we’ve > decided to take a different approach and handle it within the DOCA > offload provider instead. For now, we will abandon this change. The same as we can't simply clear these bits at the wc_finish stage, the datapath or the offload implementaion can't make this decision either without breaking user-specified OpenFlow logic. Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev