>-----Original Message-----
>From: Ilya Maximets <[email protected]>
>Sent: Thursday, 21 August 2025 13:32
>To: Reuven Plevinsky <[email protected]>; [email protected]
>Cc: [email protected]; Eli Britstein <[email protected]>; Maor Dickman
><[email protected]>
>Subject: Re: [ovs-dev] [PATCH v2 1/1] ofproto-dpif-xlate: Clear tunnel wildcard
>bits properly.
>
>External email: Use caution opening links or attachments
>
>
>On 8/19/25 11:01 AM, Reuven Plevinsky wrote:
>> On 06/08/2025 14:26, Ilya Maximets wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 8/4/25 4:29 PM, Reuven Plevinsky via dev wrote:
>>>> For example, consider the following OpenFlow rule:
>>>> - IPv6 -> IPv4 tunnel translation:
>>>> add-flow br-int 'in_port=vxlan_br-int_0, \
>>>> actions=set_field:8.8.8.7->tun_src, \
>>>> set_field:8.8.8.8->tun_dst,set_field:43->tun_id,geneve_br-int_1'
>>>>
>>>> As any set_field action, flow wildcards are set to the tun_src/dst.
>>>> The offload layer fails if not all matches are handled ("consumed").
>>>> Clear the irrelevant wildcards.
>>>>
>>>> Signed-off-by: Reuven Plevinsky <[email protected]>
>>>> ---
>>>> ofproto/ofproto-dpif-xlate.c | 17 +++++++++++++
>>>> tests/tunnel-push-pop.at | 48
>++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 65 insertions(+)
>>>>
>>>> diff --git a/ofproto/ofproto-dpif-xlate.c
>>>> b/ofproto/ofproto-dpif-xlate.c index 2c8197fb7307..e2ba7e6b0250
>>>> 100644
>>>> --- a/ofproto/ofproto-dpif-xlate.c
>>>> +++ b/ofproto/ofproto-dpif-xlate.c
>>>> @@ -8078,6 +8078,20 @@ xlate_wc_init(struct xlate_ctx *ctx)
>>>> tnl_wc_init(&ctx->xin->flow, ctx->wc);
>>>> }
>>>>
>>>> +static void
>>>> +clear_tunnel_wc(const struct flow_tnl *spec, struct flow_tnl *mask)
>>>> +{
>>>> + if (!spec->ip_src && !spec->ip_dst) {
>>>> + mask->ip_src = 0;
>>>> + mask->ip_dst = 0;
>>>> + }
>>>> + if (!ipv6_addr_is_set(&spec->ipv6_src) &&
>>>> + !ipv6_addr_is_set(&spec->ipv6_dst)) {
>>>> + mask->ipv6_src = in6addr_any;
>>>> + mask->ipv6_dst = in6addr_any;
>>>> + }
>>>> +}
>>>> +
>>>> static void
>>>> xlate_wc_finish(struct xlate_ctx *ctx)
>>>> {
>>>> @@ -8152,6 +8166,9 @@ xlate_wc_finish(struct xlate_ctx *ctx)
>>>> if (!flow_tnl_dst_is_set(&ctx->xin->upcall_flow->tunnel)) {
>>>> memset(&ctx->wc->masks.tunnel, 0, sizeof ctx->wc->masks.tunnel);
>>>> }
>>>> + /* Both IPv4/IPv6 tunnel wc may be set because of set_fields action.
>>>> + * Clear the irrelevant one. */
>>>> + clear_tunnel_wc(&ctx->xin->upcall_flow->tunnel,
>>>> + &ctx->wc->masks.tunnel);
>>>> }
>>>>
>>>> /* This will tweak the odp actions generated. For now, it will:
>>>> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
>>>> index 64c41460bbb2..9b46d249fe9f 100644
>>>> --- a/tests/tunnel-push-pop.at
>>>> +++ b/tests/tunnel-push-pop.at
>>>> @@ -1440,3 +1440,51 @@ AT_CHECK([tail -1 stdout], [0],
>>>>
>>>> OVS_VSWITCHD_STOP
>>>> AT_CLEANUP
>>>> +
>>>> +AT_SETUP([tunnel - decap and re-encap with outer IP version
>>>> +conversion]) OVS_VSWITCHD_START([add-port br0 p0 -- set Interface
>>>> +p0 type=dummy]) AT_CHECK([ovs-vsctl add-br int-br -- set bridge
>>>> +int-br datapath_type=dummy], [0]) AT_CHECK([ovs-vsctl add-port int-br
>t1 -- set Interface t1 type=vxlan \
>>>> + options:key=flow options:remote_ip=flow \
>>>> + -- add-port int-br t2 -- set Interface t2 type=geneve
>>>> \
>>>> + options:key=flow options:remote_ip=flow \
>>>> + ], [0])
>>>> +
>>>> +AT_CHECK([ovs-ofctl add-flow br0 action=normal]) dnl Setup dummy
>>>> +interface IP addresses.
>>>> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 1.1.1.1/24], [0], [OK
>>>> +])
>>>> +AT_CHECK([ovs-appctl netdev-dummy/ip6addr br0 2001:cafe::1/24],
>>>> +[0], [OK
>>>> +])
>>>> +dnl Checking that a local routes for added IPs were successfully
>>>> installed.
>>>> +AT_CHECK([ovs-appctl ovs/route/show | grep Cached | sort], [0],
>>>> +[dnl
>>>> +Cached: 1.1.1.0/24 dev br0 SRC 1.1.1.1 local
>>>> +Cached: 2001:ca00::/24 dev br0 SRC 2001:cafe::1 local
>>>> +])
>>>> +dnl Add entries
>>>> +AT_CHECK([ovs-appctl tnl/neigh/set br0 1.1.1.2 aa:bb:cc:00:00:01],
>>>> +[0], [OK
>>>> +])
>>>> +AT_CHECK([ovs-appctl tnl/neigh/set br0 2001:cafe::2
>>>> +aa:bb:cc:00:00:01], [0], [OK
>>>> +])
>>>> +AT_CHECK([ovs-appctl tnl/neigh/show | grep br0 | sort], [0], [dnl
>>>> +1.1.1.2 aa:bb:cc:00:00:01 br0
>>>> +2001:cafe::2 aa:bb:cc:00:00:01 br0
>>>> +])
>>>> +
>>>> +AT_CHECK([ovs-ofctl add-flow int-br
>>>> +'in_port=t1,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun
>>>> +_id=0x7a,actions=set_field:1.1.1.1->tun_src,set_field:1.1.1.2->tun_
>>>> +dst,set_field:0x7b->tun_id,t2']) AT_CHECK([ovs-ofctl add-flow
>>>> +int-br
>>>> +'in_port=t2,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_id=0x7b,actions=set
>>>> +_field:2001:cafe::1->tun_ipv6_src,set_field:2001:cafe::2->tun_ipv6_
>>>> +dst,set_field:0x7a->tun_id,set_field:0.0.0.0->tun_src,set_field:0.0
>>>> +.0.0->tun_dst,t1'])
>>>> +
>>>> +AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy
>>>> +'tunnel(tun_id=0x7a,ipv6_src=2001:cafe::1,ipv6_dst=2001:cafe::2,ttl
>>>> +=64),in_port(4789)'], [0], [stdout]) AT_CHECK([tail -2 stdout],
>>>> +[0],
>>>> + [Megaflow:
>>>> +recirc_id=0,eth,tun_id=0x7a,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=
>>>> +2001:cafe::2,tun_tos=0,tun_flags=-df-csum+key,in_port=t1,dl_type=0x
>>>> +05ff Datapath actions:
>>>> +tnl_push(tnl_port(genev_sys_6081),header(size=50,type=5,eth(dst=aa:
>>>> +bb:cc:00:00:01,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.1
>>>> +.1,dst=1.1.1.2,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=608
>>>> +1,csum=0x0),geneve(vni=0x7b)),out_port(br0)),p0
>>>> +])
>>>> +
>>>> +AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy
>>>> +'tunnel(tun_id=0x7b,src=1.1.1.1,dst=1.1.1.2,ttl=64),in_port(6081)']
>>>> +, [0], [stdout]) AT_CHECK([tail -2 stdout], [0],
>>>> + [Megaflow:
>>>> +recirc_id=0,eth,tun_id=0x7b,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos
>>>> +=0,tun_flags=-df-csum+key,in_port=t2,dl_type=0x05ff
>>>> +Datapath actions:
>>>> +tnl_push(tnl_port(vxlan_sys_4789),header(size=70,type=4,eth(dst=aa:
>>>> +bb:cc:00:00:01,src=aa:55:aa:55:00:00,dl_type=0x86dd),ipv6(src=2001:
>>>> +cafe::1,dst=2001:cafe::2,label=0,proto=17,tclass=0x0,hlimit=64),udp
>>>> +(src=0,dst=4789,csum=0xffff),vxlan(flags=0x8000000,vni=0x7a)),out_p
>>>> +ort(br0)),p0
>>>> +])
>>>> +
>>>> +OVS_VSWITCHD_STOP
>>>> +AT_CLEANUP
>>> Hi, Reuven. Thanks for the update and the test. I see what you're
>>> trying to achive now.
>>>
>>> However, I don't think this is a right approach as we can't actually
>>> clear tunnel metadata at xlate_wc_finish() stage. This is because we
>>> must have prerequisites unwildcarded in order to clear the masks that
>>> we think are unnecessary. And tunnel metadata doesn't have any
>>> prerequisites, so there is no way to tell if it is necessary to match
>>> on or not. The main problem is a match on empty metadata that the
>>> user can add into OpenFlow rules. We must never clear those as it
>>> may be necessary to distinguish tunneled and non-tunneled packets.
>>> E.g. match on all-zero tun_dst signifies that the packet is not
>>> coming from an ipv4 tunnel. Note that input port match is not enough
>>> as a prerequisite as controller can inject packets from any port.
>>>
>>> For example, if you add the following change to your test:
>>>
>>> diff --git a/tests/tunnel-push-pop.at b/tests/tunnel-push-pop.at
>>> index 9b46d249f..1d19e7e5b 100644
>>> --- a/tests/tunnel-push-pop.at
>>> +++ b/tests/tunnel-push-pop.at
>>> @@ -1473,6 +1473,13 @@ AT_CHECK([ovs-appctl tnl/neigh/show | grep
>br0
>>> | sort], [0], [dnl
>>>
>>> AT_CHECK([ovs-ofctl add-flow int-br
>'in_port=t1,tun_ipv6_src=2001:cafe::1,tun_ipv6_dst=2001:cafe::2,tun_id=0x7a
>,actions=set_field:1.1.1.1->tun_src,set_field:1.1.1.2->tun_dst,set_field:0x7b-
>>tun_id,t2'])
>>> AT_CHECK([ovs-ofctl add-flow int-br
>>> 'in_port=t2,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_id=0x7b,actions=set_f
>>> ield:2001:cafe::1->tun_ipv6_src,set_field:2001:cafe::2->tun_ipv6_dst,
>>> set_field:0x7a->tun_id,set_field:0.0.0.0->tun_src,set_field:0.0.0.0->
>>> tun_dst,t1'])
>>> +AT_CHECK([ovs-ofctl add-flow int-br
>>> +'tun_dst=0.0.0.0,tun_ipv6_src=::,actions=drop'])
>>> +
>>> +AT_CHECK([ovs-appctl ofproto/trace --names int-br in_port=t1], [0],
>>> +[stdout]) AT_CHECK([tail -2 stdout], [0],
>>> + [Megaflow:
>>> +recirc_id=0,eth,tun_dst=0.0.0.0,tun_ipv6_src=::,in_port=t1,dl_type=0
>>> +x0000
>>> +Datapath actions: drop
>>> +])
>>>
>>> AT_CHECK([ovs-appctl ofproto/trace --names ovs-dummy
>'tunnel(tun_id=0x7a,ipv6_src=2001:cafe::1,ipv6_dst=2001:cafe::2,ttl=64),in_p
>ort(4789)'], [0], [stdout])
>>> AT_CHECK([tail -2 stdout], [0],
>>> ---
>>>
>>> The test will fail and the actual datapath flow will look like this:
>>>
>>> Megaflow: recirc_id=0,eth,in_port=t1,dl_type=0x0000
>>>
>>> Such a flow will drop all the traffic and not just non-tunneled one
>>> as it was intended by the OpenFlow rules.
>>>
>>> This issue actually already exist for a while and need to also revert
>>> the previous commit:
>>> d046453b56a9 ("ofproto-dpif-xlate: Clear tunnel wc bits if
>>> original packet is non-tunnel.") Though, I think, we need a better
>>> solution for this before reverting as conditions where it would be a problem
>are kind of unlikely.
>>>
>>> A proper solution would be to avoid unwildcarding these fields while
>>> setting the metadata through varois OpenFlow actions (set_field,
>>> load, stack pop, etc.) but preserve the masks coming from the classifier.
>>> The reason why we can actually avoid unwildcarding while setting the
>>> fileds is that unlike other fileds, tunnel metadata is not a subject
>>> for the optimization where we skip datapath actions if the value is
>>> already the same. So, there is no real reason to unwildcard them.
>>>
>>> Do you want to try and work on this kind of change?
>>>
>>> BTW, I didn't test, but it seems like IPV4/IPv6 mix is a problem not
>>> only for the hardware offloading code, but also for the kernel
>>> datapath in general as it will reject to install the flow with mixed
>>> metadata types. Userspace works fine in that case though. So,
>>> technicaly, we also need to probe the datapath implementation for
>>> support and slowpath such traffic, if it doesn't, i.e. set the SLOW_MATCH
>flag.
>>>
>>> Best regards, Ilya Maximets.
>>
>> Hi Ilya,
>>
>> Thanks a lot for your review and comments. Since this issue is
>> specific to offload, and is not supported by the current upstream DPDK
>> implementation (while it is supported by our DOCA implementation),
>
>It should be supported by TC offload implementation, AFAIU. It also may be a
>problem for the kernel datapath without offload as I mentioned above.
>
>> we’ve
>> decided to take a different approach and handle it within the DOCA
>> offload provider instead. For now, we will abandon this change.
>
>The same as we can't simply clear these bits at the wc_finish stage, the
>datapath or the offload implementaion can't make this decision either without
>breaking user-specified OpenFlow logic.
Hi Ilya,
Could you please elaborate on your concern about a potential breakage? Could
you please provide an example?
I'd like to refer you to 2 other commits that are doing something like what we
want to do:
262eded5fbd7 ("netdev-offload-tc: Fix ignoring unknown tunnel keys.")
0ef70536eb41 ("netdev-offload-dpdk: Support vxlan encap offload with load
actions.")
Thanks, Eli
>
>Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev