On 16 Feb 2026, at 14:53, Eli Britstein wrote:
> On 16/02/2026 13:35, Eelco Chaudron wrote:
>> This patch introduces a new API to the offload provider framework that
>> allows hardware offload implementations to control UDP tunnel source port
>> selection during tunnel encapsulation.
>>
>> Background and Motivation
>> ==========================
>>
>> UDP-based tunnels (VXLAN, Geneve, etc.) use the UDP source port as an
>> entropy field to enable ECMP load balancing across multiple paths in the
>> network. The source port is typically calculated by hashing packet header
>> fields (5-tuple or inner packet headers) to distribute flows across
>> different paths.
>>
>> However, hardware offload implementations may require different approaches
>> to source port calculation:
>>
>> 1. Hardware NICs may use different hash functions or hash inputs than
>>    the software datapath, which can lead to inconsistent flow distribution
>>    when mixing hardware and software paths.
>>
>> 2. Some hardware may support enhanced entropy mechanisms (e.g., using
>>    additional packet fields or hardware-specific hash engines) that provide
>>    better load distribution than the default software implementation.
>>
>> Design
>> ======
>>
>> This patch adds a new optional callback to the dpif_offload_class:
>>
>>     bool (*netdev_udp_tnl_get_src_port)(const struct dpif_offload *,
>>                                         const struct netdev *egress_port,
>>                                         const struct netdev *tunnel_port,
>>                                         const struct dp_packet *packet,
>>                                         ovs_be16 *src_port);
>
> I have tried this patch. I configured this:
>
> br-phy0
> - p0 (physical-port, wire)
>
> br-int0
> - pf0vf0
> - vxlan0
>
> And an openflow to send pf0vf0->vxlan0.
>
> The flow that is generated is this:
>
> in_port(pf0vf0)
>
> actions:
> tnl_push(
>   tnl_port(vxlan_sys_4789),
>   header(
>     size=50,
>     type=4,
>     eth(dst=10:70:fd:65:8b:f8,src=10:70:fd:87:54:e0,dl_type=0x0800),
>     ipv4(src=111.168.1.1,dst=111.168.1.2,proto=17,tos=0,ttl=64,frag=0x4000),
>     udp(src=0,dst=4789,csum=0x0),
>     vxlan(flags=0x8000000,vni=0x64)),
>   out_port(br-phy0)),
> p0
>
> In that new function egress_port=br-phy0 and tunnel_port=vxlan_sys_4789.
>
> Neither of them is a physical port, so there is no HW to access through
> them. The only relevant HW ports here are pf0vf0 (origin) and p0 (fwd).
>
> In another scenario, for example in dp-hash, the actions might be
> tnl_push(),hash(),recirc(), so p0 will not appear in the flow at all,
> leaving us with pf0vf0 only.
>
> Still, the packet has an "orig_in_port" (corresponding to pf0vf0 in my
> example), so we can deduce the physical origin netdev.
>
> At least for mlx5 devices, this is the only device that matters here. Other
> ones will remain "unused". I think it will be the case for other vendors as
> well.
>
> One might be able to use or verify the tunnel_port type, or something of
> that kind, but "egress_port" is useless. I think it should be removed from
> the API.
>
> I suggest adding this orig-netdev deduction from the orig_in_port to the
> infrastructure, unless you think each dpif-offload-XX provider should
> implement it on its own?
>
> Regarding this, we already have this deduction before we call
> netdev_hw_post_process() from dpif_offload_netdev_hw_post_process(). Could
> we take advantage of it?

Hi Eli,

Thanks for your feedback.

The reason I was looking at the egress port (although the current
implementation does not provide the physical egress port) was to handle the
scenario where similar packets could ingress on different ports, some with and
some without HW offload forwarding. If we based this on the actual egress
port, the UDP source port would remain consistent.

However, after reviewing the code again based on your comments, I noticed that
even in the non-hardware-offload case, different ingress ports may yield
different source ports, since the algorithm can potentially be based on the
hardware’s RSS hash. With this in mind, it no longer makes sense to try to
base this on the egress port.

I will rework the patch to use the original ingress port and send out a v2 RFC
for discussion. The non-RFC version will follow once the dependent
“dpif-offload: Add PMD thread helpers and hardware offload simulation” series
is merged.

Cheers,

Eelco

[...]

_______________________________________________
dev mailing list
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
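
For illustration only, here is a minimal sketch of why an RSS-based source
port depends on the ingress port. It assumes the source port is obtained by
scaling a 32-bit per-packet hash into a fixed port range. The function name,
the macro names, and the range values are hypothetical and chosen for this
example; this is not the code from the patch.

```c
#include <stdint.h>

/* Assumed port range for tunnel source ports (illustrative values only). */
#define SKETCH_UDP_PORT_MIN 32768u
#define SKETCH_UDP_PORT_MAX 61000u

/* Scale a 32-bit flow hash into [SKETCH_UDP_PORT_MIN, SKETCH_UDP_PORT_MAX)
 * using a multiply-and-shift instead of a modulo.  The result is in host
 * byte order; a caller would convert it with htons() before writing it
 * into the UDP header. */
static inline uint16_t
sketch_src_port_from_hash(uint32_t hash)
{
    uint32_t span = SKETCH_UDP_PORT_MAX - SKETCH_UDP_PORT_MIN;
    uint32_t offset = (uint32_t) (((uint64_t) hash * span) >> 32);

    return (uint16_t) (SKETCH_UDP_PORT_MIN + offset);
}
```

If the hash fed into such a mapping is the RSS hash reported by the receiving
NIC, two NICs with different hash functions or keys can hand the same inner
flow different hash values, and therefore different source ports. That is the
case a callback keyed on the original ingress port would be able to address.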
