On 11/3/25 7:29 PM, Numan Siddique wrote:
> On Mon, Nov 3, 2025 at 5:55 AM Ilya Maximets wrote:
>>
>> On 11/1/25 7:23 AM, Numan Siddique wrote:
>>> Hello OVS folks,
>>>
>>> In our deployments we are seeing a lot of datapath flow offload issues
>>> with tc, resulting in packets getting handled in the host and packet
>>> drops.
>>>
>>> We recently observed such an issue and only a restart of ovs-vswitchd
>>> fixed it.
>>>
>>> I debugged a bit and found that all the datapath flows offloaded by
>>> ovs-vswitchd to tc fail if the recirculation id is greater than
>>> 268,435,455 (which is 0x0fffffff).
>>>
>>> We see the below error messages:
>>>
>>> --------------------------------------------------
>>> 2025-11-01T03:12:18.415Z|93221|netlink_socket(handler53)|DBG|nl_sock_recv__
>>> (Success): nl(len:692, type=2(error), flags=200[MATCH], seq=7af,
>>> pid=3613179965 error(-22(Invalid argument), in-reply-to(nl(len:624,
>>> type=44(family-defined), flags=409[REQUEST][ECHO][ATOMIC], seq=7af,
>>> pid=3613179965))
>>> 2025-11-01T03:12:18.415Z|93222|netlink_socket(handler53)|DBG|received
>>> NAK error=22 - Specified chain index exceeds upper limit
>>> 2025-11-01T03:12:18.415Z|93223|dpif_netlink(handler53)|ERR|failed to
>>> offload flow: Invalid argument: ovn-f3902a-0
>>> 2025-11-01T03:12:18.415Z|93224|dpif_netlink(handler53)|DBG|system@ovs-system:
>>> put[create] ufid:e287c507-e111-44be-90dd-469c242cb873
>>> recirc_id(0x2660dc6d),dp_hash(0/0),skb_priority(0/0),tunnel(tun_id=0x915,src=10.32.35.9,dst=10.32.5.25,ttl=59/0,tp_src=34744/0,tp_dst=6081/0,geneve({class=0x102/0,type=0x80/0,len=4/0,0x79a041a/0}),flags(-df+csum+key)),in_port(6),skb_mark(0/0),ct_state(0x21/0x3f),ct_zone(0x17f5/0),ct_mark(0/0x1),ct_label(0/0),ct_tuple4(src=172.27.61.139/0.0.0.0,dst=172.27.58.113/0.0.0.0,proto=6/0,tp_src=49588/0,tp_dst=4240/0),eth(src=be:28:87:5d:2e:28,dst=fe:6c:ee:aa:33:be),eth_type(0x0800),ipv4(src=172.27.61.139,dst=172.27.58.113,proto=6,tos=0/0,ttl=64/0,frag=no),tcp(src=49588/0x8000,dst=4240/0xf800),tcp_flags(0/0),
>>> actions:ct(commit,zone=6133,mark=0/0x1,nat(src)),20
>>> -------------------------------------------------------------------
>>>
>>> I was able to reproduce the issue locally with OVS main and Fedora
>>> kernel 6.16.10-200.fc42. I had to hack the code though.
>>>
>>> ----
>>> diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
>>> index f01468025..1d577d73b 100644
>>> --- a/ofproto/ofproto-dpif-rid.c
>>> +++ b/ofproto/ofproto-dpif-rid.c
>>> @@ -34,8 +34,7 @@ static struct ovs_list expiring OVS_GUARDED_BY(mutex)
>>>  static struct ovs_list expired OVS_GUARDED_BY(mutex)
>>>      = OVS_LIST_INITIALIZER(&expired);
>>>
>>> -static uint32_t next_id OVS_GUARDED_BY(mutex) = 1; /* Possible next free id. */
>>> -
>>> +static uint32_t next_id OVS_GUARDED_BY(mutex) = 0x0fffffff; /* Possible next free id. */
>>>  #define RECIRC_POOL_STATIC_IDS 1024
>>>
>>>  static void recirc_id_node_free(struct recirc_id_node *);
>>> -----
>>>
>>> Looks like the kernel expects the tc flower chain id to be encoded within
>>> the first 28 bits [1], whereas ovs-vswitchd is using the value of
>>> recirc_id as the chain id, and if the recirc_id overflows 28 bits, the
>>> issue is seen.
>>>
>>> Is my analysis correct? I'm not too familiar with the classifier and
>>> the offload code base. Hope the experts can take a look at it.
>>
>> Hi, Numan. Yes, your analysis seems correct.
>> The GOTO_CHAIN action is an "extended action", where the first 4 bits
>> are reserved for the action type and the rest are a value:
>>
>> https://elixir.bootlin.com/linux/v6.17.6/source/tools/include/uapi/linux/pkt_cls.h#L50-L64
>>
>> This means we can't offload recirculations to chains above 28 bits.
>>
>> There are two things here that need fixing:
>>
>> 1. OVS doesn't seem to check that the chain id fits into the action,
>>    blindly ORing it in. That should be fixed, so we are not trying to
>>    send such flows into the kernel in the first place.
>>
>> 2. Somehow limit the recirculation id space to 28 bits when the HW
>>    offload is enabled. I don't like this, as we'll be just adding yet
>>    another hack for HW offload to work, but I'm not sure what would be
>>    a different solution here. Note: id-pool would solve the problem
>>    by allocating densely packed IDs, but that may cause collisions as
>>    the whole process of retiring old IDs is a bit racy and we rely on
>>    time to guess when we can actually stop using them. Needs more
>>    investigation.
>>
>> Best regards, Ilya Maximets.
>
>
> Thanks for the reply, Ilya.
>
> In one of our deployments, which uses OVS 3.2.0, we see the below logs
> and packet drops to the VM,
>
> ---
>
> 2025-10-30T04:26:28.474Z|78613|tc(handler25)|WARN|Kernel flower
> acknowledgment does not match request! Set dpif_netlink to dbg to see
> which rule caused this error.
>
> 2025-10-30T04:26:29.113Z|78614|tc(handler25)|WARN|Kernel flower
> acknowledgment does not match request! Set dpif_netlink to dbg to see
> which rule caused this error.
> ------
>
> Any pointers on why we are seeing the above WARN message ? OVS 3.2.0
> is missing the below backport -
> https://github.com/openvswitch/ovs/commit/1857c569ee9a6432ac46d31a31f882402c215437
> Could it be because of this ?

It's likely. To confirm, you'll need to enable debug logs for the tc
module; you may also enable dbg for dpif_netlink, as the logs suggest.

> Do we need to move to 3.2.2 at least for successful offloads ?

If the issue above is indeed your issue, then the update will remove the
warning. However, this flow will not be offloaded, as it requires
modification of the tp_src of the outer tunnel header, which TC doesn't
support. Such flows should not be common though, so I'm not sure if you
actually need them offloaded. It depends on the setup. But also, you need
to confirm that it is your issue first.

FWIW, 3.2.0 is very old and is missing a lot of fixes, so I'd suggest
updating it anyway. Also, 3.2 is EoL, so going to at least 3.3 is
recommended.

> The kernel version is - 5.14.0-162.6.1.el9
>
>
> In the below datapath flow dump, we see that there is a flow for the
> first packet and the final action of this dp flow is -
> recirc(0x94691).
>
> ------------
> recirc_id(0),in_port(18),ct_state(-new-est-rpl-trk),ct_mark(0/0x2),eth(src=b0:cf:0e:b1:5f:ff,dst=5e:8e:4a:f0:44:25),eth_type(0x8100),vlan(vid=120,pcp=0),encap(eth_type(0x0800),ipv4(src=192.0.0.0/224.0.0.0,dst=160.211.64.157,proto=1,ttl=47,frag=no)),
> packets:2217, bytes:186228, used:0.330s,
> actions:pop_vlan,ct(zone=24,nat),recirc(0x94691)
> recirc_id(0),in_port(18),ct_state(-new-est-rpl-trk),ct_mark(0/0x2),eth(src=b0:cf:0e:b1:5f:ff,dst=5e:8e:4a:f0:44:25),eth_type(0x8100),vlan(vid=120,pcp=0),encap(eth_type(0x0800),ipv4(src=96.0.0.0/252.0.0.0,dst=160.211.64.157,proto=1,ttl=44,frag=no)),
> packets:1005, bytes:84420, used:0.850s,
> actions:pop_vlan,ct(zone=24,nat),recirc(0x94691)
> ----------
>
> But in the dp flows, we never found a flow with recirc_id(0x94691).
> After a few minutes, we took the dump of dp flows and we noticed that
> there was a flow matching recirc(0x94691), but it was totally
> unrelated to the packet in question.
>
>
> We also saw the below message in the ovs logs.
>
> -----------
> 2025-10-30T02:54:42.331Z|41074|ofproto_dpif_upcall(handler25)|INFO|received
> packet on unassociated datapath port 18 (no recirculation data for
> recirc_id 0x94691)
>
> 2025-10-30T03:15:42.380Z|43176|ofproto_dpif_upcall(handler25)|INFO|received
> packet on unassociated datapath port 18 (no recirculation data for
> recirc_id 0x94691)
> ----------
>
> IMO the packet drops were due to the missing dp flow for the recirc_id
> 0x94691.
>
> Do you have any pointers on what could be going wrong ?

Was OVS recently restarted when this was observed? That may explain
the missing records for the recirculation ID in userspace. But otherwise
it's hard to guess what could've gone wrong here. Since you also have the
recirc_id overflow issue, it might be possible that the ID got truncated
somewhere and hence it's incorrect.

>
> Thanks for your time
>
> Numan
>
>
>
>>
>>>
>>>
>>> [1] -
>>> https://github.com/torvalds/linux/blob/v6.18-rc3/net/sched/cls_api.c#L3137
>>>
>>> Thanks
>>> Numan
>>
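
For illustration only, the range check described in point 1 of the quoted
list could look roughly like the sketch below. This is not the actual OVS
code: the helper names chain_fits_goto_action() and encode_goto_chain() are
hypothetical, and the three macros are local copies of the extended-action
definitions from linux/pkt_cls.h so the snippet builds without kernel
headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the extended-action layout from linux/pkt_cls.h:
 * the top 4 bits carry the action opcode, the low 28 bits carry a value. */
#define TC_ACT_EXT_SHIFT    28
#define TC_ACT_EXT_VAL_MASK ((UINT32_C(1) << TC_ACT_EXT_SHIFT) - 1)
#define TC_ACT_GOTO_CHAIN   (UINT32_C(2) << TC_ACT_EXT_SHIFT)

/* Hypothetical helper: returns true if 'chain' fits into the 28-bit value
 * field of an extended action, i.e. it is at most 0x0fffffff. */
static bool
chain_fits_goto_action(uint32_t chain)
{
    return (chain & ~TC_ACT_EXT_VAL_MASK) == 0;
}

/* Hypothetical helper: stores the GOTO_CHAIN action word for 'chain' in
 * '*action' and returns true.  If 'chain' does not fit, returns false and
 * leaves '*action' untouched, so the caller can skip the offload instead
 * of sending a request the kernel will reject. */
static bool
encode_goto_chain(uint32_t chain, uint32_t *action)
{
    if (!chain_fits_goto_action(chain)) {
        return false;
    }
    *action = TC_ACT_GOTO_CHAIN | chain;
    /* The value field must round-trip unchanged once the check passed. */
    assert((*action & TC_ACT_EXT_VAL_MASK) == chain);
    return true;
}
```

With these definitions, recirc_id 0x94691 from the flow dump encodes fine,
while 0x2660dc6d from the failing flow is rejected because it is larger
than 0x0fffffff.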
