Hi Aaron,

I tested two stages of this work on a single-node x86_64 setup:
1. Early RFC (this patch, June 2025): basic socket action plumbing,
   ODP parsing, and socket_lookup configuration.
2. Current development branch (sockmap_2026_feb, built as a scratch
   kernel based on RHEL 9.8 / 5.14.0-611.el9_7), which extends this
   RFC with: rcuify list, sockmap get/del commands, action list rework,
   packet cmd fix, sockmap fixes, tuple detail improvements, and
   flow-socket association. It was paired with the OVS userspace
   sockmap branch (sockmap_cmds).

Tests on the current development branch:
- ODP action round-trip parsing (valid and invalid): pass
- Socket action generation via ofproto/trace for TCP (IPv4/IPv6): pass
- Non-TCP exclusion (ICMP, UDP): pass
- socket_lookup enable/disable per port: pass
- socket_lookup with group recirculation: pass
- OpenFlow regression with socket_lookup: pass
- Conntrack regression with socket_lookup: pass
- 2-namespace TCP performance: pass

During 1000-namespace scale testing (2000 veth pairs, socket_lookup
enabled on all ports), the following WARNING burst was observed in the
kernel console log:

  [20652.730148] WARNING: CPU: 118 PID: 304284 at net/core/skbuff.c:1000 skb_release_head_state+0x95/0xa0
(185 occurrences within 79 ms, followed by "BUG: scheduling while
atomic" in OVS upcall handler thread handler2052)

The trigger is ARP table overflow with 2000 veth pairs, which floods
the OVS netlink upcall path. The underlying issue is that skbs whose
destructor is netlink_skb_destructor are passed to consume_skb()
without first calling skb_orphan(). The WARNING is reproducible on a
stock RHEL 9.8 kernel with stock OVS 3.7 and none of these patches
applied, which confirms it is a pre-existing kernel issue unrelated to
this work.
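
For anyone re-checking the burst figures on their own setup, here is a
minimal Python sketch of how matching WARNING lines could be counted in
a saved console log. It assumes each WARNING sits on a single line with
a dmesg-style [seconds.microseconds] prefix. The function name
warning_burst and the regular expression are illustrative only and are
not part of any OVS or kernel tooling.

```python
import re

# Assumption: one WARNING per line, prefixed with "[seconds.fraction]",
# and mentioning skb_release_head_state on the same line.
WARN_RE = re.compile(
    r"^\[\s*(\d+)\.(\d+)\]\s+WARNING: .*skb_release_head_state"
)


def warning_burst(lines):
    """Return (count, span_us) for the matching WARNING lines.

    span_us is the time between the first and last match, in
    microseconds, computed with integers to avoid float rounding.
    """
    stamps = []
    for line in lines:
        m = WARN_RE.match(line)
        if m:
            secs, frac = m.groups()
            # Normalize the fractional part to exactly six digits.
            usec = int(frac.ljust(6, "0")[:6])
            stamps.append(int(secs) * 1_000_000 + usec)
    if not stamps:
        return 0, 0
    return len(stamps), max(stamps) - min(stamps)
```

Passing an open log file, for example warning_burst(open("console.log"))
with a hypothetical file name, returns the number of matching lines and
the span between the first and last of them in microseconds.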
Tested-by: Minxi Hou <[email protected]>