On Fri, Mar 27, 2026 at 11:34:12AM +0000, Simon Horman wrote:
> On Wed, Mar 25, 2026 at 08:24:38PM -0700, Xiang Mei wrote:
> > br_mrp_start_test() and br_mrp_start_in_test() accept the user-supplied
> > interval value from netlink without validation. When interval is 0,
> > usecs_to_jiffies(0) yields 0, causing the delayed work
> > (br_mrp_test_work_expired / br_mrp_in_test_work_expired) to reschedule
> > itself with zero delay. This creates a tight loop on system_percpu_wq
> > that allocates and transmits MRP test frames at maximum rate, exhausting
> > all system memory and causing a kernel panic via OOM deadlock.
> 
> I would suspect the primary outcome of this problem is high CPU consumption
> rather than memory exhaustion. Is there a reason to expect that
> the transmitted frames can't be consumed as fast as they are created?
> 

Yes, you are right. In the default veth setup, the primary effect is
CPU exhaustion. However, if a qdisc with delay (e.g., netem) is
attached to the bridge port, skbs accumulate in the qdisc faster than
they are freed, leading to actual OOM. Both the qdisc attachment and
the MRP configuration are reachable from the same unprivileged context.
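
For reference, the rearm path that spins is roughly the following
(simplified sketch based on br_mrp.c; illustrative, not the verbatim
kernel code):

```c
/* Simplified sketch of br_mrp_test_work_expired(); field and function
 * names follow br_mrp.c, but details are elided. */
static void br_mrp_test_work_expired(struct work_struct *work)
{
	struct delayed_work *del_work = to_delayed_work(work);
	struct br_mrp *mrp = container_of(del_work, struct br_mrp,
					  test_work);

	/* ... allocate and transmit an MRP test frame ... */

	/* With test_interval == 0, usecs_to_jiffies(0) == 0, so the
	 * work is requeued with no delay and runs again immediately,
	 * forming the tight loop described above. */
	queue_delayed_work(system_wq, &mrp->test_work,
			   usecs_to_jiffies(mrp->test_interval));
}
```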

In our test, on a machine with 3.5 GB of RAM and interval=0, the OOM
leads to a kernel panic within about 10 seconds:

[   10.901868] Kernel panic - not syncing: System is deadlocked on memory
[   10.902139] CPU: 0 UID: 0 PID: 2 Comm: kthreadd Not tainted 7.0.0-rc4+ #6 PREEMPTLAZY
[   10.902525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
[   10.903045] Call Trace:
[   10.903163]  <TASK>
[   10.903262]  vpanic+0x694/0x780
[   10.903451]  ? __pfx_vpanic+0x10/0x10
[   10.903625]  ? __pfx__raw_spin_lock+0x10/0x10
[   10.903811]  panic+0xca/0xd0
[   10.903952]  ? __pfx_panic+0x10/0x10
[   10.904118]  ? panic_on_this_cpu+0x1a/0x40
[   10.904319]  out_of_memory+0x124e/0x1350
[   10.904497]  ? __pfx_out_of_memory+0x10/0x10
[   10.904694]  __alloc_pages_slowpath.constprop.0+0x2325/0x2dd0
[   10.904949]  ? __pfx___alloc_pages_slowpath.constprop.0+0x10/0x10
[   10.905214]  __alloc_frozen_pages_noprof+0x4f8/0x800
[   10.905445]  ? __pfx___alloc_frozen_pages_noprof+0x10/0x10
[   10.905686]  ? kasan_save_track+0x14/0x30

When interval=1, we could not crash the kernel even with netem reaching
sch->limit. You are right that it is a race between creating and
consuming frames, but interval=0 makes it far too easy for the user to
win that race.
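
For completeness, the policy change amounts to something like the
following in br_mrp_netlink.c (a sketch only; the other attribute
entries in the tables are elided):

```c
/* Sketch of the proposed change: reject interval == 0 at the netlink
 * attribute parsing layer. Only the interval entries are shown. */
static const struct nla_policy
br_mrp_start_test_policy[IFLA_BRIDGE_MRP_START_TEST_MAX + 1] = {
	[IFLA_BRIDGE_MRP_START_TEST_INTERVAL] = NLA_POLICY_MIN(NLA_U32, 1),
};

static const struct nla_policy
br_mrp_start_in_test_policy[IFLA_BRIDGE_MRP_START_IN_TEST_MAX + 1] = {
	[IFLA_BRIDGE_MRP_START_IN_TEST_INTERVAL] = NLA_POLICY_MIN(NLA_U32, 1),
};
```

With this, a zero interval is rejected during policy validation and
never reaches the workqueue scheduling code.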

> > 
> > The same zero-interval issue applies to br_mrp_start_in_test_parse()
> > for interconnect test frames.
> > 
> > Use NLA_POLICY_MIN(NLA_U32, 1) in the nla_policy tables for both
> > IFLA_BRIDGE_MRP_START_TEST_INTERVAL and
> > IFLA_BRIDGE_MRP_START_IN_TEST_INTERVAL, so zero is rejected at the
> > netlink attribute parsing layer before the value ever reaches the
> > workqueue scheduling code. This is consistent with how other bridge
> > subsystems (br_fdb, br_mst) enforce range constraints on netlink
> > attributes.
> > 
> > Fixes: 7ab1748e4ce6 ("bridge: mrp: Extend MRP netlink interface for configuring MRP interconnect")
> 
> I think you also want
> 
> Fixes: 20f6a05ef635 ("bridge: mrp: Rework the MRP netlink interface")
> 
> As highlighted by AI review.

Thanks for the reminder.

> 
> > Reported-by: Weiming Shi <[email protected]>
> > Signed-off-by: Xiang Mei <[email protected]>
> 
> ...
