On Wed, Mar 11, 2026 at 03:18:09AM +0900, Hyunwoo Kim wrote:
> When a peer MEP is being deleted, cancel_delayed_work_sync() is called
> on ccm_rx_dwork before freeing. However, br_cfm_frame_rx() runs in
> softirq context under rcu_read_lock (without RTNL) and can re-schedule
> ccm_rx_dwork via ccm_rx_timer_start() between cancel_delayed_work_sync()
> returning and kfree_rcu() being called.
>
> The following is a simple race scenario:
>
>         cpu0                                cpu1
>
> mep_delete_implementation()
>   cancel_delayed_work_sync(ccm_rx_dwork);
>                                     br_cfm_frame_rx()
>                                       // peer_mep still in hlist
>                                       if (peer_mep->ccm_defect)
>                                         ccm_rx_timer_start()
>                                           queue_delayed_work(ccm_rx_dwork)
>   hlist_del_rcu(&peer_mep->head);
>   kfree_rcu(peer_mep, rcu);
>                                     ccm_rx_work_expired()
>                                       // on freed peer_mep
>
> To prevent this, cancel_delayed_work_sync() is replaced with
> disable_delayed_work_sync() in both peer MEP deletion paths, so
> that subsequent queue_delayed_work() calls from br_cfm_frame_rx()
> are silently rejected.
>
> The cc_peer_disable() helper retains cancel_delayed_work_sync()
> because it is also used for the CC enable/disable toggle path where
> the work must remain re-schedulable.
>
> Fixes: dc32cbb3dbd7 ("bridge: cfm: Kernel space implementation of CFM. CCM frame RX added.")
> Signed-off-by: Hyunwoo Kim <[email protected]>
I'm not familiar with CFM, but your explanation makes sense.
AFAICT the same change is not needed for ccm_tx_dwork, since that delayed
work only re-queues itself.
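
For anyone following along, the cancel vs. disable distinction the commit
message relies on can be modelled with a small userspace sketch. This is a
toy under stated assumptions, not the kernel implementation: every toy_*
name is made up for illustration, and the "disable depth" counter only
approximates how disable_delayed_work_sync() makes later
queue_delayed_work() calls fail.

```c
/* Userspace toy model, NOT kernel code: all toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_dwork {
	bool pending;      /* work is queued and will run later */
	int disable_depth; /* > 0 means queueing is rejected */
};

/* Models queue_delayed_work(): returns false if nothing was queued. */
static bool toy_queue(struct toy_dwork *w)
{
	if (w->disable_depth > 0 || w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models cancel_delayed_work_sync(): clears pending work, allows re-queue. */
static bool toy_cancel_sync(struct toy_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models disable_delayed_work_sync(): cancel plus reject later queueing. */
static bool toy_disable_sync(struct toy_dwork *w)
{
	w->disable_depth++;
	return toy_cancel_sync(w);
}

/* True if a frame arriving after teardown can still arm the work. */
static bool toy_late_rx_rearms(bool use_disable)
{
	struct toy_dwork w = { .pending = false, .disable_depth = 0 };

	assert(toy_queue(&w));        /* RX path armed the timer earlier */
	if (use_disable)
		toy_disable_sync(&w);
	else
		toy_cancel_sync(&w);
	return toy_queue(&w);         /* late br_cfm_frame_rx() analogue */
}
```

In this model, toy_late_rx_rearms(false) shows the window from the race
scenario (the work can be armed again after the cancel), while
toy_late_rx_rearms(true) shows the late queue attempt being rejected.
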
Reviewed-by: Ido Schimmel <[email protected]>