On 12/9/25 9:42 AM, Pablo Neira Ayuso wrote:
On Tue, Dec 09, 2025 at 08:57:59AM +0100, Fernando Fernandez Mancera wrote:
On 12/8/25 1:27 PM, Odintsov Vladislav wrote:
On 08.12.2025 15:06, Rukomoinikova Aleksandra wrote:
Hi!
I was testing conntrack limiting using Open vSwitch and noticed the
following issue: under certain limits, a CPU soft lockup occurred.

[  491.682936] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [ovs-dpctl:19437]

This occurs at a high packet rate when trying to read the configured
limits through ovs-dpctl ct-get-limits.

In the trace, I can see that the lockup occurred while trying to acquire
a spinlock.

[  491.683056]  <IRQ>
[  491.683059]  _raw_spin_lock_bh+0x29/0x30
[  491.683064]  count_tree+0x19b/0x1f0 [nf_conncount]
[  491.683069]  ovs_ct_commit+0x196/0x490 [openvswitch]

Prior to this, in the trace, there was processing of a task from
userspace (ovs-dpctl)

[  491.683236]  </IRQ>
[  491.683237]  <TASK>
[  491.683238]  asm_common_interrupt+0x22/0x40
[  491.683240] RIP: 0010:nf_conncount_gc_list+0x18a/0x200 [nf_conncount]

Inside the nf_conncount_gc_list function, a lock is taken at
nf_conncount.c:spin_trylock_bh(&list->list_lock):335. After this, the
not-so-fast __nf_conncount_gc_list function is executed. If, at this
moment, a packet interrupt arrives on the same CPU core (and
spin_trylock_bh doesn't disable interrupts on that core), then the
scenario I encountered occurs: the first lock remains held, while the
packet interrupt also attempts to acquire it at
nf_conncount.c:spin_lock_bh(&rbconn->list.list_lock):502 while
committing to conntrack. This attempt never succeeds, leading to a soft
lockup.


Yes, that makes sense. That nf_conncount_gc_list() was added there to cover a
different scenario which might also be affected by this soft lockup under
the same conditions.

See below, a quick look at the code tells me OVS forgot to disable BH to
perform this GC.

Hence my question: shouldn't we avoid calling nf_conncount_gc_list when
querying limits without an skb (as OVS does in
openvswitch/conntrack.c:1773)? The limit retrieval operation should be
read-only regarding the conntrack state and should not involve potential
modification.

Like this:
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -495,7 +495,6 @@ count_tree(struct net *net,
                int ret;

                if (!skb) {
-                nf_conncount_gc_list(net, &rbconn->list);
                    return rbconn->list.count;
                }


Let me think about this. I would like to provide a solution that is
suitable for OVS + xt/nft_connlimit, because this change would break some
xt_connlimit use-cases. Also, without this nf_conncount_gc_list(), the
connection count wouldn't be accurate: if some connections had already
closed, the count would still include them.

Side note, this particular line only affects OVS, which is the only
caller passing NULL as skb:

net/netfilter/xt_connlimit.c:   connections = nf_conncount_count_skb(net, skb, xt_family(par), info->data, key);
net/openvswitch/conntrack.c:    connections = nf_conncount_count_skb(net, skb, info->family,
net/openvswitch/conntrack.c:    zone_limit.count = nf_conncount_count_skb(net, NULL, 0, data,

Another relevant aspect: nf_conncount_gc_list() is called _without_
disabling BH (before Fernando's recent changes).

You fixed it here, Fernando:

commit c0362b5748282e22fa1592a8d3474f726ad964c2
Author: Fernando Fernandez Mancera <[email protected]>
Date:   Fri Nov 21 01:14:31 2025 +0100
netfilter: nf_conncount: make nf_conncount_gc_list() to disable BH

I think it is only a matter of backporting it to -stable.

That is right, thanks Pablo. Just a note: that commit doesn't have a
Fixes tag because I only did it to simplify its use, so it won't be
picked up automatically. Should we send a request to the stable mailing
list?

Thanks,
Fernando.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
