On Tue, Dec 09, 2025 at 08:57:59AM +0100, Fernando Fernandez Mancera wrote:
> On 12/8/25 1:27 PM, Odintsov Vladislav wrote:
> > On 08.12.2025 15:06, Rukomoinikova Aleksandra wrote:
> > > Hi!
> > > I was testing conntrack limiting using Open vSwitch and noticed the
> > > following issue: under certain limits, a CPU lock occurred.
> > > 
> > > [  491.682936] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [ovs-dpctl:19437]
> > > 
> > > This occurs at a high packet rate when reading the configured limits
> > > with ovs-dpctl ct-get-limits.
> > > 
> > > In the trace, I can see that the lock occurred on attempts to acquire a
> > > spinlock.
> > > 
> > > [  491.683056]  <IRQ>
> > > [  491.683059]  _raw_spin_lock_bh+0x29/0x30
> > > [  491.683064]  count_tree+0x19b/0x1f0 [nf_conncount]
> > > [  491.683069]  ovs_ct_commit+0x196/0x490 [openvswitch]
> > > 
> > > Prior to this, in the trace, there was processing of a task from
> > > userspace (ovs-dpctl)
> > > 
> > > [  491.683236]  </IRQ>
> > > [  491.683237]  <TASK>
> > > [  491.683238]  asm_common_interrupt+0x22/0x40
> > > [  491.683240] RIP: 0010:nf_conncount_gc_list+0x18a/0x200 [nf_conncount]
> > > 
> > > Inside the nf_conncount_gc_list function, a lock is taken on
> > > nf_conncount.c:spin_trylock_bh(&list->list_lock):335. After this, the
> > > not-so-fast __nf_conncount_gc_list function is executed. If, at this
> > > moment, a packet interrupt arrives on the same CPU core (and
> > > spin_trylock_bh doesn't disable interrupts on that core), then the
> > > scenario I encountered occurs: the first lock remains held, while the
> > > packet interrupt also attempts to acquire it at
> > > nf_conncount.c:spin_lock_bh(&rbconn->list.list_lock):502 while
> > > committing to conntrack. This attempt can never succeed, leading to a
> > > soft lockup.
> > > 
> 
> Yes, that makes sense. That nf_conncount_gc_list() call was added there to
> cover a different scenario, which might also be affected by this soft lockup
> under the same conditions.

See below, a quick look tells me OVS forgot to disable BH when
performing this GC.

> > > Hence my question: shouldn't we avoid calling nf_conncount_gc_list when
> > > querying limits without an skb (as OVS does in
> > > openvswitch/conntrack.c:1773)? The limit retrieval operation should be
> > > read-only with respect to the conntrack state and should not involve
> > > potential modification.
> > > 
> > > Like this:
> > > --- a/net/netfilter/nf_conncount.c
> > > +++ b/net/netfilter/nf_conncount.c
> > > @@ -495,7 +495,6 @@ count_tree(struct net *net,
> > >                int ret;
> > > 
> > >                if (!skb) {
> > > -                nf_conncount_gc_list(net, &rbconn->list);
> > >                    return rbconn->list.count;
> > >                }
> > > 
> 
> Let me think about this, I would like to provide a solution that is
> suitable for OVS + xt/nft_connlimit, because this change would break some
> xt_connlimit use-cases. Also, without this nf_conncount_gc_list(), the
> connection count wouldn't be accurate: if some connections had already
> closed, the count would still include them.

Side note, this particular line only affects OVS, which is the only
caller passing NULL as skb:

net/netfilter/xt_connlimit.c:   connections = nf_conncount_count_skb(net, skb, xt_family(par), info->data, key);
net/openvswitch/conntrack.c:    connections = nf_conncount_count_skb(net, skb, info->family,
net/openvswitch/conntrack.c:    zone_limit.count = nf_conncount_count_skb(net, NULL, 0, data,

Another relevant aspect: nf_conncount_gc_list() is called _without_
disabling BH (before Fernando's recent changes).
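
To make the interleaving concrete, here is a toy single-CPU model in
userspace C. It is only a sketch, not kernel code: the toy_* names and
the struct are made up for illustration, and the real locking in
nf_conncount.c is more involved. It models one thing: a softirq that
runs on top of a lock holder on the same CPU can never get the lock,
while a softirq deferred until BH is re-enabled can.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU. Hypothetical names, not kernel structures. */
struct toy_cpu {
	bool lock_held;        /* stands in for list->list_lock */
	int  bh_disabled;      /* stands in for the softirq-disable count */
	bool softirq_pending;  /* a packet softirq is waiting to run */
};

/*
 * Packet path running in softirq context on the same CPU.
 * If the lock is held by the task it interrupted, it would spin forever,
 * because that task cannot run again until the softirq returns.
 */
static bool toy_softirq_would_deadlock(const struct toy_cpu *cpu)
{
	return cpu->lock_held;
}

/*
 * GC path in process context. bh_off says whether BH was disabled
 * around the critical section. Returns true if a packet interrupt
 * arriving in the middle of the GC would end in a deadlock.
 */
static bool toy_gc_list(struct toy_cpu *cpu, bool bh_off)
{
	bool deadlock = false;

	if (bh_off)
		cpu->bh_disabled++;
	cpu->lock_held = true;                 /* trylock succeeded */

	/* A packet interrupt arrives while the list is being walked. */
	cpu->softirq_pending = true;
	if (cpu->bh_disabled == 0) {
		/* Softirq runs at irq exit, on top of the lock holder. */
		cpu->softirq_pending = false;
		deadlock = toy_softirq_would_deadlock(cpu);
	}

	cpu->lock_held = false;                /* unlock */
	if (bh_off) {
		cpu->bh_disabled--;
		if (cpu->bh_disabled == 0 && cpu->softirq_pending) {
			/* Deferred softirq runs now, lock already released. */
			cpu->softirq_pending = false;
			deadlock = deadlock || toy_softirq_would_deadlock(cpu);
		}
	}
	return deadlock;
}
```

In this model, toy_gc_list(&cpu, false) reports the deadlock and
toy_gc_list(&cpu, true) does not, which is the difference between the
old call path and one that disables BH around the GC.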

You fixed it here, Fernando:

commit c0362b5748282e22fa1592a8d3474f726ad964c2
Author: Fernando Fernandez Mancera <[email protected]>
Date:   Fri Nov 21 01:14:31 2025 +0100
 
    netfilter: nf_conncount: make nf_conncount_gc_list() to disable BH

I think it is only a matter of backporting it to -stable.