Pablo Neira Ayuso <pa...@netfilter.org> wrote:
> On Thu, Oct 25, 2018 at 11:56:12PM +0900, Taehee Yoo wrote:
> > conn_free() holds lock with spin_lock(). and it is called by both
> > nf_conncount_lookup() and nf_conncount_gc_list().
> > nf_conncount_lookup() is bottom-half context and nf_conncount_gc_list()
> > is process context. so that spin_lock() is not safe.
> > Hence conn_free() should use spin_lock_bh() instead of spin_lock().
> > 
> > test commands:
> >    %nft add table ip filter
> >    %nft add chain ip filter input { type filter hook input priority 0\; }
> >    %nft add rule filter input meter test { ip saddr ct count over 2 } \
> >        counter
> > 
> > splat looks like:
> > [  461.996507] ================================
> > [  461.998999] WARNING: inconsistent lock state
> > [  461.998999] 4.19.0-rc6+ #22 Not tainted
> > [  461.998999] --------------------------------
> > [  461.998999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> > [  461.998999] kworker/0:2/134 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > [  461.998999] 00000000a71a559a (&(&list->list_lock)->rlock){+.?.}, at: 
> > conn_free+0x69/0x2b0 [nf_conncount]
> > [  461.998999] {IN-SOFTIRQ-W} state was registered at:
> > [  461.998999]   _raw_spin_lock+0x30/0x70
> > [  461.998999]   nf_conncount_add+0x28a/0x520 [nf_conncount]
> > [  461.998999]   nft_connlimit_eval+0x401/0x580 [nft_connlimit]
> > [  461.998999]   nft_dynset_eval+0x32b/0x590 [nf_tables]
> > [  461.998999]   nft_do_chain+0x497/0x1430 [nf_tables]
> > [  461.998999]   nft_do_chain_ipv4+0x255/0x330 [nf_tables]
> > [  461.998999]   nf_hook_slow+0xb1/0x160
> > [ ... ]
> > [  461.998999] other info that might help us debug this:
> > [  461.998999]  Possible unsafe locking scenario:
> > [  461.998999]
> > [  461.998999]        CPU0
> > [  461.998999]        ----
> > [  461.998999]   lock(&(&list->list_lock)->rlock);
> > [  461.998999]   <Interrupt>
> > [  461.998999]     lock(&(&list->list_lock)->rlock);
> > [  461.998999]
> > [  461.998999]  *** DEADLOCK ***
> > [  461.998999]
> > [ ... ]
> 
> nf_conncount_add() also holds spin_lock while allocate from there is
> GFP_ATOMIC given this is called from packet path.

Good catch, yes, this needs spin_lock_bh variant too.

> tree_nodes_free() is also called from user context without _bh
> disabled.

This one is fine, both call sites hold spin_lock_bh(&nf_conncount_locks[x]).

Reply via email to