Hi,
Fedora got a bug report https://bugzilla.redhat.com/show_bug.cgi?id=1459626
of a hang on 4.11.3 with lockdep splat:
[ 129.100206] BUG: sleeping function called from invalid context at
mm/slab.h:432
[ 129.100237] in_atomic(): 1, irqs_disabled(): 0, pid: 1793, name: tc
[ 129.100239] 2 locks held by tc/1793:
[ 129.100241] #0: (rtnl_mutex){+.+.+.}, at: []
rtnl_lock+0x17/0x20
[ 129.100250] #1: (&(>tcfa_lock)->rlock){+...+.}, at:
[] tcf_act_police_init+0x22d/0x580 [act_police]
[ 129.100259] CPU: 2 PID: 1793 Comm: tc Not tainted
4.11.3-202.fc25.x86_64+debug #1
[ 129.100261] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.9.3-1.fc25 04/01/2014
[ 129.100262] Call Trace:
[ 129.100270] dump_stack+0x86/0xc3
[ 129.100274] ___might_sleep+0x17d/0x250
[ 129.100277] __might_sleep+0x4a/0x80
[ 129.100280] kmem_cache_alloc_trace+0x1e6/0x2d0
[ 129.100284] ? gen_new_estimator+0x76/0x1d0
[ 129.100288] gen_new_estimator+0x76/0x1d0
[ 129.100293] gen_replace_estimator+0xe/0x10
[ 129.100296] tcf_act_police_init+0x255/0x580 [act_police]
[ 129.100304] tcf_action_init_1+0xb9/0x340
[ 129.100316] tcf_exts_validate+0x52/0x110
[ 129.100320] fw_change_attrs+0xbb/0x2b0 [cls_fw]
[ 129.100324] ? rcu_read_lock_sched_held+0x4a/0x80
[ 129.100329] fw_change+0x3ad/0x4bb [cls_fw]
[ 129.100338] tc_ctl_tfilter+0x470/0xa80
[ 129.100350] rtnetlink_rcv_msg+0xe6/0x210
[ 129.100354] ? sched_clock+0x9/0x10
[ 129.100358] ? sched_clock_cpu+0x11/0xc0
[ 129.100360] ? netlink_unicast+0x1cd/0x2f0
[ 129.100363] ? rtnl_newlink+0x8e0/0x8e0
[ 129.100366] netlink_rcv_skb+0xa4/0xc0
[ 129.100369] rtnetlink_rcv+0x2a/0x40
[ 129.100371] netlink_unicast+0x1f5/0x2f0
[ 129.100375] netlink_sendmsg+0x302/0x3c0
[ 129.100382] sock_sendmsg+0x38/0x50
[ 129.100385] ___sys_sendmsg+0x2f6/0x310
[ 129.100392] ? _raw_spin_unlock+0x27/0x40
[ 129.100396] ? __handle_mm_fault+0x539/0x1170
[ 129.100401] ? __do_page_fault+0x28e/0x520
[ 129.100405] ? kvm_sched_clock_read+0x25/0x40
[ 129.100407] ? sched_clock+0x9/0x10
[ 129.100409] ? sched_clock_cpu+0x11/0xc0
[ 129.100411] ? __do_page_fault+0x28e/0x520
[ 129.100420] __sys_sendmsg+0x54/0x90
[ 129.100422] ? __sys_sendmsg+0x54/0x90
[ 129.100430] SyS_sendmsg+0x12/0x20
[ 129.100432] entry_SYSCALL_64_fastpath+0x1f/0xc2
[ 129.100434] RIP: 0033:0x7f50dffca037
[ 129.100436] RSP: 002b:7ffe99d12698 EFLAGS: 0246 ORIG_RAX:
002e
[ 129.100438] RAX: ffda RBX: 7ffe99d1e7e0 RCX: 7f50dffca037
[ 129.100439] RDX: RSI: 7ffe99d126e0 RDI: 0003
[ 129.100441] RBP: 7ffe99d126e0 R08: 0001 R09:
[ 129.100442] R10: 05eb R11: 0246 R12: 7ffe99d12720
[ 129.100443] R13: 00668060 R14: 7ffe99d1a7b0 R15:
[ 129.100454] =
[ 129.100455] [ INFO: possible recursive locking detected ]
[ 129.100456] 4.11.3-202.fc25.x86_64+debug #1 Tainted: GW
[ 129.100456] -
[ 129.100457] tc/1793 is trying to acquire lock:
[ 129.100457] (&(>tcfa_lock)->rlock){+...+.}, at: []
est_fetch_counters+0x23/0x50
[ 129.100460]
but task is already holding lock:
[ 129.100460] (&(>tcfa_lock)->rlock){+...+.}, at: []
tcf_act_police_init+0x22d/0x580 [act_police]
[ 129.100463]
other info that might help us debug this:
[ 129.100463] Possible unsafe locking scenario:
[ 129.100464]CPU0
[ 129.100465]
[ 129.100465] lock(&(>tcfa_lock)->rlock);
[ 129.100466] lock(&(>tcfa_lock)->rlock);
[ 129.100467]
*** DEADLOCK ***
[ 129.100468] May be due to missing lock nesting notation
[ 129.100469] 2 locks held by tc/1793:
[ 129.100470] #0: (rtnl_mutex){+.+.+.}, at: []
rtnl_lock+0x17/0x20
[ 129.100472] #1: (&(>tcfa_lock)->rlock){+...+.}, at:
[] tcf_act_police_init+0x22d/0x580 [act_police]
[ 129.100474]
stack backtrace:
[ 129.100475] CPU: 2 PID: 1793 Comm: tc Tainted: GW
4.11.3-202.fc25.x86_64+debug #1
[ 129.100476] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.9.3-1.fc25 04/01/2014
[ 129.100477] Call Trace:
[ 129.100478] dump_stack+0x86/0xc3
[ 129.100480] __lock_acquire+0x858/0x1450
[ 129.100481] ? ___slab_alloc+0x2a9/0x5f0
[ 129.100483] lock_acquire+0xe3/0x1d0
[ 129.100484] ? gen_new_estimator+0x76/0x1d0
[ 129.100486] ? lock_acquire+0xe3/0x1d0
[ 129.100487] ? est_fetch_counters+0x23/0x50
[ 129.100488] _raw_spin_lock+0x34/0x70
[ 129.100489] ? est_fetch_counters+0x23/0x50
[ 129.100491] est_fetch_counters+0x23/0x50
[ 129.100492] gen_new_estimator+0xdf/0x1d0
[ 129.100493] gen_replace_estimator+0xe/0x10
[ 129.100495] tcf_act_police_init+0x255/0x580 [act_police]
[ 129.100496] tcf_action_init_1+0xb9/0x340
[ 129.100499] tcf_exts_validate+0x52/0x110
[ 129.100537]