[PATCH 4.8 14/35] rcu: Fix soft lockup for rcu_nocb_kthread
4.8-stable review patch.  If anyone has any objections, please let me know.

--

From: Ding Tianhong

commit bedc1969150d480c462cdac320fa944b694a7162 upstream.

Carrying out the following steps results in a softlockup in the RCU
callback-offload (rcuo) kthreads:

1. Connect to ixgbevf, and set the speed to 10Gb/s.
2. Use ifconfig to bring the nic up and down repeatedly.

[ 317.005148] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
[ 368.106005] BUG: soft lockup - CPU#1 stuck for 22s! [rcuos/1:15]
[ 368.106005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 368.106005] task: 88057dd8a220 ti: 88057dd9c000 task.ti: 88057dd9c000
[ 368.106005] RIP: 0010:[] [] fib_table_lookup+0x14/0x390
[ 368.106005] RSP: 0018:88061fc83ce8 EFLAGS: 0286
[ 368.106005] RAX: 0001 RBX: 020155c0 RCX: 0001
[ 368.106005] RDX: 88061fc83d50 RSI: 88061fc83d70 RDI: 880036d11a00
[ 368.106005] RBP: 88061fc83d08 R08: 0001 R09:
[ 368.106005] R10: 880036d11a00 R11: 819e0900 R12: 88061fc83c58
[ 368.106005] R13: 816154dd R14: 88061fc83d08 R15: 020155c0
[ 368.106005] FS: () GS:88061fc8() knlGS:
[ 368.106005] CS: 0010 DS: ES: CR0: 80050033
[ 368.106005] CR2: 7f8c2aee9c40 CR3: 00057b222000 CR4: 000407e0
[ 368.106005] DR0: DR1: DR2:
[ 368.106005] DR3: DR6: 0ff0 DR7: 0400
[ 368.106005] Stack:
[ 368.106005] 01c0 88057b766000 8802e380b000 88057af03e00
[ 368.106005] 88061fc83dc0 815349a6 88061fc83d40 814ee146
[ 368.106005] 8802e380af00 e380af00 819e0900 020155c001c0
[ 368.106005] Call Trace:
[ 368.106005]
[ 368.106005]
[ 368.106005] [] ip_route_input_noref+0x516/0xbd0
[ 368.106005] [] ? skb_release_data+0xd6/0x110
[ 368.106005] [] ? kfree_skb+0x3a/0xa0
[ 368.106005] [] ip_rcv_finish+0x29f/0x350
[ 368.106005] [] ip_rcv+0x234/0x380
[ 368.106005] [] __netif_receive_skb_core+0x676/0x870
[ 368.106005] [] __netif_receive_skb+0x18/0x60
[ 368.106005] [] process_backlog+0xae/0x180
[ 368.106005] [] net_rx_action+0x152/0x240
[ 368.106005] [] __do_softirq+0xef/0x280
[ 368.106005] [] call_softirq+0x1c/0x30
[ 368.106005]
[ 368.106005]
[ 368.106005] [] do_softirq+0x65/0xa0
[ 368.106005] [] local_bh_enable+0x94/0xa0
[ 368.106005] [] rcu_nocb_kthread+0x232/0x370
[ 368.106005] [] ? wake_up_bit+0x30/0x30
[ 368.106005] [] ? rcu_start_gp+0x40/0x40
[ 368.106005] [] kthread+0xcf/0xe0
[ 368.106005] [] ? kthread_create_on_node+0x140/0x140
[ 368.106005] [] ret_from_fork+0x58/0x90
[ 368.106005] [] ? kthread_create_on_node+0x140/0x140

==cut here==

It turns out that the rcuos callback-offload kthread is busy processing
a very large quantity of RCU callbacks, and it is not relinquishing the
CPU while doing so.  This commit therefore adds a cond_resched_rcu_qs()
within the loop to allow other tasks to run.

Signed-off-by: Ding Tianhong
[ paulmck: Substituted cond_resched_rcu_qs for cond_resched. ]
Signed-off-by: Paul E. McKenney
Cc: Dhaval Giani
Signed-off-by: Greg Kroah-Hartman

---
 kernel/rcu/tree_plugin.h |    1 +
 1 file changed, 1 insertion(+)

--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2173,6 +2173,7 @@ static int rcu_nocb_kthread(void *arg)
 			cl++;
 			c++;
 			local_bh_enable();
+			cond_resched_rcu_qs();
 			list = next;
 		}
 		trace_rcu_batch_end(rdp->rsp->name, c, !!list, 0, 0, 1);