Re: [tipc-discussion] soft lockup for TIPC
Hi Jon,

I am sorry, but we cannot upgrade to a more recent kernel because we follow the Red Hat release. Can you give us a patch for this issue on the 3.x kernel?

-----Original Message-----
From: Jon Maloy [mailto:jon.ma...@ericsson.com]
Sent: 22 November 2016 0:37
To: XIANG Haiming; tipc-discussion@lists.sourceforge.net
Subject: RE: soft lockup for TIPC

Hi Xiang,

Although the version you are using has the same number (I am planning to step it to 3.0.0 soon) as the current one in the latest kernels (4.x), it is a very different species indeed. Almost all the code has been rewritten, in some cases more than once. I would strongly suggest you upgrade to a more recent kernel if at all possible, as it is very hard for us to maintain TIPC in 3.x kernels. If that is impossible for you, we will have to look into what options we have.

Regards
///jon

> -----Original Message-----
> From: XIANG Haiming [mailto:haiming.xi...@alcatel-sbell.com.cn]
> Sent: Sunday, 20 November, 2016 21:35
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] soft lockup for TIPC
>
> Hi all,
>
> The version of TIPC we use is 2.0.0. The OS info is as follows:
>
> Red Hat Enterprise Linux Server 7.2 (Maipo)
> Kernel 3.10.0-327.18.2.el7.x86_64 on an x86_64
>
> We have hit two soft lockup issues with TIPC; please help us solve them.
> Thank you.
>
> One issue is as follows:
>
> [85502.601198] BUG: soft lockup - CPU#0 stuck for 22s!
[scm:2649] > [85502.603585] Modules linked in: iptable_filter ip6table_mangle xt_limit > iptable_mangle ip6table_filter ip6_tables igb_uio(OE) uio tipc(OE) 8021q garp > stp > mrp llc bonding dm_mirror dm_region_hash dm_log dm_mod ppdev > crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper > ablk_helper cryptd pcspkr i6300esb virtio_balloon cirrus syscopyarea > sysfillrect > sysimgblt ttm drm_kms_helper drm i2c_piix4 i2c_core parport_pc parport > binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache > jbd2 virtio_blk virtio_console crct10dif_pclmul crct10dif_common crc32c_intel > serio_raw ixgbevf virtio_pci virtio_ring virtio ata_generic pata_acpi > ata_piix libata > floppy > [85502.618210] CPU: 0 PID: 2649 Comm: scm Tainted: G OEL > > 3.10.0-327.18.2.el7.x86_64 #1 > [85502.620482] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 > [85502.622361] task: 880405da5080 ti: 8800db68c000 task.ti: > 8800db68c000 > [85502.624400] RIP: 0010:[] [] > _raw_spin_lock_bh+0x3d/0x50 > [85502.626558] RSP: 0018:8800db68fa68 EFLAGS: 0202 > [85502.628412] RAX: 56f6 RBX: 880406600800 RCX: > 712a > [85502.630419] RDX: 712c RSI: 712c RDI: > a03f8548 > [85502.632423] RBP: 8800db68fa70 R08: 02c0 R09: > 0500 > [85502.634421] R10: 88040f001500 R11: 8800db68fc58 R12: > 81519be1 > [85502.636413] R13: 8800db68fa08 R14: R15: > 0240 > [85502.638415] FS: () GS:88041fc0(0063) > knlGS:d1495b40 > [85502.640514] CS: 0010 DS: 002b ES: 002b CR0: 80050033 > [85502.642399] CR2: d37c8208 CR3: 0003cfff7000 CR4: > 000406f0 > [85502.644420] DR0: DR1: DR2: > > [85502.646416] DR3: DR6: 0ff0 DR7: > 0400 > [85502.648393] Stack: > [85502.649916] a03f8548 8800db68fa90 a03e2afb > 880036242c00 > [85502.652019] 880406600800 8800db68fac0 a03e5c1b > 880036242c00 > [85502.654108] 880036b5a484 88003698e800 8800db68fb30 > 8800db68fba8 > [85502.656179] Call Trace: > [85502.657742] [] tipc_bearer_blocked+0x1b/0x30 [tipc] > [85502.659678] [] link_send_buf_fast+0x5b/0xb0 [tipc] > 
[85502.661592] [] tipc_link_send_sections_fast+0xe6/0x630 > [tipc] > [85502.663594] [] ? _raw_read_unlock_bh+0x16/0x20 > [85502.665473] [] ? tipc_nametbl_translate+0xc0/0x1f0 > [tipc] > [85502.667441] [] tipc_send2name+0x155/0x1d0 [tipc] > [85502.669351] [] send_msg+0x1eb/0x530 [tipc] > [85502.671205] [] sock_sendmsg+0xb0/0xf0 > [85502.673028] [] ? futex_wait+0x193/0x280 > [85502.674859] [] SYSC_sendto+0x121/0x1c0 > [85502.676676] [] ? __do_page_fault+0x16d/0x450 > [85502.678540] [] ? poll_select_copy_remaining+0x62/0x130 > [85502.680489] [] SyS_sendto+0xe/0x10 > [85502.682277] [] compat_sys_socketcall+0x173/0x2a0 > [85502.684193] [] sysenter_dispatch+0x7/0x21 > [85502.686046] Code: b8 00 00 02 00 f0 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 > 5b 5d > c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 0f b7 0b 66 39 ca 74 ea f3 90 <83> e8 01 > 75 f1 48 > 89 df 66 66 66 90 66 66 90 eb e0 66 90 66 66 > [85503.372189] BUG: soft lockup - CPU#2 stuck for 23s! [nodemgr:2560] > [85503.374181] Modules linked in: iptable_filter ip6table_mangle xt_limit > iptable_mangle ip6table_filter ip6_tables igb_uio(OE)
Re: [tipc-discussion] v4.7: soft lockup when releasing a socket
Hi Partha,

My test has 4 nodes, 2 of which are alternately rebooted. When a rebooted node rejoins, a few minutes pass and then the other node is rebooted. I am not printing out link stats, and I believe the other code is not doing so either when nodes leave or rejoin.

JT

On Tue, Nov 22, 2016 at 2:22 AM, Parthasarathy Bhuvaragan < parthasarathy.bhuvara...@ericsson.com> wrote:
> Hi,
>
> There is another branch where the softlockup on nametbl_lock occurs:
>
> tipc_named_rcv() Grabs nametbl_lock
>   tipc_update_nametbl() (publish/withdraw)
>     tipc_node_subscribe()/unsubscribe()
>       tipc_node_write_unlock()
>       << lockup occurs if it needs to process NODE UP/DOWN or LINK
>          UP/DOWN, as it grabs nametbl_lock again >>
>
> /Partha
>
> On 11/21/2016 01:04 PM, Parthasarathy Bhuvaragan wrote:
>
>> Hi,
>>
>> tipc_nametbl_withdraw() triggers the softlockup as it tries to grab
>> nametbl_lock twice if the node triggered a TIPC_NOTIFY_LINK_DOWN event
>> while it is running. The erroneous call chain is:
>>
>> tipc_nametbl_withdraw() Grabs nametbl_lock
>>   tipc_named_process_backlog()
>>     tipc_update_nametbl()
>>       if (dtype == WITHDRAWAL) tipc_node_unsubscribe()
>>         tipc_node_write_unlock()
>>           if (flags & TIPC_NOTIFY_LINK_DOWN) tipc_nametbl_withdraw()
>>             spin_lock_bh(&tn->nametbl_lock); << Soft Lockup >>
>>
>> Three callers can cause this under module exit:
>>
>> Case 1:
>> tipc_exit_net()
>>   tipc_nametbl_withdraw() Grabs nametbl_lock
>>
>> Case 2:
>> tipc_server_stop()
>>   tipc_conn_kref_release()
>>     tipc_sock_release()
>>       sock_release()
>>         tipc_release()
>>           tipc_sk_withdraw()
>>             tipc_nametbl_withdraw()
>>
>> Case 3:
>> tipc_server_stop()
>>   tipc_conn_kref_release()
>>     kernel_bind()
>>       tipc_bind()
>>         tipc_sk_withdraw()
>>           tipc_nametbl_withdraw()
>>
>> I will work on a solution for this.
>>
>> What kind of test were you performing when this occurred (link up/down)?
>> Do you read link statistics periodically in your tests?
>>
>> /Partha
>>
>> On 11/21/2016 05:30 AM, John Thompson wrote:
>>
>>> Hi Partha,
>>>
>>> I was doing some more testing today and have still observed the
>>> problem (contrary to what I had emailed earlier).
>>>
>>> Here is the kernel dump.
>>>
>>> <0>NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [Pluggable
>>> Serve:2221]
>>> <6>Modules linked in: tipc jitterentropy_rng echainiv drbg
>>> platform_driver(O)
>>> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P O
>>> <6>task: ae54ced0 ti: aec42000 task.ti: aec42000
>>> <6>NIP: 8069257c LR: c13ebf50 CTR: 80692540
>>> <6>REGS: aec43ad0 TRAP: 0901 Tainted: P O
>>> <6>MSR: 00029002CR: 48002444 XER:
>>> <6>
>>> <6>GPR00: c13ea408 aec43b80 ae54ced0 a624690c a6271d84 a39a60cc
>>> fffd
>>> <6>GPR08: aeefbbc8 0001 0001 0004 80692540
>>> <6>NIP [8069257c] _raw_spin_lock_bh+0x3c/0x70
>>> <6>LR [c13ebf50] tipc_nametbl_unsubscribe+0x50/0x120 [tipc]
>>> <6>Call Trace:
>>> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable)
>>> <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc]
>>> <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc]
>>> <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc]
>>> <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc]
>>> <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc]
>>> <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc]
>>> <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc]
>>> <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc]
>>> <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc]
>>> <6>[aec43d30] [c13f4848] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc]
>>
>> TIPC_CMD_SHOW_LINK_STATS or TIPC_NL_LINK_GET
>>
>>> <6>[aec43d70] [804f29e0] sock_release+0x30/0xf0
>>> <6>[aec43d80] [804f2ab4] sock_close+0x14/0x30
>>> <6>[aec43d90] [80105844] __fput+0x94/0x200
>>> <6>[aec43db0] [8003dca4] task_work_run+0xd4/0x100
>>> <6>[aec43dd0] [80023620] do_exit+0x280/0x980
<6>[aec43e10] [80024c48] do_group_exit+0x48/0xb0 >>> <6>[aec43e30] [80030344] get_signal+0x244/0x4f0 >>> <6>[aec43e80] [80007734] do_signal+0x34/0x1c0 >>> <6>[aec43f30] [800079a8] do_notify_resume+0x68/0x80 >>> <6>[aec43f40] [8000fa1c] do_user_signal+0x74/0xc4 >>> <6>--- interrupt: c00 at 0xf4f3d08 >>> <6>LR = 0xf4f3ce8 >>> <6>Instruction dump: >>> <6>912a0008 3941 7d201828 2c09 40820010 7d40192d 40a2fff0 >>> 7c2004ac >>> <6>2f89 4dbe0020 7c210b78 8123 <2f89> 40befff4 7c421378 >>> 7d201828 >>> <0>Kernel panic - not syncing: softlockup: hung tasks >>> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P O L >>> <6>Call Trace: >>> <6>[aec43930] [80694e20] dump_stack+0x84/0xb0 (unreliable) >>> <6>[aec43940] [80692ca8] panic+0xd8/0x214 >>> <6>[aec439a0] [800a0258]
[tipc-discussion] [PATCH net v2 2/2] tipc: fix nametbl_lock soft lockup at module exit
Commit 333f796235a527 ("tipc: fix a race condition leading to subscriber refcnt bug") reveals a soft lockup while acquiring nametbl_lock.

Before commit 333f796235a527, we called tipc_conn_shutdown() from tipc_close_conn() in the context of tipc_topsrv_stop(). In that context, we are allowed to grab the nametbl_lock.

In commit 333f796235a527, I moved tipc_conn_release() (renamed from tipc_conn_shutdown()) to the connection refcount cleanup. This allows either tipc_nametbl_withdraw() or tipc_topsrv_stop() to perform tipc_sock_release(). Since tipc_exit_net() first calls tipc_topsrv_stop() and only then tipc_nametbl_withdraw(), the chances increase that the latter performs the connection cleanup.

The soft lockup occurs in the call chain of tipc_nametbl_withdraw(), when it performs the tipc_conn_kref_release(): it tries to grab nametbl_lock again while already holding it.

tipc_nametbl_withdraw() grabs nametbl_lock
  tipc_nametbl_remove_publ()
    tipc_subscrp_report_overlap()
      tipc_subscrp_send_event()
        tipc_conn_sendmsg()
        << if (con->flags != CF_CONNECTED) we do conn_put(),
           triggering the cleanup as refcount=0 >>
          tipc_conn_kref_release()
            tipc_sock_release()
              tipc_conn_release()
                tipc_subscrb_delete()
                  tipc_subscrp_delete()
                    tipc_nametbl_unsubscribe() << Soft Lockup >>

Until now, tipc_server_stop() grabs and releases the idr_lock twice for every connection: once to find the connection, and a second time to unset the connection flag, remove it from the list, and decrement the refcount. The above lockup occurs when tipc_nametbl_withdraw() grabs the connection in between the two, thus owning the connection and triggering the cleanup when it decrements the refcount later.

In this commit, we:
- call tipc_nametbl_withdraw() before tipc_topsrv_stop(), to avoid:
  1. the soft lockup;
  2. notifying the subscribers too late, as the topology server might
     already have started shutting down.
- In tipc_server_stop(), we remove all the connections from the connection list within the scope of the idr_lock, to prevent any other thread from finding a connection which has CF_CONNECTED unset in its flags.

Fixes: 333f796235a52727 ("tipc: fix a race condition leading to subscriber refcnt bug")
Signed-off-by: Parthasarathy Bhuvaragan
---
v2: commit message update; cleaned up tipc_server_stop() as per Ying Xue.
---
 net/tipc/core.c   |  4 ++++
 net/tipc/net.c    |  2 --
 net/tipc/server.c | 13 ++++++++-----
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 236b043a4156..3ea1452e2f06 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -93,6 +93,10 @@ static int __net_init tipc_init_net(struct net *net)
 static void __net_exit tipc_exit_net(struct net *net)
 {
+	struct tipc_net *tn = net_generic(net, tipc_net_id);
+
+	tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0,
+			      tn->own_addr);
 	tipc_topsrv_stop(net);
 	tipc_net_stop(net);
 	tipc_bcast_stop(net);
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 28bf4feeb81c..92696cc6e763 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -130,8 +130,6 @@ void tipc_net_stop(struct net *net)
 	if (!tn->own_addr)
 		return;
-	tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0,
-			      tn->own_addr);
 	rtnl_lock();
 	tipc_bearer_stop(net);
 	tipc_node_stop(net);
diff --git a/net/tipc/server.c b/net/tipc/server.c
index 215849ce453d..54c23ecccd26 100644
--- a/net/tipc/server.c
+++ b/net/tipc/server.c
@@ -623,16 +623,19 @@ int tipc_server_start(struct tipc_server *s)
 void tipc_server_stop(struct tipc_server *s)
 {
 	struct tipc_conn *con;
-	int total = 0;
 	int id;

 	spin_lock_bh(&s->idr_lock);
-	for (id = 0; total < s->idr_in_use; id++) {
+	for (id = 0; s->idr_in_use; id++) {
 		con = idr_find(&s->conn_idr, id);
-		if (con) {
-			total++;
+		if (con && test_and_clear_bit(CF_CONNECTED, &con->flags)) {
+			idr_remove(&s->conn_idr, con->conid);
+			s->idr_in_use--;
+			/* release spinlock as kernel_sock_shutdown can
+			 * sleep.
+			 */
 			spin_unlock_bh(&s->idr_lock);
-			tipc_close_conn(con);
+			kernel_sock_shutdown(con->sock, SHUT_RDWR);
+			conn_put(con);
 			spin_lock_bh(&s->idr_lock);
 		}
 	}
--
2.1.4

--
___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
[tipc-discussion] [PATCH net v2 1/2] tipc: fix nametbl_lock soft lockup at node/link events
We trigger a soft lockup by grabbing nametbl_lock twice if the node has a pending node up/down or link up/down event while:

- we process an incoming named message in tipc_named_rcv() and perform
  a tipc_update_nametbl().
- we have pending backlog items in the name distributor queue during a
  nametable update using tipc_nametbl_publish() or
  tipc_nametbl_withdraw().

The associated call chains are:

tipc_named_rcv() Grabs nametbl_lock
  tipc_update_nametbl() (publish/withdraw)
    tipc_node_subscribe()/unsubscribe()
      tipc_node_write_unlock()
      << lockup occurs if an outstanding node/link event exists,
         as we grab nametbl_lock again >>

tipc_nametbl_withdraw() Grabs nametbl_lock
  tipc_named_process_backlog()
    tipc_update_nametbl()
      << rest as above >>

The function tipc_node_write_unlock(), in addition to releasing the lock, processes the outstanding node/link up/down events. To do this, it needs to grab the nametbl_lock again, leading to the lockup.

In this commit, we fix the soft lockup by introducing a fast variant of node_unlock() that just releases the lock. We adapt tipc_node_subscribe()/tipc_node_unsubscribe() to use the fast variant.
Signed-off-by: Parthasarathy Bhuvaragan
---
 net/tipc/node.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index 9d2f4c2b08ab..27753325e06e 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -263,6 +263,11 @@ static void tipc_node_write_lock(struct tipc_node *n)
 	write_lock_bh(&n->lock);
 }

+static void tipc_node_write_unlock_fast(struct tipc_node *n)
+{
+	write_unlock_bh(&n->lock);
+}
+
 static void tipc_node_write_unlock(struct tipc_node *n)
 {
 	struct net *net = n->net;
@@ -417,7 +422,7 @@ void tipc_node_subscribe(struct net *net, struct list_head *subscr, u32 addr)
 	}
 	tipc_node_write_lock(n);
 	list_add_tail(subscr, &n->publ_list);
-	tipc_node_write_unlock(n);
+	tipc_node_write_unlock_fast(n);
 	tipc_node_put(n);
 }
@@ -435,7 +440,7 @@ void tipc_node_unsubscribe(struct net *net, struct list_head *subscr, u32 addr)
 	}
 	tipc_node_write_lock(n);
 	list_del_init(subscr);
-	tipc_node_write_unlock(n);
+	tipc_node_write_unlock_fast(n);
 	tipc_node_put(n);
 }
--
2.1.4
Re: [tipc-discussion] [PATCH net-next 3/3] tipc: reduce risk of user starvation during link congestion
I am having some new doubts about our current link congestion criteria. See below. ///jon > -Original Message- > From: Jon Maloy [mailto:jon.ma...@ericsson.com] > Sent: Monday, 21 November, 2016 09:57 > To: tipc-discussion@lists.sourceforge.net; Parthasarathy Bhuvaragan >; Ying Xue > ; Jon Maloy > Cc: ma...@donjonn.com; thompa@gmail.com > Subject: [PATCH net-next 3/3] tipc: reduce risk of user starvation during link > congestion > > The socket code currently handles link congestion by either blocking > and trying to send again when the congestion has abated, or just > returning to the user with -EAGAIN and let him re-try later. > > This mechanism is prone to starvation, because the wakeup algorithm is > non-atomic. During the time the link issues a wakeup signal, until the > socket wakes up and re-attempts sending, other senders may have come > in between and occupied the free buffer space in the link. This in turn > may lead to a socket having to make many send attempts before it is > successful. In extremely loaded systems we have observed latency times > of several seconds before a low-priority socket is able to send out a > message. > > In this commit, we simplify this mechanism and reduce the risk of the > described scenario happening. When a message is sent to a congested > link, we now let it keep the message in the wakeup-item that it has to > create anyway, and immediately add it to the link's send queue when > enough space has been freed up. Only when this is done do we issue a > wakeup signal to the socket, which can now immediately go on and send > the next message, if any. > > The fact that a socket now can consider a message sent even when the > link returns a congestion code means that the sending socket code can > be simplified. Also, since this is a good opportunity to get rid of the > obsolete 'mtu change' condition in the three socket send functions, we > now choose to refactor those functions completely. 
>
> Signed-off-by: Jon Maloy
> ---
>  net/tipc/bcast.c  |   2 +-
>  net/tipc/link.c   |  90 --
>  net/tipc/msg.h    |   8 +-
>  net/tipc/node.c   |   2 +-
>  net/tipc/socket.c | 346 --
>  5 files changed, 209 insertions(+), 239 deletions(-)
>
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index aa1babb..1a56cab 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -174,7 +174,7 @@ static void tipc_bcbase_xmit(struct net *net, struct sk_buff_head *xmitq)
>   *    and to identified node local sockets
>   * @net: the applicable net namespace
>   * @list: chain of buffers containing message
> - * Consumes the buffer chain, except when returning -ELINKCONG
> + * Consumes the buffer chain.
>   * Returns 0 if success, otherwise errno: -ELINKCONG,-EHOSTUNREACH,-EMSGSIZE
>   */
>  int tipc_bcast_xmit(struct net *net, struct sk_buff_head *list)
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 1055164..db3dcab 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -774,60 +774,77 @@ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq)
>  	return rc;
>  }
>
> +static bool tipc_link_congested(struct tipc_link *l, int imp)
> +{
> +	int i;
> +
> +	if (skb_queue_empty(&l->backlogq))
> +		return false;
> +
> +	/* Match msg importance against this and all higher backlog limits: */
> +	for (i = imp; i <= TIPC_SYSTEM_IMPORTANCE; i++) {
> +		if (unlikely(l->backlog[i].len >= l->backlog[i].limit))
> +			return true;
> +	}
> +	return false;
> +}

After introducing this patch I ran into a problem with spontaneous and premature connect disconnects in the ptty test program. After some troubleshooting I realized that with the new algorithm, high-priority shutdown messages sometimes bypass messages which are reported "delivered" but are still waiting in the wakeup pending queue. This was easy to fix, by adding a potentially blocking tipc_wait_for_cond() call in __tipc_shutdown().
But I also realized that we may have (and always have had) another starvation scenario, which is not addressed by this patch. Gergely's algorithm (above) ensures that a socket cannot be starved by lower- or equal-priority users, but starvation may still be caused by higher-priority users. While a lower-prio message is waiting in the wakeup queue, higher-prio users may drive their own levels into congestion, potentially stopping the lower-prio user from sending its message for a long time.

Given the current patch, wouldn't it be better to go back to the algorithm I originally suggested, where we check only the sender's own level and nothing else? At a given level, messages would then be delivered in strict first-come-first-served fashion, and no starvation could occur.

What do you think?

///jon
[tipc-discussion] [PATCH net-next 3/3] tipc: reduce risk of user starvation during link congestion
The socket code currently handles link congestion by either blocking and trying to send again when the congestion has abated, or just returning to the user with -EAGAIN and let him re-try later. This mechanism is prone to starvation, because the wakeup algorithm is non-atomic. During the time the link issues a wakeup signal, until the socket wakes up and re-attempts sending, other senders may have come in between and occupied the free buffer space in the link. This in turn may lead to a socket having to make many send attempts before it is successful. In extremely loaded systems we have observed latency times of several seconds before a low-priority socket is able to send out a message. In this commit, we simplify this mechanism and reduce the risk of the described scenario happening. When a message is sent to a congested link, we now let it keep the message in the wakeup-item that it has to create anyway, and immediately add it to the link's send queue when enough space has been freed up. Only when this is done do we issue a wakeup signal to the socket, which can now immediately go on and send the next message, if any. The fact that a socket now can consider a message sent even when the link returns a congestion code means that the sending socket code can be simplified. Also, since this is a good opportunity to get rid of the obsolete 'mtu change' condition in the three socket send functions, we now choose to refactor those functions completely. 
Signed-off-by: Jon Maloy
---
 net/tipc/bcast.c  |   2 +-
 net/tipc/link.c   |  90 --
 net/tipc/msg.h    |   8 +-
 net/tipc/node.c   |   2 +-
 net/tipc/socket.c | 346 --
 5 files changed, 209 insertions(+), 239 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index aa1babb..1a56cab 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -174,7 +174,7 @@ static void tipc_bcbase_xmit(struct net *net, struct sk_buff_head *xmitq)
  *    and to identified node local sockets
  * @net: the applicable net namespace
  * @list: chain of buffers containing message
- * Consumes the buffer chain, except when returning -ELINKCONG
+ * Consumes the buffer chain.
  * Returns 0 if success, otherwise errno: -ELINKCONG,-EHOSTUNREACH,-EMSGSIZE
  */
 int tipc_bcast_xmit(struct net *net, struct sk_buff_head *list)
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 1055164..db3dcab 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -774,60 +774,77 @@ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq)
 	return rc;
 }

+static bool tipc_link_congested(struct tipc_link *l, int imp)
+{
+	int i;
+
+	if (skb_queue_empty(&l->backlogq))
+		return false;
+
+	/* Match msg importance against this and all higher backlog limits: */
+	for (i = imp; i <= TIPC_SYSTEM_IMPORTANCE; i++) {
+		if (unlikely(l->backlog[i].len >= l->backlog[i].limit))
+			return true;
+	}
+	return false;
+}
+
 /**
  * link_schedule_user - schedule a message sender for wakeup after congestion
- * @link: congested link
+ * @l: congested link
  * @list: message that was attempted sent
  * Create pseudo msg to send back to user when congestion abates
- * Does not consume buffer list
+ * Consumes buffer list
  */
-static int link_schedule_user(struct tipc_link *link, struct sk_buff_head *list)
+static int link_schedule_user(struct tipc_link *l, struct sk_buff_head *list)
 {
-	struct tipc_msg *msg = buf_msg(skb_peek(list));
-	int imp = msg_importance(msg);
-	u32 oport = msg_origport(msg);
-	u32 addr = tipc_own_addr(link->net);
+	struct tipc_msg *hdr = buf_msg(skb_peek(list));
+	int imp = msg_importance(hdr);
+	u32 oport = msg_origport(hdr);
+	u32 dnode = tipc_own_addr(l->net);
 	struct sk_buff *skb;
+	struct sk_buff_head *pkts;

 	/* This really cannot happen...  */
 	if (unlikely(imp > TIPC_CRITICAL_IMPORTANCE)) {
-		pr_warn("%s<%s>, send queue full", link_rst_msg, link->name);
+		pr_warn("%s<%s>, send queue full", link_rst_msg, l->name);
 		return -ENOBUFS;
 	}
-	/* Non-blocking sender: */
-	if (TIPC_SKB_CB(skb_peek(list))->wakeup_pending)
-		return -ELINKCONG;

 	/* Create and schedule wakeup pseudo message */
 	skb = tipc_msg_create(SOCK_WAKEUP, 0, INT_H_SIZE, 0,
-			      addr, addr, oport, 0, 0);
+			      dnode, l->addr, oport, 0, 0);
 	if (!skb)
 		return -ENOBUFS;
-	TIPC_SKB_CB(skb)->chain_sz = skb_queue_len(list);
-	TIPC_SKB_CB(skb)->chain_imp = imp;
-	skb_queue_tail(&link->wakeupq, skb);
-	link->stats.link_congs++;
+	msg_set_dest_droppable(buf_msg(skb), true);
+	skb_queue_tail(&l->wakeupq, skb);
+
+	/* Keep the packet chain until we can send it */
[tipc-discussion] [PATCH net-next 2/3] tipc: modify struct tipc_plist to be more versatile
During multicast reception we currently use a simple linked list with push/pop semantics to store port numbers. We now see a need for a more generic list for storing values of type u32. We therefore make some modifications to this list, while replacing the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new functions which will come to use in the next commits.

Signed-off-by: Jon Maloy
---
 net/tipc/name_table.c | 100 --
 net/tipc/name_table.h |  21 ---
 net/tipc/socket.c     |   8 ++--
 3 files changed, 83 insertions(+), 46 deletions(-)

diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index e190460..5a86df1 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -608,7 +608,7 @@ u32 tipc_nametbl_translate(struct net *net, u32 type, u32 instance,
  * Returns non-zero if any off-node ports overlap
  */
 int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
-			      u32 limit, struct tipc_plist *dports)
+			      u32 limit, struct list_head *dports)
 {
 	struct name_seq *seq;
 	struct sub_seq *sseq;
@@ -633,7 +633,7 @@ int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
 		info = sseq->info;
 		list_for_each_entry(publ, &info->node_list, node_list) {
 			if (publ->scope <= limit)
-				tipc_plist_push(dports, publ->ref);
+				u32_push(dports, publ->ref);
 		}

 		if (info->cluster_list_size != info->node_list_size)
@@ -1022,40 +1022,84 @@ int tipc_nl_name_table_dump(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }

-void tipc_plist_push(struct tipc_plist *pl, u32 port)
+struct u32_item {
+	struct list_head list;
+	u32 value;
+};
+
+bool u32_find(struct list_head *l, u32 value)
 {
-	struct tipc_plist *nl;
+	struct u32_item *item;

-	if (likely(!pl->port)) {
-		pl->port = port;
-		return;
+	list_for_each_entry(item, l, list) {
+		if (item->value == value)
+			return true;
 	}
-	if (pl->port == port)
-		return;
-	list_for_each_entry(nl, &pl->list, list) {
-		if (nl->port == port)
-			return;
+	return false;
+}
+
+bool u32_push(struct list_head *l, u32 value)
+{
+	struct u32_item *item;
+
+	list_for_each_entry(item, l, list) {
+		if (item->value == value)
+			return false;
+	}
+	item = kmalloc(sizeof(*item), GFP_ATOMIC);
+	if (unlikely(!item))
+		return false;
+
+	item->value = value;
+	list_add(&item->list, l);
+	return true;
+}
+
+u32 u32_pop(struct list_head *l)
+{
+	struct u32_item *item;
+	u32 value = 0;
+
+	if (list_empty(l))
+		return 0;
+	item = list_first_entry(l, typeof(*item), list);
+	value = item->value;
+	list_del(&item->list);
+	kfree(item);
+	return value;
+}
+
+bool u32_del(struct list_head *l, u32 value)
+{
+	struct u32_item *item, *tmp;
+
+	list_for_each_entry_safe(item, tmp, l, list) {
+		if (item->value != value)
+			continue;
+		list_del(&item->list);
+		kfree(item);
+		return true;
 	}
-	nl = kmalloc(sizeof(*nl), GFP_ATOMIC);
-	if (nl) {
-		nl->port = port;
-		list_add(&nl->list, &pl->list);
+	return false;
+}
+
+void u32_list_purge(struct list_head *l)
+{
+	struct u32_item *item, *tmp;
+
+	list_for_each_entry_safe(item, tmp, l, list) {
+		list_del(&item->list);
+		kfree(item);
 	}
 }

-u32 tipc_plist_pop(struct tipc_plist *pl)
+int u32_list_len(struct list_head *l)
 {
-	struct tipc_plist *nl;
-	u32 port = 0;
+	struct u32_item *item;
+	int i = 0;

-	if (likely(list_empty(&pl->list))) {
-		port = pl->port;
-		pl->port = 0;
-		return port;
+	list_for_each_entry(item, l, list) {
+		i++;
 	}
-	nl = list_first_entry(&pl->list, typeof(*nl), list);
-	port = nl->port;
-	list_del(&nl->list);
-	kfree(nl);
-	return port;
+	return i;
 }
diff --git a/net/tipc/name_table.h b/net/tipc/name_table.h
index 1524a73..c89bb3f 100644
--- a/net/tipc/name_table.h
+++ b/net/tipc/name_table.h
@@ -99,7 +99,7 @@ int tipc_nl_name_table_dump(struct sk_buff *skb, struct netlink_callback *cb);
 u32 tipc_nametbl_translate(struct net *net, u32 type, u32 instance, u32 *node);
 int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
-			      u32 limit, struct tipc_plist *dports);
+			      u32 limit, struct list_head
[tipc-discussion] [PATCH net-next 1/3] tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() functions
The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very
similar. The latter function is also called from two locations, and
there will be more in the coming commits, which will all need to test on
different conditions. Instead of making yet another duplicate of the
function, we now introduce a new macro tipc_wait_for_cond() where the
wakeup condition can be stated as an argument to the call. This macro
replaces all current and future uses of the two functions, which can now
be eliminated.

Signed-off-by: Jon Maloy
---
 net/tipc/socket.c | 108 +++++++++++++++++++++-------------------------
 1 file changed, 49 insertions(+), 59 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 4916d8f..f084430 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -110,7 +110,6 @@ static void tipc_write_space(struct sock *sk);
 static void tipc_sock_destruct(struct sock *sk);
 static int tipc_release(struct socket *sock);
 static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags);
-static int tipc_wait_for_sndmsg(struct socket *sock, long *timeo_p);
 static void tipc_sk_timeout(unsigned long data);
 static int tipc_sk_publish(struct tipc_sock *tsk, uint scope,
                            struct tipc_name_seq const *seq);
@@ -334,6 +333,49 @@ static int tipc_set_sk_state(struct sock *sk, int state)
        return res;
 }
 
+static int tipc_sk_sock_err(struct socket *sock, long *timeout)
+{
+       struct sock *sk = sock->sk;
+       int err = sock_error(sk);
+       int typ = sock->type;
+
+       if (err)
+               return err;
+       if (typ == SOCK_STREAM || typ == SOCK_SEQPACKET) {
+               if (sk->sk_state == TIPC_DISCONNECTING)
+                       return -EPIPE;
+               else if (!tipc_sk_connected(sk))
+                       return -ENOTCONN;
+       } else if (sk->sk_shutdown & SEND_SHUTDOWN) {
+               return -EPIPE;
+       }
+       if (!*timeout)
+               return -EAGAIN;
+       if (signal_pending(current))
+               return sock_intr_errno(*timeout);
+
+       return 0;
+}
+
+#define tipc_wait_for_cond(sock_, timeout_, condition_)                        \
+({                                                                             \
+       int rc_ = 0;                                                            \
+       int done_ = 0;                                                          \
+                                                                               \
+       while (!(condition_) && !done_) {                                       \
+               struct sock *sk_ = (sock_)->sk;                                 \
+               DEFINE_WAIT_FUNC(wait_, woken_wake_function);                   \
+                                                                               \
+               rc_ = tipc_sk_sock_err(sock_, timeout_);                        \
+               if (rc_)                                                        \
+                       break;                                                  \
+               prepare_to_wait(sk_sleep(sk_), &wait_, TASK_INTERRUPTIBLE);     \
+               done_ = sk_wait_event(sk_, timeout_, (condition_), &wait_);     \
+               remove_wait_queue(sk_sleep(sk_), &wait_);                       \
+       }                                                                       \
+       rc_;                                                                    \
+})
+
 /**
  * tipc_sk_create - create a TIPC socket
  * @net: network namespace (must be default network)
@@ -719,7 +761,7 @@ static int tipc_sendmcast(struct socket *sock, struct tipc_name_seq *seq,
                if (rc == -ELINKCONG) {
                        tsk->link_cong = 1;
-                       rc = tipc_wait_for_sndmsg(sock, &timeo);
+                       rc = tipc_wait_for_cond(sock, &timeo, !tsk->link_cong);
                        if (!rc)
                                continue;
                }
@@ -828,31 +870,6 @@ static void tipc_sk_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
        kfree_skb(skb);
 }
 
-static int tipc_wait_for_sndmsg(struct socket *sock, long *timeo_p)
-{
-       DEFINE_WAIT_FUNC(wait, woken_wake_function);
-       struct sock *sk = sock->sk;
-       struct tipc_sock *tsk = tipc_sk(sk);
-       int done;
-
-       do {
-               int err = sock_error(sk);
-               if (err)
-                       return err;
-               if (sk->sk_shutdown & SEND_SHUTDOWN)
-                       return -EPIPE;
-               if (!*timeo_p)
-                       return -EAGAIN;
-               if (signal_pending(current))
-                       return sock_intr_errno(*timeo_p);
-
-               add_wait_queue(sk_sleep(sk), &wait);
-               done = sk_wait_event(sk, timeo_p, !tsk->link_cong, &wait);
-               remove_wait_queue(sk_sleep(sk), &wait);
-       } while (!done);
-       return 0;
-}
-
 /**
  * tipc_sendmsg - send message in connectionless manner
  * @sock: socket structure
@@ -968,7 +985,7 @@ static int
[tipc-discussion] [PATCH net-next 0/3] tipc: improve interaction socket-link
We fix a very real starvation problem that may occur when the link
level runs into send buffer congestion. At the same time we make the
interaction between the socket and link layer simpler and more
consistent.

Jon Maloy (3):
  tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg()
    functions
  tipc: modify struct tipc_plist to be more versatile
  tipc: reduce risk of user starvation during link congestion

 net/tipc/bcast.c      |   2 +-
 net/tipc/link.c       |  90 ++++----
 net/tipc/msg.h        |   8 +-
 net/tipc/name_table.c | 100 ++++----
 net/tipc/name_table.h |  21 +-
 net/tipc/node.c       |   2 +-
 net/tipc/socket.c     | 448 ++++++++++++++++++++----------------------
 7 files changed, 334 insertions(+), 337 deletions(-)

-- 
2.7.4

___
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion
Re: [tipc-discussion] v4.7: soft lockup when releasing a socket
Hi,

There is another code path where a soft lockup on nametbl_lock occurs:

tipc_named_rcv()                        Grabs nametbl_lock
  tipc_update_nametbl()                 (publish/withdraw)
    tipc_node_subscribe()/unsubscribe()
      tipc_node_write_unlock()
        << lockup occurs if it needs to process NODE UP/DOWN or
           LINK UP/DOWN, as it grabs nametbl_lock again >>

/Partha

On 11/21/2016 01:04 PM, Parthasarathy Bhuvaragan wrote:
> Hi,
>
> tipc_nametbl_withdraw() triggers the soft lockup as it tries to grab
> nametbl_lock twice if the node triggered a TIPC_NOTIFY_LINK_DOWN event
> while it is running. The erroneous call chain is:
>
> tipc_nametbl_withdraw()               Grabs nametbl_lock
>   tipc_named_process_backlog()
>     tipc_update_nametbl()
>       if (dtype == WITHDRAWAL) tipc_node_unsubscribe()
>         tipc_node_write_unlock()
>           if (flags & TIPC_NOTIFY_LINK_DOWN) tipc_nametbl_withdraw()
>             spin_lock_bh(&tn->nametbl_lock);  << Soft Lockup >>
>
> Three callers which can cause this under module exit:
>
> Case 1:
> tipc_exit_net()
>   tipc_nametbl_withdraw()             Grabs nametbl_lock
>
> Case 2:
> tipc_server_stop()
>   tipc_conn_kref_release()
>     tipc_sock_release()
>       sock_release()
>         tipc_release()
>           tipc_sk_withdraw()
>             tipc_nametbl_withdraw()
>
> Case 3:
> tipc_server_stop()
>   tipc_conn_kref_release()
>     kernel_bind()
>       tipc_bind()
>         tipc_sk_withdraw()
>           tipc_nametbl_withdraw()
>
> I will work on a solution for this.
>
> What kind of test were you performing when this occurred (link up/down)?
> Do you read link statistics periodically in your tests?
>
> /Partha
>
> On 11/21/2016 05:30 AM, John Thompson wrote:
>> Hi Partha,
>>
>> I was doing some more testing today and have still observed the
>> problem (contrary to what I had emailed earlier).
>>
>> Here is the kernel dump.
>>
>> <0>NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[Pluggable >> Serve:2221] >> <6>Modules linked in: tipc jitterentropy_rng echainiv drbg >> platform_driver(O) >> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P O >> <6>task: ae54ced0 ti: aec42000 task.ti: aec42000 >> <6>NIP: 8069257c LR: c13ebf50 CTR: 80692540 >> <6>REGS: aec43ad0 TRAP: 0901 Tainted: P O >> <6>MSR: 00029002CR: 48002444 XER: >> <6> >> <6>GPR00: c13ea408 aec43b80 ae54ced0 a624690c a6271d84 a39a60cc >> fffd >> <6>GPR08: aeefbbc8 0001 0001 0004 80692540 >> <6>NIP [8069257c] _raw_spin_lock_bh+0x3c/0x70 >> <6>LR [c13ebf50] tipc_nametbl_unsubscribe+0x50/0x120 [tipc] >> <6>Call Trace: >> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable) >> <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc] >> <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc] >> <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc] >> <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc] >> <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc] >> <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc] >> <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc] >> <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc] >> <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc] >> <6>[aec43d30] [c13f4848] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc] > TIPC_CMD_SHOW_LINK_STATS or TIPC_NL_LINK_GET >> <6>[aec43d70] [804f29e0] sock_release+0x30/0xf0 >> <6>[aec43d80] [804f2ab4] sock_close+0x14/0x30 >> <6>[aec43d90] [80105844] __fput+0x94/0x200 >> <6>[aec43db0] [8003dca4] task_work_run+0xd4/0x100 >> <6>[aec43dd0] [80023620] do_exit+0x280/0x980 >> <6>[aec43e10] [80024c48] do_group_exit+0x48/0xb0 >> <6>[aec43e30] [80030344] get_signal+0x244/0x4f0 >> <6>[aec43e80] [80007734] do_signal+0x34/0x1c0 >> <6>[aec43f30] [800079a8] do_notify_resume+0x68/0x80 >> <6>[aec43f40] [8000fa1c] do_user_signal+0x74/0xc4 >> <6>--- interrupt: c00 at 0xf4f3d08 >> <6>LR = 0xf4f3ce8 >> <6>Instruction dump: >> 
<6>912a0008 3941 7d201828 2c09 40820010 7d40192d 40a2fff0 7c2004ac >> <6>2f89 4dbe0020 7c210b78 8123 <2f89> 40befff4 7c421378 >> 7d201828 >> <0>Kernel panic - not syncing: softlockup: hung tasks >> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P O L >> <6>Call Trace: >> <6>[aec43930] [80694e20] dump_stack+0x84/0xb0 (unreliable) >> <6>[aec43940] [80692ca8] panic+0xd8/0x214 >> <6>[aec439a0] [800a0258] watchdog_timer_fn+0x2d8/0x2e0 >> <6>[aec439f0] [8007ae58] __hrtimer_run_queues+0x118/0x1d0 >> <6>[aec43a30] [8007b608] hrtimer_interrupt+0xd8/0x270 >> <6>[aec43a80] [8000983c] __timer_interrupt+0xac/0x1b0 >> <6>[aec43aa0] [80009b70] timer_interrupt+0xb0/0xe0 >> <6>[aec43ac0] [8000f450] ret_from_except+0x0/0x18 >> <6>--- interrupt: 901 at _raw_spin_lock_bh+0x3c/0x70 >> <6>LR = tipc_nametbl_unsubscribe+0x50/0x120 [tipc] >> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable) >> <6>[aec43ba0] [c13ea408]
Re: [tipc-discussion] v4.7: soft lockup when releasing a socket
Hi,

tipc_nametbl_withdraw() triggers the soft lockup as it tries to grab
nametbl_lock twice if the node triggered a TIPC_NOTIFY_LINK_DOWN event
while it is running. The erroneous call chain is:

tipc_nametbl_withdraw()                Grabs nametbl_lock
  tipc_named_process_backlog()
    tipc_update_nametbl()
      if (dtype == WITHDRAWAL) tipc_node_unsubscribe()
        tipc_node_write_unlock()
          if (flags & TIPC_NOTIFY_LINK_DOWN) tipc_nametbl_withdraw()
            spin_lock_bh(&tn->nametbl_lock);  << Soft Lockup >>

Three callers which can cause this under module exit:

Case 1:
tipc_exit_net()
  tipc_nametbl_withdraw()              Grabs nametbl_lock

Case 2:
tipc_server_stop()
  tipc_conn_kref_release()
    tipc_sock_release()
      sock_release()
        tipc_release()
          tipc_sk_withdraw()
            tipc_nametbl_withdraw()

Case 3:
tipc_server_stop()
  tipc_conn_kref_release()
    kernel_bind()
      tipc_bind()
        tipc_sk_withdraw()
          tipc_nametbl_withdraw()

I will work on a solution for this.

What kind of test were you performing when this occurred (link up/down)?
Do you read link statistics periodically in your tests?

/Partha

On 11/21/2016 05:30 AM, John Thompson wrote:
> Hi Partha,
>
> I was doing some more testing today and have still observed the
> problem (contrary to what I had emailed earlier).
>
> Here is the kernel dump.
>
> <0>NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[Pluggable > Serve:2221] > <6>Modules linked in: tipc jitterentropy_rng echainiv drbg > platform_driver(O) > <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P O > <6>task: ae54ced0 ti: aec42000 task.ti: aec42000 > <6>NIP: 8069257c LR: c13ebf50 CTR: 80692540 > <6>REGS: aec43ad0 TRAP: 0901 Tainted: P O > <6>MSR: 00029002CR: 48002444 XER: > <6> > <6>GPR00: c13ea408 aec43b80 ae54ced0 a624690c a6271d84 a39a60cc > fffd > <6>GPR08: aeefbbc8 0001 0001 0004 80692540 > <6>NIP [8069257c] _raw_spin_lock_bh+0x3c/0x70 > <6>LR [c13ebf50] tipc_nametbl_unsubscribe+0x50/0x120 [tipc] > <6>Call Trace: > <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable) > <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc] > <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc] > <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc] > <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc] > <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc] > <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc] > <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc] > <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc] > <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc] > <6>[aec43d30] [c13f4848] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc] TIPC_CMD_SHOW_LINK_STATS or TIPC_NL_LINK_GET > <6>[aec43d70] [804f29e0] sock_release+0x30/0xf0 > <6>[aec43d80] [804f2ab4] sock_close+0x14/0x30 > <6>[aec43d90] [80105844] __fput+0x94/0x200 > <6>[aec43db0] [8003dca4] task_work_run+0xd4/0x100 > <6>[aec43dd0] [80023620] do_exit+0x280/0x980 > <6>[aec43e10] [80024c48] do_group_exit+0x48/0xb0 > <6>[aec43e30] [80030344] get_signal+0x244/0x4f0 > <6>[aec43e80] [80007734] do_signal+0x34/0x1c0 > <6>[aec43f30] [800079a8] do_notify_resume+0x68/0x80 > <6>[aec43f40] [8000fa1c] do_user_signal+0x74/0xc4 > <6>--- interrupt: c00 at 0xf4f3d08 > <6>LR = 0xf4f3ce8 > <6>Instruction dump: > <6>912a0008 3941 7d201828 2c09 40820010 7d40192d 
40a2fff0 7c2004ac > <6>2f89 4dbe0020 7c210b78 8123 <2f89> 40befff4 7c421378 > 7d201828 > <0>Kernel panic - not syncing: softlockup: hung tasks > <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P O L > <6>Call Trace: > <6>[aec43930] [80694e20] dump_stack+0x84/0xb0 (unreliable) > <6>[aec43940] [80692ca8] panic+0xd8/0x214 > <6>[aec439a0] [800a0258] watchdog_timer_fn+0x2d8/0x2e0 > <6>[aec439f0] [8007ae58] __hrtimer_run_queues+0x118/0x1d0 > <6>[aec43a30] [8007b608] hrtimer_interrupt+0xd8/0x270 > <6>[aec43a80] [8000983c] __timer_interrupt+0xac/0x1b0 > <6>[aec43aa0] [80009b70] timer_interrupt+0xb0/0xe0 > <6>[aec43ac0] [8000f450] ret_from_except+0x0/0x18 > <6>--- interrupt: 901 at _raw_spin_lock_bh+0x3c/0x70 > <6>LR = tipc_nametbl_unsubscribe+0x50/0x120 [tipc] > <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable) > <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc] > <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc] > <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc] > <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc] > <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc] > <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc] > <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc] > <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc] > <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc] >