Re: [tipc-discussion] soft lockup for TIPC

2016-11-21 Thread XIANG Haiming
Hi Jon,

I am sorry that we cannot upgrade to a more recent kernel because we follow the 
Red Hat release.

Can you give us a patch for this issue for the 3.x kernel?


-Original Message-
From: Jon Maloy [mailto:jon.ma...@ericsson.com] 
Sent: 22 November, 2016 0:37
To: XIANG Haiming; tipc-discussion@lists.sourceforge.net
Subject: RE: soft lockup for TIPC

Hi Xiang,
Although the version you are using has the same number (I am planning to step 
it to 3.0.0 soon) as the current one in the latest kernels (4.x), it is a very 
different species indeed. Almost all code has been rewritten, and in some cases 
more than once. I would strongly suggest you upgrade to a more recent kernel if 
ever possible, as it is very hard for us to maintain TIPC in 3.x kernels. 

If that is impossible for you, we will have to look into what options we have.

Regards
///jon


> -Original Message-
> From: XIANG Haiming [mailto:haiming.xi...@alcatel-sbell.com.cn]
> Sent: Sunday, 20 November, 2016 21:35
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] soft lockup for TIPC
> 
> Hi all,
> 
> The TIPC version we use is 2.0.0.
> The OS info is as follows:
> 
> Red Hat Enterprise Linux Server 7.2 (Maipo)
> Kernel 3.10.0-327.18.2.el7.x86_64 on an x86_64
> 
> We have encountered two soft lockup issues with TIPC. Please help us solve them.
> Thank you
> 
> One issue is as follow:
> 
> 
> [85502.601198] BUG: soft lockup - CPU#0 stuck for 22s! [scm:2649]
> [85502.603585] Modules linked in: iptable_filter ip6table_mangle xt_limit
> iptable_mangle ip6table_filter ip6_tables igb_uio(OE) uio tipc(OE) 8021q garp 
> stp
> mrp llc bonding dm_mirror dm_region_hash dm_log dm_mod ppdev
> crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
> ablk_helper cryptd pcspkr i6300esb virtio_balloon cirrus syscopyarea 
> sysfillrect
> sysimgblt ttm drm_kms_helper drm i2c_piix4 i2c_core parport_pc parport
> binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache
> jbd2 virtio_blk virtio_console crct10dif_pclmul crct10dif_common crc32c_intel
> serio_raw ixgbevf virtio_pci virtio_ring virtio ata_generic pata_acpi 
> ata_piix libata
> floppy
> [85502.618210] CPU: 0 PID: 2649 Comm: scm Tainted: G   OEL 
> 
> 3.10.0-327.18.2.el7.x86_64 #1
> [85502.620482] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [85502.622361] task: 880405da5080 ti: 8800db68c000 task.ti:
> 8800db68c000
> [85502.624400] RIP: 0010:[]  []
> _raw_spin_lock_bh+0x3d/0x50
> [85502.626558] RSP: 0018:8800db68fa68  EFLAGS: 0202
> [85502.628412] RAX: 56f6 RBX: 880406600800 RCX:
> 712a
> [85502.630419] RDX: 712c RSI: 712c RDI:
> a03f8548
> [85502.632423] RBP: 8800db68fa70 R08: 02c0 R09:
> 0500
> [85502.634421] R10: 88040f001500 R11: 8800db68fc58 R12: 
> 81519be1
> [85502.636413] R13: 8800db68fa08 R14:  R15:
> 0240
> [85502.638415] FS:  () GS:88041fc0(0063)
> knlGS:d1495b40
> [85502.640514] CS:  0010 DS: 002b ES: 002b CR0: 80050033
> [85502.642399] CR2: d37c8208 CR3: 0003cfff7000 CR4:
> 000406f0
> [85502.644420] DR0:  DR1:  DR2:
> 
> [85502.646416] DR3:  DR6: 0ff0 DR7:
> 0400
> [85502.648393] Stack:
> [85502.649916]  a03f8548 8800db68fa90 a03e2afb
> 880036242c00
> [85502.652019]  880406600800 8800db68fac0 a03e5c1b
> 880036242c00
> [85502.654108]  880036b5a484 88003698e800 8800db68fb30
> 8800db68fba8
> [85502.656179] Call Trace:
> [85502.657742]  [] tipc_bearer_blocked+0x1b/0x30 [tipc]
> [85502.659678]  [] link_send_buf_fast+0x5b/0xb0 [tipc]
> [85502.661592]  [] tipc_link_send_sections_fast+0xe6/0x630
> [tipc]
> [85502.663594]  [] ? _raw_read_unlock_bh+0x16/0x20
> [85502.665473]  [] ? tipc_nametbl_translate+0xc0/0x1f0 
> [tipc]
> [85502.667441]  [] tipc_send2name+0x155/0x1d0 [tipc]
> [85502.669351]  [] send_msg+0x1eb/0x530 [tipc]
> [85502.671205]  [] sock_sendmsg+0xb0/0xf0
> [85502.673028]  [] ? futex_wait+0x193/0x280
> [85502.674859]  [] SYSC_sendto+0x121/0x1c0
> [85502.676676]  [] ? __do_page_fault+0x16d/0x450
> [85502.678540]  [] ? poll_select_copy_remaining+0x62/0x130
> [85502.680489]  [] SyS_sendto+0xe/0x10
> [85502.682277]  [] compat_sys_socketcall+0x173/0x2a0
> [85502.684193]  [] sysenter_dispatch+0x7/0x21
> [85502.686046] Code: b8 00 00 02 00 f0 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 
> 5b 5d
> c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 0f b7 0b 66 39 ca 74 ea f3 90 <83> e8 01 
> 75 f1 48
> 89 df 66 66 66 90 66 66 90 eb e0 66 90 66 66
> [85503.372189] BUG: soft lockup - CPU#2 stuck for 23s! [nodemgr:2560]
> [85503.374181] Modules linked in: iptable_filter ip6table_mangle xt_limit
> iptable_mangle ip6table_filter ip6_tables igb_uio(OE) 

Re: [tipc-discussion] v4.7: soft lockup when releasing a socket

2016-11-21 Thread John Thompson
Hi Partha,

My test has 4 nodes, 2 of which are alternately rebooting.  When the
rebooted node rejoins, a few minutes pass and then the other node is
rebooted.
I am not printing out link stats, and I believe that the other code is not
doing so either when nodes leave or rejoin.

JT


On Tue, Nov 22, 2016 at 2:22 AM, Parthasarathy Bhuvaragan <
parthasarathy.bhuvara...@ericsson.com> wrote:

> Hi,
>
> There is an other branch where softlockup for nametbl_lock occurs.
>
> tipc_named_rcv() Grabs nametbl_lock
>   tipc_update_nametbl() (publish/withdraw)
> tipc_node_subscribe()/unsubscribe()
>   tipc_node_write_unlock()
>  << lockup occurs if it needs to process NODE UP/DOWN LINK
> UP/DOWN, as it grabs nametbl_lock again >>
>
> /Partha
>
>
> On 11/21/2016 01:04 PM, Parthasarathy Bhuvaragan wrote:
>
>> Hi,
>>
>> tipc_nametbl_withdraw() triggers the softlockup as it tries to grab
>> nametbl_lock twice if the node triggered a TIPC_NOTIFY_LINK_DOWN event
>> while it is running. The erroneous call chain is:
>>
>>   tipc_nametbl_withdraw() Grab nametbl_lock
>> tipc_named_process_backlog()
>>   tipc_update_nametbl()
>> if (dtype == WITHDRAWAL) tipc_node_unsubscribe()
>>   tipc_node_write_unlock()
>> if (flags & TIPC_NOTIFY_LINK_DOWN) tipc_nametbl_withdraw()
>>    spin_lock_bh(&tn->nametbl_lock);  << Soft Lockup >>
>>
>> Three callers which can cause this under module exit:
>>
>> Case1:
>>   tipc_exit_net()
>> tipc_nametbl_withdraw() Grab nametbl_lock
>>
>> Case2:
>>   tipc_server_stop()
>> tipc_conn_kref_release
>>   tipc_sock_release
>> sock_release()
>>   tipc_release()
>> tipc_sk_withdraw()
>>   tipc_nametbl_withdraw()
>>
>> Case3:
>>   tipc_server_stop()
>> tipc_conn_kref_release()
>>   kernel_bind()
>> tipc_bind()
>>   tipc_sk_withdraw()
>> tipc_nametbl_withdraw()
>>
>> I will work on a solution for this.
>>
>> What kind of test were you performing when this occurred (linkup/down)?
>> Do you read link statistics periodically in your tests?
>>
>> /Partha
>>
>> On 11/21/2016 05:30 AM, John Thompson wrote:
>>
>>> Hi Partha,
>>>
>>> I was doing some more testing today and have still observed the
>>> problem (contrary to what I had emailed earlier).
>>>
>>> Here is the kernel dump.
>>>
>>> <0>NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [Pluggable
>>> Serve:2221]
>>> <6>Modules linked in: tipc jitterentropy_rng echainiv drbg
>>> platform_driver(O)
>>> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P   O
>>> <6>task: ae54ced0 ti: aec42000 task.ti: aec42000
>>> <6>NIP: 8069257c LR: c13ebf50 CTR: 80692540
>>> <6>REGS: aec43ad0 TRAP: 0901   Tainted: P   O
>>> <6>MSR: 00029002   CR: 48002444  XER: 
>>> <6>
>>> <6>GPR00: c13ea408 aec43b80 ae54ced0 a624690c  a6271d84 a39a60cc
>>> fffd
>>> <6>GPR08: aeefbbc8 0001 0001 0004 80692540
>>> <6>NIP [8069257c] _raw_spin_lock_bh+0x3c/0x70
>>> <6>LR [c13ebf50] tipc_nametbl_unsubscribe+0x50/0x120 [tipc]
>>> <6>Call Trace:
>>> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable)
>>> <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc]
>>> <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc]
>>> <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc]
>>> <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc]
>>> <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc]
>>> <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc]
>>> <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc]
>>> <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc]
>>> <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc]
>>> <6>[aec43d30] [c13f4848] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc]
>>>
>> TIPC_CMD_SHOW_LINK_STATS or TIPC_NL_LINK_GET
>>
>>> <6>[aec43d70] [804f29e0] sock_release+0x30/0xf0
>>> <6>[aec43d80] [804f2ab4] sock_close+0x14/0x30
>>> <6>[aec43d90] [80105844] __fput+0x94/0x200
>>> <6>[aec43db0] [8003dca4] task_work_run+0xd4/0x100
>>> <6>[aec43dd0] [80023620] do_exit+0x280/0x980
>>> <6>[aec43e10] [80024c48] do_group_exit+0x48/0xb0
>>> <6>[aec43e30] [80030344] get_signal+0x244/0x4f0
>>> <6>[aec43e80] [80007734] do_signal+0x34/0x1c0
>>> <6>[aec43f30] [800079a8] do_notify_resume+0x68/0x80
>>> <6>[aec43f40] [8000fa1c] do_user_signal+0x74/0xc4
>>> <6>--- interrupt: c00 at 0xf4f3d08
>>> <6>LR = 0xf4f3ce8
>>> <6>Instruction dump:
>>> <6>912a0008 3941 7d201828 2c09 40820010 7d40192d 40a2fff0
>>> 7c2004ac
>>> <6>2f89 4dbe0020 7c210b78 8123 <2f89> 40befff4 7c421378
>>> 7d201828
>>> <0>Kernel panic - not syncing: softlockup: hung tasks
>>> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P   O L
>>> <6>Call Trace:
>>> <6>[aec43930] [80694e20] dump_stack+0x84/0xb0 (unreliable)
>>> <6>[aec43940] [80692ca8] panic+0xd8/0x214
>>> <6>[aec439a0] [800a0258] 

Re: [tipc-discussion] soft lockup for TIPC

2016-11-21 Thread Jon Maloy
Hi Xiang,
Although the version you are using has the same number (I am planning to step 
it to 3.0.0 soon) as the current one in the latest kernels (4.x), it is a very 
different species indeed. Almost all code has been rewritten, and in some cases 
more than once. I would strongly suggest you upgrade to a more recent kernel if 
ever possible, as it is very hard for us to maintain TIPC in 3.x kernels. 

If that is impossible for you, we will have to look into what options we have.

Regards
///jon


> -Original Message-
> From: XIANG Haiming [mailto:haiming.xi...@alcatel-sbell.com.cn]
> Sent: Sunday, 20 November, 2016 21:35
> To: tipc-discussion@lists.sourceforge.net
> Subject: [tipc-discussion] soft lockup for TIPC
> 
> Hi all,
> 
> The TIPC version we use is 2.0.0.
> The OS info is as follows:
> 
> Red Hat Enterprise Linux Server 7.2 (Maipo)
> Kernel 3.10.0-327.18.2.el7.x86_64 on an x86_64
> 
> We have encountered two soft lockup issues with TIPC. Please help us solve them.
> Thank you
> 
> One issue is as follow:
> 
> 
> [85502.601198] BUG: soft lockup - CPU#0 stuck for 22s! [scm:2649]
> [85502.603585] Modules linked in: iptable_filter ip6table_mangle xt_limit
> iptable_mangle ip6table_filter ip6_tables igb_uio(OE) uio tipc(OE) 8021q garp 
> stp
> mrp llc bonding dm_mirror dm_region_hash dm_log dm_mod ppdev
> crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
> ablk_helper cryptd pcspkr i6300esb virtio_balloon cirrus syscopyarea 
> sysfillrect
> sysimgblt ttm drm_kms_helper drm i2c_piix4 i2c_core parport_pc parport
> binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache
> jbd2 virtio_blk virtio_console crct10dif_pclmul crct10dif_common crc32c_intel
> serio_raw ixgbevf virtio_pci virtio_ring virtio ata_generic pata_acpi 
> ata_piix libata
> floppy
> [85502.618210] CPU: 0 PID: 2649 Comm: scm Tainted: G   OEL 
> 
> 3.10.0-327.18.2.el7.x86_64 #1
> [85502.620482] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [85502.622361] task: 880405da5080 ti: 8800db68c000 task.ti:
> 8800db68c000
> [85502.624400] RIP: 0010:[]  []
> _raw_spin_lock_bh+0x3d/0x50
> [85502.626558] RSP: 0018:8800db68fa68  EFLAGS: 0202
> [85502.628412] RAX: 56f6 RBX: 880406600800 RCX:
> 712a
> [85502.630419] RDX: 712c RSI: 712c RDI:
> a03f8548
> [85502.632423] RBP: 8800db68fa70 R08: 02c0 R09:
> 0500
> [85502.634421] R10: 88040f001500 R11: 8800db68fc58 R12: 
> 81519be1
> [85502.636413] R13: 8800db68fa08 R14:  R15:
> 0240
> [85502.638415] FS:  () GS:88041fc0(0063)
> knlGS:d1495b40
> [85502.640514] CS:  0010 DS: 002b ES: 002b CR0: 80050033
> [85502.642399] CR2: d37c8208 CR3: 0003cfff7000 CR4:
> 000406f0
> [85502.644420] DR0:  DR1:  DR2:
> 
> [85502.646416] DR3:  DR6: 0ff0 DR7:
> 0400
> [85502.648393] Stack:
> [85502.649916]  a03f8548 8800db68fa90 a03e2afb
> 880036242c00
> [85502.652019]  880406600800 8800db68fac0 a03e5c1b
> 880036242c00
> [85502.654108]  880036b5a484 88003698e800 8800db68fb30
> 8800db68fba8
> [85502.656179] Call Trace:
> [85502.657742]  [] tipc_bearer_blocked+0x1b/0x30 [tipc]
> [85502.659678]  [] link_send_buf_fast+0x5b/0xb0 [tipc]
> [85502.661592]  [] tipc_link_send_sections_fast+0xe6/0x630
> [tipc]
> [85502.663594]  [] ? _raw_read_unlock_bh+0x16/0x20
> [85502.665473]  [] ? tipc_nametbl_translate+0xc0/0x1f0 
> [tipc]
> [85502.667441]  [] tipc_send2name+0x155/0x1d0 [tipc]
> [85502.669351]  [] send_msg+0x1eb/0x530 [tipc]
> [85502.671205]  [] sock_sendmsg+0xb0/0xf0
> [85502.673028]  [] ? futex_wait+0x193/0x280
> [85502.674859]  [] SYSC_sendto+0x121/0x1c0
> [85502.676676]  [] ? __do_page_fault+0x16d/0x450
> [85502.678540]  [] ? poll_select_copy_remaining+0x62/0x130
> [85502.680489]  [] SyS_sendto+0xe/0x10
> [85502.682277]  [] compat_sys_socketcall+0x173/0x2a0
> [85502.684193]  [] sysenter_dispatch+0x7/0x21
> [85502.686046] Code: b8 00 00 02 00 f0 0f c1 03 89 c2 c1 ea 10 66 39 c2 75 03 
> 5b 5d
> c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 0f b7 0b 66 39 ca 74 ea f3 90 <83> e8 01 
> 75 f1 48
> 89 df 66 66 66 90 66 66 90 eb e0 66 90 66 66
> [85503.372189] BUG: soft lockup - CPU#2 stuck for 23s! [nodemgr:2560]
> [85503.374181] Modules linked in: iptable_filter ip6table_mangle xt_limit
> iptable_mangle ip6table_filter ip6_tables igb_uio(OE) uio tipc(OE) 8021q garp 
> stp
> mrp llc bonding dm_mirror dm_region_hash dm_log dm_mod ppdev
> crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
> ablk_helper cryptd pcspkr i6300esb virtio_balloon cirrus syscopyarea 
> sysfillrect
> sysimgblt ttm drm_kms_helper drm i2c_piix4 i2c_core parport_pc parport
> binfmt_misc nfsd auth_rpcgss 

[tipc-discussion] [PATCH net v2 2/2] tipc: fix nametbl_lock soft lockup at module exit

2016-11-21 Thread Parthasarathy Bhuvaragan
Commit 333f796235a527 ("tipc: fix a race condition leading to
subscriber refcnt bug") reveals a soft lockup while acquiring
nametbl_lock.

Before commit 333f796235a527, we call tipc_conn_shutdown() from
tipc_close_conn() in the context of tipc_topsrv_stop(). In that
context, we are allowed to grab the nametbl_lock.

In commit 333f796235a527, I moved the tipc_conn_release (renamed from
tipc_conn_shutdown) to the connection refcount cleanup. This allows
either tipc_nametbl_withdraw() or tipc_topsrv_stop() to perform
tipc_sock_release().

Since tipc_exit_net() first calls tipc_topsrv_stop() and then
tipc_nametbl_withdraw(), the chances increase for the latter to
perform the connection cleanup.

The soft lockup occurs in the call chain of tipc_nametbl_withdraw(),
when it performs tipc_conn_kref_release() and tries to grab
nametbl_lock again while already holding it.
tipc_nametbl_withdraw() grabs nametbl_lock
  tipc_nametbl_remove_publ()
tipc_subscrp_report_overlap()
  tipc_subscrp_send_event()
tipc_conn_sendmsg()
  << if (con->flags != CF_CONNECTED) we do conn_put(),
 triggering the cleanup as refcount=0. >>
  tipc_conn_kref_release
tipc_sock_release
  tipc_conn_release
tipc_subscrb_delete
  tipc_subscrp_delete
tipc_nametbl_unsubscribe << Soft Lockup >>

Until now, tipc_server_stop() grabs and releases the idr_lock twice
for every connection: once to find the connection, and a second time
to unset the connection flag, remove it from the list and decrement
the refcount. The above lockup occurs when tipc_nametbl_withdraw()
grabs the connection in between the two, thus owning the connection
and triggering the cleanup when it decrements the refcount later.

In this commit, we:
- call tipc_nametbl_withdraw() before tipc_topsrv_stop(). This avoids:
  1. the soft lockup
  2. notifying the subscribers too late, as the topology server might
     already have started shutting down.
- in tipc_server_stop(), remove all connections from the connection
  list within the scope of the idr_lock, so that no other thread can
  find a connection whose CF_CONNECTED flag has already been cleared.

Fixes: 333f796235a52727 ("tipc: fix a race condition leading to subscriber 
refcnt bug")
Signed-off-by: Parthasarathy Bhuvaragan 
---
v2: commit message update.
cleanup tipc_server_stop() as per Ying Xue.
---
 net/tipc/core.c   |  4 
 net/tipc/net.c|  2 --
 net/tipc/server.c | 13 -
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/net/tipc/core.c b/net/tipc/core.c
index 236b043a4156..3ea1452e2f06 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -93,6 +93,10 @@ static int __net_init tipc_init_net(struct net *net)
 
 static void __net_exit tipc_exit_net(struct net *net)
 {
+   struct tipc_net *tn = net_generic(net, tipc_net_id);
+
+   tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0,
+ tn->own_addr);
tipc_topsrv_stop(net);
tipc_net_stop(net);
tipc_bcast_stop(net);
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 28bf4feeb81c..92696cc6e763 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -130,8 +130,6 @@ void tipc_net_stop(struct net *net)
if (!tn->own_addr)
return;
 
-   tipc_nametbl_withdraw(net, TIPC_CFG_SRV, tn->own_addr, 0,
- tn->own_addr);
rtnl_lock();
tipc_bearer_stop(net);
tipc_node_stop(net);
diff --git a/net/tipc/server.c b/net/tipc/server.c
index 215849ce453d..54c23ecccd26 100644
--- a/net/tipc/server.c
+++ b/net/tipc/server.c
@@ -623,16 +623,19 @@ int tipc_server_start(struct tipc_server *s)
 void tipc_server_stop(struct tipc_server *s)
 {
struct tipc_conn *con;
-   int total = 0;
int id;
 
spin_lock_bh(&s->idr_lock);
-   for (id = 0; total < s->idr_in_use; id++) {
+   for (id = 0; s->idr_in_use; id++) {
con = idr_find(&s->conn_idr, id);
-   if (con) {
-   total++;
+   if (con && test_and_clear_bit(CF_CONNECTED, &con->flags)) {
+   idr_remove(&s->conn_idr, con->conid);
+   s->idr_in_use--;
+   /* release spinlock as kernel_sock_shutdown can sleep.
+*/
spin_unlock_bh(&s->idr_lock);
-   tipc_close_conn(con);
+   kernel_sock_shutdown(con->sock, SHUT_RDWR);
+   conn_put(con);
spin_lock_bh(&s->idr_lock);
}
}
-- 
2.1.4




[tipc-discussion] [PATCH net v2 1/2] tipc: fix nametbl_lock soft lockup at node/link events

2016-11-21 Thread Parthasarathy Bhuvaragan
We trigger a soft lockup as we grab nametbl_lock twice if the node
has a pending node up/down or link up/down event while:
- we process an incoming named message in tipc_named_rcv() and
  perform a tipc_update_nametbl().
- we have pending backlog items in the name distributor queue
  during a nametable update using tipc_nametbl_publish() or
  tipc_nametbl_withdraw().

The following are the call chain associated:
tipc_named_rcv() Grabs nametbl_lock
   tipc_update_nametbl() (publish/withdraw)
 tipc_node_subscribe()/unsubscribe()
   tipc_node_write_unlock()
  << lockup occurs if an outstanding node/link event
     exists, as we grab nametbl_lock again >>

tipc_nametbl_withdraw() Grab nametbl_lock
  tipc_named_process_backlog()
tipc_update_nametbl()
  << rest as above >>

The function tipc_node_write_unlock(), in addition to releasing the
lock, processes the outstanding node/link up/down events. To do this,
we need to grab the nametbl_lock again, leading to the lockup.

In this commit we fix the soft lockup by introducing a fast variant of
node_unlock(), where we just release the lock. We adapt the
node_subscribe()/node_unsubscribe() to use the fast variants.

Signed-off-by: Parthasarathy Bhuvaragan 
---
 net/tipc/node.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/tipc/node.c b/net/tipc/node.c
index 9d2f4c2b08ab..27753325e06e 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -263,6 +263,11 @@ static void tipc_node_write_lock(struct tipc_node *n)
write_lock_bh(&n->lock);
 }
 
+static void tipc_node_write_unlock_fast(struct tipc_node *n)
+{
+   write_unlock_bh(&n->lock);
+}
+
 static void tipc_node_write_unlock(struct tipc_node *n)
 {
struct net *net = n->net;
@@ -417,7 +422,7 @@ void tipc_node_subscribe(struct net *net, struct list_head 
*subscr, u32 addr)
}
tipc_node_write_lock(n);
list_add_tail(subscr, &n->publ_list);
-   tipc_node_write_unlock(n);
+   tipc_node_write_unlock_fast(n);
tipc_node_put(n);
 }
 
@@ -435,7 +440,7 @@ void tipc_node_unsubscribe(struct net *net, struct 
list_head *subscr, u32 addr)
}
tipc_node_write_lock(n);
list_del_init(subscr);
-   tipc_node_write_unlock(n);
+   tipc_node_write_unlock_fast(n);
tipc_node_put(n);
 }
 
-- 
2.1.4




Re: [tipc-discussion] [PATCH net-next 3/3] tipc: reduce risk of user starvation during link congestion

2016-11-21 Thread Jon Maloy
I am having some new doubts about our current link congestion criteria. See 
below.

///jon


> -Original Message-
> From: Jon Maloy [mailto:jon.ma...@ericsson.com]
> Sent: Monday, 21 November, 2016 09:57
> To: tipc-discussion@lists.sourceforge.net; Parthasarathy Bhuvaragan
> ; Ying Xue
> ; Jon Maloy 
> Cc: ma...@donjonn.com; thompa@gmail.com
> Subject: [PATCH net-next 3/3] tipc: reduce risk of user starvation during link
> congestion
> 
> The socket code currently handles link congestion by either blocking
> and trying to send again when the congestion has abated, or just
> returning to the user with -EAGAIN and let him re-try later.
> 
> This mechanism is prone to starvation, because the wakeup algorithm is
> non-atomic. During the time the link issues a wakeup signal, until the
> socket wakes up and re-attempts sending, other senders may have come
> in between and occupied the free buffer space in the link. This in turn
> may lead to a socket having to make many send attempts before it is
> successful. In extremely loaded systems we have observed latency times
> of several seconds before a low-priority socket is able to send out a
> message.
> 
> In this commit, we simplify this mechanism and reduce the risk of the
> described scenario happening. When a message is sent to a congested
> link, we now let it keep the message in the wakeup-item that it has to
> create anyway, and immediately add it to the link's send queue when
> enough space has been freed up. Only when this is done do we issue a
> wakeup signal to the socket, which can now immediately go on and send
> the next message, if any.
> 
> The fact that a socket now can consider a message sent even when the
> link returns a congestion code means that the sending socket code can
> be simplified. Also, since this is a good opportunity to get rid of the
> obsolete 'mtu change' condition in the three socket send functions, we
> now choose to refactor those functions completely.
> 
> Signed-off-by: Jon Maloy 
> ---
>  net/tipc/bcast.c  |   2 +-
>  net/tipc/link.c   |  90 --
>  net/tipc/msg.h|   8 +-
>  net/tipc/node.c   |   2 +-
>  net/tipc/socket.c | 346 
> --
>  5 files changed, 209 insertions(+), 239 deletions(-)
> 
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index aa1babb..1a56cab 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -174,7 +174,7 @@ static void tipc_bcbase_xmit(struct net *net, struct
> sk_buff_head *xmitq)
>   *and to identified node local sockets
>   * @net: the applicable net namespace
>   * @list: chain of buffers containing message
> - * Consumes the buffer chain, except when returning -ELINKCONG
> + * Consumes the buffer chain.
>   * Returns 0 if success, otherwise errno: -ELINKCONG,-EHOSTUNREACH,-
> EMSGSIZE
>   */
>  int tipc_bcast_xmit(struct net *net, struct sk_buff_head *list)
> diff --git a/net/tipc/link.c b/net/tipc/link.c
> index 1055164..db3dcab 100644
> --- a/net/tipc/link.c
> +++ b/net/tipc/link.c
> @@ -774,60 +774,77 @@ int tipc_link_timeout(struct tipc_link *l, struct
> sk_buff_head *xmitq)
>   return rc;
>  }
> 
> +static bool tipc_link_congested(struct tipc_link *l, int imp)
> +{
> + int i;
> +
> + if (skb_queue_empty(&l->backlogq))
> + return false;
> +
> + /* Match msg importance against this and all higher backlog limits: */
> + for (i = imp; i <= TIPC_SYSTEM_IMPORTANCE; i++) {
> + if (unlikely(l->backlog[i].len >= l->backlog[i].limit))
> + return true;
> + }
> + return false;
> +}

After introducing this patch I ran into a problem with spontaneous and 
premature connection disconnects in the ptty test program.
After some troubleshooting I realized that with the new algorithm, 
high-priority shutdown messages sometimes bypass messages which are reported 
"delivered" but are still waiting in the wakeup pending queue. This was easy to 
fix, by adding a potentially blocking tipc_wait_for_cond() call in 
__tipc_shutdown().

But I also realized that we may have (and always have had) another starvation 
scenario, which is not addressed by this patch. Gergely's algorithm (above) 
ensures that a socket cannot be starved by lower- or equal-priority users, but 
starvation may still be caused by higher-priority users. While a lower-prio 
message is waiting in the wakeup queue, higher-prio users may drive their own 
levels into congestion, potentially preventing the lower-prio user from sending 
its message for a long time.
After the current patch, wouldn't it be better to go back to the algorithm I 
originally suggested, where we only check the sender's own level and nothing 
else?
At a given level, messages would then be delivered in strict 
first-come-first-served fashion, and no starvation could occur.
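
To make the difference concrete, a small standalone userspace sketch 
(illustration only, not TIPC kernel code; the importance names mirror the 
kernel enum, the backlog numbers are invented) contrasting the check used in 
this patch with the own-level-only check:

#include <stdbool.h>
#include <stdio.h>

enum { TIPC_LOW_IMPORTANCE, TIPC_MEDIUM_IMPORTANCE, TIPC_HIGH_IMPORTANCE,
       TIPC_CRITICAL_IMPORTANCE, TIPC_SYSTEM_IMPORTANCE, IMP_LEVELS };

struct backlog { int len; int limit; };

/* check in this patch: congested if this or ANY higher level is over limit */
static bool congested_any_higher(const struct backlog *b, int imp)
{
	int i;

	for (i = imp; i <= TIPC_SYSTEM_IMPORTANCE; i++)
		if (b[i].len >= b[i].limit)
			return true;
	return false;
}

/* suggested alternative: look only at the sender's own level */
static bool congested_own_level(const struct backlog *b, int imp)
{
	return b[imp].len >= b[imp].limit;
}

int main(void)
{
	/* only HIGH is over its limit here */
	struct backlog b[IMP_LEVELS] = {
		{ 10, 50 }, { 20, 100 }, { 160, 150 }, { 5, 300 }, { 0, 400 }
	};

	/* any-higher blocks a MEDIUM sender (prints 1), own-level does not (0) */
	printf("any-higher: %d  own-level: %d\n",
	       congested_any_higher(b, TIPC_MEDIUM_IMPORTANCE),
	       congested_own_level(b, TIPC_MEDIUM_IMPORTANCE));
	return 0;
}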

What do you think?

///jon

[tipc-discussion] [PATCH net-next 3/3] tipc: reduce risk of user starvation during link congestion

2016-11-21 Thread Jon Maloy
The socket code currently handles link congestion by either blocking
and trying to send again when the congestion has abated, or just
returning to the user with -EAGAIN and let him re-try later.

This mechanism is prone to starvation, because the wakeup algorithm is
non-atomic. During the time the link issues a wakeup signal, until the
socket wakes up and re-attempts sending, other senders may have come
in between and occupied the free buffer space in the link. This in turn
may lead to a socket having to make many send attempts before it is
successful. In extremely loaded systems we have observed latency times
of several seconds before a low-priority socket is able to send out a
message.

In this commit, we simplify this mechanism and reduce the risk of the
described scenario happening. When a message is sent to a congested
link, we now let it keep the message in the wakeup-item that it has to
create anyway, and immediately add it to the link's send queue when
enough space has been freed up. Only when this is done do we issue a
wakeup signal to the socket, which can now immediately go on and send
the next message, if any.

The fact that a socket now can consider a message sent even when the
link returns a congestion code means that the sending socket code can
be simplified. Also, since this is a good opportunity to get rid of the
obsolete 'mtu change' condition in the three socket send functions, we
now choose to refactor those functions completely.

Signed-off-by: Jon Maloy 
---
 net/tipc/bcast.c  |   2 +-
 net/tipc/link.c   |  90 --
 net/tipc/msg.h|   8 +-
 net/tipc/node.c   |   2 +-
 net/tipc/socket.c | 346 --
 5 files changed, 209 insertions(+), 239 deletions(-)

diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
index aa1babb..1a56cab 100644
--- a/net/tipc/bcast.c
+++ b/net/tipc/bcast.c
@@ -174,7 +174,7 @@ static void tipc_bcbase_xmit(struct net *net, struct 
sk_buff_head *xmitq)
  *and to identified node local sockets
  * @net: the applicable net namespace
  * @list: chain of buffers containing message
- * Consumes the buffer chain, except when returning -ELINKCONG
+ * Consumes the buffer chain.
  * Returns 0 if success, otherwise errno: -ELINKCONG,-EHOSTUNREACH,-EMSGSIZE
  */
 int tipc_bcast_xmit(struct net *net, struct sk_buff_head *list)
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 1055164..db3dcab 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -774,60 +774,77 @@ int tipc_link_timeout(struct tipc_link *l, struct 
sk_buff_head *xmitq)
return rc;
 }
 
+static bool tipc_link_congested(struct tipc_link *l, int imp)
+{
+   int i;
+
+   if (skb_queue_empty(&l->backlogq))
+   return false;
+
+   /* Match msg importance against this and all higher backlog limits: */
+   for (i = imp; i <= TIPC_SYSTEM_IMPORTANCE; i++) {
+   if (unlikely(l->backlog[i].len >= l->backlog[i].limit))
+   return true;
+   }
+   return false;
+}
+
 /**
  * link_schedule_user - schedule a message sender for wakeup after congestion
- * @link: congested link
+ * @l: congested link
  * @list: message that was attempted sent
  * Create pseudo msg to send back to user when congestion abates
- * Does not consume buffer list
+ * Consumes buffer list
  */
-static int link_schedule_user(struct tipc_link *link, struct sk_buff_head 
*list)
+static int link_schedule_user(struct tipc_link *l, struct sk_buff_head *list)
 {
-   struct tipc_msg *msg = buf_msg(skb_peek(list));
-   int imp = msg_importance(msg);
-   u32 oport = msg_origport(msg);
-   u32 addr = tipc_own_addr(link->net);
+   struct tipc_msg *hdr = buf_msg(skb_peek(list));
+   int imp = msg_importance(hdr);
+   u32 oport = msg_origport(hdr);
+   u32 dnode = tipc_own_addr(l->net);
struct sk_buff *skb;
+   struct sk_buff_head *pkts;
 
/* This really cannot happen...  */
if (unlikely(imp > TIPC_CRITICAL_IMPORTANCE)) {
-   pr_warn("%s<%s>, send queue full", link_rst_msg, link->name);
+   pr_warn("%s<%s>, send queue full", link_rst_msg, l->name);
return -ENOBUFS;
}
-   /* Non-blocking sender: */
-   if (TIPC_SKB_CB(skb_peek(list))->wakeup_pending)
-   return -ELINKCONG;
 
/* Create and schedule wakeup pseudo message */
skb = tipc_msg_create(SOCK_WAKEUP, 0, INT_H_SIZE, 0,
- addr, addr, oport, 0, 0);
+ dnode, l->addr, oport, 0, 0);
if (!skb)
return -ENOBUFS;
-   TIPC_SKB_CB(skb)->chain_sz = skb_queue_len(list);
-   TIPC_SKB_CB(skb)->chain_imp = imp;
-   skb_queue_tail(&link->wakeupq, skb);
-   link->stats.link_congs++;
+   msg_set_dest_droppable(buf_msg(skb), true);
+   skb_queue_tail(&l->wakeupq, skb);
+
+   /* Keep the packet chain until we can send it */

[tipc-discussion] [PATCH net-next 2/3] tipc: modify struct tipc_plist to be more versatile

2016-11-21 Thread Jon Maloy
During multicast reception we currently use a simple linked list with
push/pop semantics to store port numbers.

We now see a need for a more generic list for storing values of type
u32. We therefore make some modifications to this list, while replacing
the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new
functions which will come to use in the next commits.
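
For clarity, a minimal usage sketch of the new helpers (a hypothetical caller,
not part of this patch; 'ports' is just an example list head):

	LIST_HEAD(ports);
	u32 port;

	u32_push(&ports, 100);		/* adds 100 */
	u32_push(&ports, 200);		/* adds 200 */
	u32_push(&ports, 100);		/* duplicate: returns false, list unchanged */

	if (u32_find(&ports, 200))	/* true */
		u32_del(&ports, 200);	/* unlinks and frees the entry */

	while ((port = u32_pop(&ports)))
		pr_info("port %u\n", port);

	u32_list_purge(&ports);		/* frees anything still on the list */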

Signed-off-by: Jon Maloy 
---
 net/tipc/name_table.c | 100 --
 net/tipc/name_table.h |  21 ---
 net/tipc/socket.c |   8 ++--
 3 files changed, 83 insertions(+), 46 deletions(-)

diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index e190460..5a86df1 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -608,7 +608,7 @@ u32 tipc_nametbl_translate(struct net *net, u32 type, u32 
instance,
  * Returns non-zero if any off-node ports overlap
  */
 int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
- u32 limit, struct tipc_plist *dports)
+ u32 limit, struct list_head *dports)
 {
struct name_seq *seq;
struct sub_seq *sseq;
@@ -633,7 +633,7 @@ int tipc_nametbl_mc_translate(struct net *net, u32 type, 
u32 lower, u32 upper,
info = sseq->info;
list_for_each_entry(publ, &info->node_list, node_list) {
if (publ->scope <= limit)
-   tipc_plist_push(dports, publ->ref);
+   u32_push(dports, publ->ref);
}
 
if (info->cluster_list_size != info->node_list_size)
@@ -1022,40 +1022,84 @@ int tipc_nl_name_table_dump(struct sk_buff *skb, struct 
netlink_callback *cb)
return skb->len;
 }
 
-void tipc_plist_push(struct tipc_plist *pl, u32 port)
+struct u32_item {
+   struct list_head list;
+   u32 value;
+};
+
+bool u32_find(struct list_head *l, u32 value)
 {
-   struct tipc_plist *nl;
+   struct u32_item *item;
 
-   if (likely(!pl->port)) {
-   pl->port = port;
-   return;
+   list_for_each_entry(item, l, list) {
+   if (item->value == value)
+   return true;
}
-   if (pl->port == port)
-   return;
-   list_for_each_entry(nl, &pl->list, list) {
-   if (nl->port == port)
-   return;
+   return false;
+}
+
+bool u32_push(struct list_head *l, u32 value)
+{
+   struct u32_item *item;
+
+   list_for_each_entry(item, l, list) {
+   if (item->value == value)
+   return false;
+   }
+   item = kmalloc(sizeof(*item), GFP_ATOMIC);
+   if (unlikely(!item))
+   return false;
+
+   item->value = value;
+   list_add(&item->list, l);
+   return true;
+}
+
+u32 u32_pop(struct list_head *l)
+{
+   struct u32_item *item;
+   u32 value = 0;
+
+   if (list_empty(l))
+   return 0;
+   item = list_first_entry(l, typeof(*item), list);
+   value = item->value;
+   list_del(&item->list);
+   kfree(item);
+   return value;
+}
+
+bool u32_del(struct list_head *l, u32 value)
+{
+   struct u32_item *item, *tmp;
+
+   list_for_each_entry_safe(item, tmp, l, list) {
+   if (item->value != value)
+   continue;
+   list_del(&item->list);
+   kfree(item);
+   return true;
}
-   nl = kmalloc(sizeof(*nl), GFP_ATOMIC);
-   if (nl) {
-   nl->port = port;
-   list_add(&nl->list, &pl->list);
+   return false;
+}
+
+void u32_list_purge(struct list_head *l)
+{
+   struct u32_item *item, *tmp;
+
+   list_for_each_entry_safe(item, tmp, l, list) {
+   list_del(&item->list);
+   kfree(item);
}
 }
 
-u32 tipc_plist_pop(struct tipc_plist *pl)
+int u32_list_len(struct list_head *l)
 {
-   struct tipc_plist *nl;
-   u32 port = 0;
+   struct u32_item *item;
+   int i = 0;
 
-   if (likely(list_empty(&pl->list))) {
-   port = pl->port;
-   pl->port = 0;
-   return port;
+   list_for_each_entry(item, l, list) {
+   i++;
}
-   nl = list_first_entry(&pl->list, typeof(*nl), list);
-   port = nl->port;
-   list_del(&nl->list);
-   kfree(nl);
-   return port;
+   return i;
 }
diff --git a/net/tipc/name_table.h b/net/tipc/name_table.h
index 1524a73..c89bb3f 100644
--- a/net/tipc/name_table.h
+++ b/net/tipc/name_table.h
@@ -99,7 +99,7 @@ int tipc_nl_name_table_dump(struct sk_buff *skb, struct 
netlink_callback *cb);
 
 u32 tipc_nametbl_translate(struct net *net, u32 type, u32 instance, u32 *node);
 int tipc_nametbl_mc_translate(struct net *net, u32 type, u32 lower, u32 upper,
- u32 limit, struct tipc_plist *dports);
+ u32 limit, struct list_head 

[tipc-discussion] [PATCH net-next 1/3] tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() functions

2016-11-21 Thread Jon Maloy
The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very
similar. The latter function is also called from two locations, and
there will be more in the coming commits, which will all need to test on
different conditions.

Instead of making yet another duplicate of the function, we now
introduce a new macro tipc_wait_for_cond() where the wakeup condition
can be stated as an argument to the call. This macro replaces all
current and future uses of the two functions, which can now be
eliminated.

Signed-off-by: Jon Maloy 
---
 net/tipc/socket.c | 108 +-
 1 file changed, 49 insertions(+), 59 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 4916d8f..f084430 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -110,7 +110,6 @@ static void tipc_write_space(struct sock *sk);
 static void tipc_sock_destruct(struct sock *sk);
 static int tipc_release(struct socket *sock);
 static int tipc_accept(struct socket *sock, struct socket *new_sock, int 
flags);
-static int tipc_wait_for_sndmsg(struct socket *sock, long *timeo_p);
 static void tipc_sk_timeout(unsigned long data);
 static int tipc_sk_publish(struct tipc_sock *tsk, uint scope,
   struct tipc_name_seq const *seq);
@@ -334,6 +333,49 @@ static int tipc_set_sk_state(struct sock *sk, int state)
return res;
 }
 
+static int tipc_sk_sock_err(struct socket *sock, long *timeout)
+{
+   struct sock *sk = sock->sk;
+   int err = sock_error(sk);
+   int typ = sock->type;
+
+   if (err)
+   return err;
+   if (typ == SOCK_STREAM || typ == SOCK_SEQPACKET) {
+   if (sk->sk_state == TIPC_DISCONNECTING)
+   return -EPIPE;
+   else if (!tipc_sk_connected(sk))
+   return -ENOTCONN;
+   } else if (sk->sk_shutdown & SEND_SHUTDOWN) {
+   return -EPIPE;
+   }
+   if (!*timeout)
+   return -EAGAIN;
+   if (signal_pending(current))
+   return sock_intr_errno(*timeout);
+
+   return 0;
+}
+
+#define tipc_wait_for_cond(sock_, timeout_, condition_)  \
+   ({  \
+   int rc_ = 0;\
+   int done_ = 0;  \
+   \
+   while (!(condition_) && !done_) {   \
+   struct sock *sk_ = sock->sk;\
+   DEFINE_WAIT_FUNC(wait_, woken_wake_function);   \
+   \
+   rc_ = tipc_sk_sock_err(sock_, timeout_);\
+   if (rc_)\
+   break;  \
+   prepare_to_wait(sk_sleep(sk_), &wait_, TASK_INTERRUPTIBLE);\
+   done_ = sk_wait_event(sk_, timeout_, (condition_), &wait_);\
+   remove_wait_queue(sk_sleep(sk_), &wait_);\
+   }   \
+   rc_;\
+})
+
 /**
  * tipc_sk_create - create a TIPC socket
  * @net: network namespace (must be default network)
@@ -719,7 +761,7 @@ static int tipc_sendmcast(struct  socket *sock, struct 
tipc_name_seq *seq,
 
if (rc == -ELINKCONG) {
tsk->link_cong = 1;
-   rc = tipc_wait_for_sndmsg(sock, &timeo);
+   rc = tipc_wait_for_cond(sock, &timeo, !tsk->link_cong);
if (!rc)
continue;
}
@@ -828,31 +870,6 @@ static void tipc_sk_proto_rcv(struct tipc_sock *tsk, 
struct sk_buff *skb,
kfree_skb(skb);
 }
 
-static int tipc_wait_for_sndmsg(struct socket *sock, long *timeo_p)
-{
-   DEFINE_WAIT_FUNC(wait, woken_wake_function);
-   struct sock *sk = sock->sk;
-   struct tipc_sock *tsk = tipc_sk(sk);
-   int done;
-
-   do {
-   int err = sock_error(sk);
-   if (err)
-   return err;
-   if (sk->sk_shutdown & SEND_SHUTDOWN)
-   return -EPIPE;
-   if (!*timeo_p)
-   return -EAGAIN;
-   if (signal_pending(current))
-   return sock_intr_errno(*timeo_p);
-
-   add_wait_queue(sk_sleep(sk), &wait);
-   done = sk_wait_event(sk, timeo_p, !tsk->link_cong, &wait);
-   remove_wait_queue(sk_sleep(sk), &wait);
-   } while (!done);
-   return 0;
-}
-
 /**
  * tipc_sendmsg - send message in connectionless manner
  * @sock: socket structure
@@ -968,7 +985,7 @@ static int 

[tipc-discussion] [PATCH net-next 0/3] tipc: improve interaction socket-link

2016-11-21 Thread Jon Maloy
We fix a very real starvation problem that may occur when the link level
runs into send buffer congestion. At the same time we make the 
interaction between the socket and link layer simpler and more 
consistent.

Jon Maloy (3):
  tipc: unify tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg()
functions
  tipc: modify struct tipc_plist to be more versatile
  tipc: reduce risk of user starvation during link congestion

 net/tipc/bcast.c  |   2 +-
 net/tipc/link.c   |  90 +-
 net/tipc/msg.h|   8 +-
 net/tipc/name_table.c | 100 +++
 net/tipc/name_table.h |  21 +--
 net/tipc/node.c   |   2 +-
 net/tipc/socket.c | 448 ++
 7 files changed, 334 insertions(+), 337 deletions(-)

-- 
2.7.4




Re: [tipc-discussion] v4.7: soft lockup when releasing a socket

2016-11-21 Thread Parthasarathy Bhuvaragan
Hi,

There is an other branch where softlockup for nametbl_lock occurs.

tipc_named_rcv() Grabs nametbl_lock
   tipc_update_nametbl() (publish/withdraw)
 tipc_node_subscribe()/unsubscribe()
   tipc_node_write_unlock()
  << lockup occurs if it needs to process NODE UP/DOWN LINK 
UP/DOWN, as it grabs nametbl_lock again >>

/Partha

On 11/21/2016 01:04 PM, Parthasarathy Bhuvaragan wrote:
> Hi,
>
> tipc_nametbl_withdraw() triggers the softlockup as it tries to grab
> nametbl_lock twice if the node triggered a TIPC_NOTIFY_LINK_DOWN event
> while it is running. The erroneous call chain is:
>
>   tipc_nametbl_withdraw() Grab nametbl_lock
> tipc_named_process_backlog()
>   tipc_update_nametbl()
> if (dtype == WITHDRAWAL) tipc_node_unsubscribe()
>   tipc_node_write_unlock()
> if (flags & TIPC_NOTIFY_LINK_DOWN) tipc_nametbl_withdraw()
>    spin_lock_bh(&tn->nametbl_lock);  << Soft Lockup >>
>
> Three callers which can cause this under module exit:
>
> Case1:
>   tipc_exit_net()
> tipc_nametbl_withdraw() Grab nametbl_lock
>
> Case2:
>   tipc_server_stop()
> tipc_conn_kref_release
>   tipc_sock_release
> sock_release()
>   tipc_release()
> tipc_sk_withdraw()
>   tipc_nametbl_withdraw()
>
> Case3:
>   tipc_server_stop()
> tipc_conn_kref_release()
>   kernel_bind()
> tipc_bind()
>   tipc_sk_withdraw()
> tipc_nametbl_withdraw()
>
> I will work on a solution for this.
>
> What kind of test were you performing when this occurred (linkup/down)?
> Do you read link statistics periodically in your tests?
>
> /Partha
>
> On 11/21/2016 05:30 AM, John Thompson wrote:
>> Hi Partha,
>>
>> I was doing some more testing today and have still observed the
>> problem (contrary to what I had emailed earlier).
>>
>> Here is the kernel dump.
>>
>> <0>NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [Pluggable
>> Serve:2221]
>> <6>Modules linked in: tipc jitterentropy_rng echainiv drbg
>> platform_driver(O)
>> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P   O
>> <6>task: ae54ced0 ti: aec42000 task.ti: aec42000
>> <6>NIP: 8069257c LR: c13ebf50 CTR: 80692540
>> <6>REGS: aec43ad0 TRAP: 0901   Tainted: P   O
>> <6>MSR: 00029002   CR: 48002444  XER: 
>> <6>
>> <6>GPR00: c13ea408 aec43b80 ae54ced0 a624690c  a6271d84 a39a60cc
>> fffd
>> <6>GPR08: aeefbbc8 0001 0001 0004 80692540
>> <6>NIP [8069257c] _raw_spin_lock_bh+0x3c/0x70
>> <6>LR [c13ebf50] tipc_nametbl_unsubscribe+0x50/0x120 [tipc]
>> <6>Call Trace:
>> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable)
>> <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc]
>> <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc]
>> <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc]
>> <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc]
>> <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc]
>> <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc]
>> <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc]
>> <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc]
>> <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc]
>> <6>[aec43d30] [c13f4848] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc]
> TIPC_CMD_SHOW_LINK_STATS or TIPC_NL_LINK_GET
>> <6>[aec43d70] [804f29e0] sock_release+0x30/0xf0
>> <6>[aec43d80] [804f2ab4] sock_close+0x14/0x30
>> <6>[aec43d90] [80105844] __fput+0x94/0x200
>> <6>[aec43db0] [8003dca4] task_work_run+0xd4/0x100
>> <6>[aec43dd0] [80023620] do_exit+0x280/0x980
>> <6>[aec43e10] [80024c48] do_group_exit+0x48/0xb0
>> <6>[aec43e30] [80030344] get_signal+0x244/0x4f0
>> <6>[aec43e80] [80007734] do_signal+0x34/0x1c0
>> <6>[aec43f30] [800079a8] do_notify_resume+0x68/0x80
>> <6>[aec43f40] [8000fa1c] do_user_signal+0x74/0xc4
>> <6>--- interrupt: c00 at 0xf4f3d08
>> <6>LR = 0xf4f3ce8
>> <6>Instruction dump:
>> <6>912a0008 3941 7d201828 2c09 40820010 7d40192d 40a2fff0 7c2004ac
>> <6>2f89 4dbe0020 7c210b78 8123 <2f89> 40befff4 7c421378
>> 7d201828
>> <0>Kernel panic - not syncing: softlockup: hung tasks
>> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P   O L
>> <6>Call Trace:
>> <6>[aec43930] [80694e20] dump_stack+0x84/0xb0 (unreliable)
>> <6>[aec43940] [80692ca8] panic+0xd8/0x214
>> <6>[aec439a0] [800a0258] watchdog_timer_fn+0x2d8/0x2e0
>> <6>[aec439f0] [8007ae58] __hrtimer_run_queues+0x118/0x1d0
>> <6>[aec43a30] [8007b608] hrtimer_interrupt+0xd8/0x270
>> <6>[aec43a80] [8000983c] __timer_interrupt+0xac/0x1b0
>> <6>[aec43aa0] [80009b70] timer_interrupt+0xb0/0xe0
>> <6>[aec43ac0] [8000f450] ret_from_except+0x0/0x18
>> <6>--- interrupt: 901 at _raw_spin_lock_bh+0x3c/0x70
>> <6>LR = tipc_nametbl_unsubscribe+0x50/0x120 [tipc]
>> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable)
>> <6>[aec43ba0] [c13ea408] 

Re: [tipc-discussion] v4.7: soft lockup when releasing a socket

2016-11-21 Thread Parthasarathy Bhuvaragan
Hi,

tipc_nametbl_withdraw() triggers the softlockup as it tries to grab 
nametbl_lock twice if the node triggered a TIPC_NOTIFY_LINK_DOWN event 
while it is running. The erroneous call chain is:

  tipc_nametbl_withdraw() Grab nametbl_lock
tipc_named_process_backlog()
  tipc_update_nametbl()
if (dtype == WITHDRAWAL) tipc_node_unsubscribe()
  tipc_node_write_unlock()
if (flags & TIPC_NOTIFY_LINK_DOWN) tipc_nametbl_withdraw()
   spin_lock_bh(&tn->nametbl_lock);  << Soft Lockup >>
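
A condensed userspace analogy of what happens above (illustration only, not 
TIPC code): re-acquiring a non-recursive lock within the same call chain never 
returns, which is what the soft-lockup watchdog then reports for nametbl_lock.

/* build: cc -pthread demo.c -o demo */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t nametbl_lock = PTHREAD_MUTEX_INITIALIZER;

static void nested_withdraw(void)
{
	pthread_mutex_lock(&nametbl_lock);	/* lock already held: hangs here */
	pthread_mutex_unlock(&nametbl_lock);
}

static void withdraw(void)
{
	pthread_mutex_lock(&nametbl_lock);	/* outer grab */
	nested_withdraw();			/* notification path re-enters */
	pthread_mutex_unlock(&nametbl_lock);
}

int main(void)
{
	withdraw();				/* never returns */
	puts("unreachable");
	return 0;
}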

Three callers which can cause this under module exit:

Case1:
  tipc_exit_net()
tipc_nametbl_withdraw() Grab nametbl_lock

Case2:
  tipc_server_stop()
tipc_conn_kref_release
  tipc_sock_release
sock_release()
  tipc_release()
tipc_sk_withdraw()
  tipc_nametbl_withdraw()

Case3:
  tipc_server_stop()
tipc_conn_kref_release()
  kernel_bind()
tipc_bind()
  tipc_sk_withdraw()
tipc_nametbl_withdraw()

I will work on a solution for this.

What kind of test were you performing when this occurred (linkup/down)?
Do you read link statistics periodically in your tests?

/Partha

On 11/21/2016 05:30 AM, John Thompson wrote:
> Hi Partha,
>
> I was doing some more testing today and have still observed the
> problem (contrary to what I had emailed earlier).
>
> Here is the kernel dump.
>
> <0>NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [Pluggable
> Serve:2221]
> <6>Modules linked in: tipc jitterentropy_rng echainiv drbg
> platform_driver(O)
> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P   O
> <6>task: ae54ced0 ti: aec42000 task.ti: aec42000
> <6>NIP: 8069257c LR: c13ebf50 CTR: 80692540
> <6>REGS: aec43ad0 TRAP: 0901   Tainted: P   O
> <6>MSR: 00029002   CR: 48002444  XER: 
> <6>
> <6>GPR00: c13ea408 aec43b80 ae54ced0 a624690c  a6271d84 a39a60cc
> fffd
> <6>GPR08: aeefbbc8 0001 0001 0004 80692540
> <6>NIP [8069257c] _raw_spin_lock_bh+0x3c/0x70
> <6>LR [c13ebf50] tipc_nametbl_unsubscribe+0x50/0x120 [tipc]
> <6>Call Trace:
> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable)
> <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc]
> <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc]
> <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc]
> <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc]
> <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc]
> <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc]
> <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc]
> <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc]
> <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc]
> <6>[aec43d30] [c13f4848] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc]
TIPC_CMD_SHOW_LINK_STATS or TIPC_NL_LINK_GET
> <6>[aec43d70] [804f29e0] sock_release+0x30/0xf0
> <6>[aec43d80] [804f2ab4] sock_close+0x14/0x30
> <6>[aec43d90] [80105844] __fput+0x94/0x200
> <6>[aec43db0] [8003dca4] task_work_run+0xd4/0x100
> <6>[aec43dd0] [80023620] do_exit+0x280/0x980
> <6>[aec43e10] [80024c48] do_group_exit+0x48/0xb0
> <6>[aec43e30] [80030344] get_signal+0x244/0x4f0
> <6>[aec43e80] [80007734] do_signal+0x34/0x1c0
> <6>[aec43f30] [800079a8] do_notify_resume+0x68/0x80
> <6>[aec43f40] [8000fa1c] do_user_signal+0x74/0xc4
> <6>--- interrupt: c00 at 0xf4f3d08
> <6>LR = 0xf4f3ce8
> <6>Instruction dump:
> <6>912a0008 3941 7d201828 2c09 40820010 7d40192d 40a2fff0 7c2004ac
> <6>2f89 4dbe0020 7c210b78 8123 <2f89> 40befff4 7c421378
> 7d201828
> <0>Kernel panic - not syncing: softlockup: hung tasks
> <6>CPU: 0 PID: 2221 Comm: Pluggable Serve Tainted: P   O L
> <6>Call Trace:
> <6>[aec43930] [80694e20] dump_stack+0x84/0xb0 (unreliable)
> <6>[aec43940] [80692ca8] panic+0xd8/0x214
> <6>[aec439a0] [800a0258] watchdog_timer_fn+0x2d8/0x2e0
> <6>[aec439f0] [8007ae58] __hrtimer_run_queues+0x118/0x1d0
> <6>[aec43a30] [8007b608] hrtimer_interrupt+0xd8/0x270
> <6>[aec43a80] [8000983c] __timer_interrupt+0xac/0x1b0
> <6>[aec43aa0] [80009b70] timer_interrupt+0xb0/0xe0
> <6>[aec43ac0] [8000f450] ret_from_except+0x0/0x18
> <6>--- interrupt: 901 at _raw_spin_lock_bh+0x3c/0x70
> <6>LR = tipc_nametbl_unsubscribe+0x50/0x120 [tipc]
> <6>[aec43b80] [800fa258] check_object+0xc8/0x270 (unreliable)
> <6>[aec43ba0] [c13ea408] tipc_named_reinit+0xf8/0x820 [tipc]
> <6>[aec43bb0] [c13ea6c0] tipc_named_reinit+0x3b0/0x820 [tipc]
> <6>[aec43bd0] [c13f7bbc] tipc_nl_publ_dump+0x50c/0xed0 [tipc]
> <6>[aec43c00] [c13f865c] tipc_conn_sendmsg+0xdc/0x170 [tipc]
> <6>[aec43c30] [c13eacbc] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc]
> <6>[aec43c70] [c13eb27c] tipc_topsrv_stop+0x45c/0x4f0 [tipc]
> <6>[aec43ca0] [c13eb7a8] tipc_nametbl_remove_publ+0x58/0x110 [tipc]
> <6>[aec43cd0] [c13ebc68] tipc_nametbl_withdraw+0x68/0x140 [tipc]
> <6>[aec43d00] [c13f3c34] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc]
>