Re: [tipc-discussion] [PATCH net-next v4 0/6] topology server fixes for nametable soft lockup

2017-01-17 Thread John Thompson
Hi Partha,

Thanks for the new patches.  I have tested them and hit the kernel
warning shown below.  This is running on an SMP system with 4 cores.
The warning reads as though one thread has gone ahead and freed the item
while another thread has taken a reference to it.
The suggestion is to use kref_get_unless_zero() instead of kref_get().

------------[ cut here ]------------
WARNING: at /home/johnt/views/main/linux/include/linux/kref.h:46
Modules linked in: tipc jitterentropy_rng echainiv drbg platform_driver(O)
CPU: 1 PID: 1527 Comm: intrTokenReset Tainted: P   O
task: a52c8600 ti: a4c98000 task.ti: a4c98000
NIP: c161809c LR: c1618120 CTR: 80695270
REGS: a4c99b30 TRAP: 0700   Tainted: P   O
MSR: 00029002   CR: 28002482  XER: 

GPR00: c1618694 a4c99be0 a52c8600 a4c4e840 a2763aa0  a4ab623c
0030
GPR08: 01001005 0001 c162 0005 80695270 107d12f0 6c23e000
0009
GPR16:  808e 808e 00040100 00040006 807c9428 808f
a2e39ac0
GPR24: 808dba00 01001005 a4ab623c a50fb300 0030 a50fb31c a50fb300
a4c4e840
NIP [c161809c] tipc_nl_publ_dump+0x93c/0xf10 [tipc]
LR [c1618120] tipc_nl_publ_dump+0x9c0/0xf10 [tipc]
Call Trace:
[a4c99be0] [800b9e2c] free_pages_prepare+0x18c/0x2a0 (unreliable)
[a4c99c00] [c1618694] tipc_conn_sendmsg+0x24/0x150 [tipc]
[a4c99c30] [c160ad5c] tipc_subscrp_report_overlap+0xbc/0xd0 [tipc]
[a4c99c70] [c160b31c] tipc_topsrv_stop+0x45c/0x4f0 [tipc]
[a4c99ca0] [c160b848] tipc_nametbl_remove_publ+0x58/0x110 [tipc]
[a4c99cd0] [c160bd08] tipc_nametbl_withdraw+0x68/0x140 [tipc]
[a4c99d00] [c1613ce4] tipc_nl_node_dump_link+0x1904/0x45d0 [tipc]
[a4c99d30] [c16148f8] tipc_nl_node_dump_link+0x2518/0x45d0 [tipc]
[a4c99d70] [804f5a40] sock_release+0x30/0xf0
[a4c99d80] [804f5b14] sock_close+0x14/0x30
[a4c99d90] [80105844] __fput+0x94/0x200
[a4c99db0] [8003dca4] task_work_run+0xd4/0x100
[a4c99dd0] [80023620] do_exit+0x280/0x980
[a4c99e10] [80024c48] do_group_exit+0x48/0xb0
[a4c99e30] [80030344] get_signal+0x244/0x4f0
[a4c99e80] [80007734] do_signal+0x34/0x1c0
[a4c99f30] [800079a8] do_notify_resume+0x68/0x80
[a4c99f40] [8000fa1c] do_user_signal+0x74/0xc4
--- interrupt: c00 at 0xf5b0cfc
LR = 0xf5b0ce8
Instruction dump:
4ba8 7c0004ac 7d201828 31290001 7d20192d 40a2fff4 7c0004ac 2f890001
4dbd0020 3d40c162 892ac11e 69290001 <0f09> 2f89 4dbe0020 3921
---[ end trace 544bc785f9258108 ]---

JT


On Thu, Jan 12, 2017 at 1:19 AM, Parthasarathy Bhuvaragan <
parthasarathy.bhuvara...@ericsson.com> wrote:

> In this series, we revert the commit 333f796235a527 ("tipc: fix a
> race condition leading to subscriber refcnt bug") and provide an
> alternate solution to fix the race conditions in patches 2-4.
>
> We have to do this as the above commit introduced a nametbl soft
> lockup at module exit, as described in patch #4.
>
> ---
> v3: introduce cleanup workqueue to fix nametbl soft lockup.
>
> Parthasarathy Bhuvaragan (6):
>   tipc: fix nametbl_lock soft lockup at node/link events
>   tipc: add subscription refcount
>   tipc: fix connection refcount error
>   tipc: fix nametbl_lock soft lockup at module exit
>   tipc: ignore requests when the connection state is not CONNECTED
>   tipc: fix cleanup at module unload
>
>  net/tipc/node.c   |   9 -
>  net/tipc/server.c |  44 +
>  net/tipc/subscr.c | 116 ++
> ++--
>  net/tipc/subscr.h |   1 +
>  4 files changed, 94 insertions(+), 76 deletions(-)
>
> --
> 2.1.4
>
>
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast as transport option for multicast

2017-01-17 Thread Xue, Ying
Jon, I remember that I already gave an ack for this series; probably I
forgot to send it. Anyway, I have spent time looking at all the patches
and found no issues during the review. This is very nice indeed. Thanks!

Sent from my iPhone

> On 17 Jan 2017, at 11:06 PM, Jon Maloy  wrote:
> 
> Thanks Partha. Any viewpoints from Ying? Otherwise I'll send it in tomorrow.
> 
> ///jon
> 
> 
>> -Original Message-
>> From: Parthasarathy Bhuvaragan
>> Sent: Tuesday, 17 January, 2017 09:35
>> To: Jon Maloy ; 
>> tipc-discussion@lists.sourceforge.net;
>> Ying Xue 
>> Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast as
>> transport option for multicast
>> 
>>> On 01/16/2017 03:13 PM, Jon Maloy wrote:
>>> 
>>> 
 -Original Message-
 From: Parthasarathy Bhuvaragan
 Sent: Monday, 16 January, 2017 05:20
 To: Jon Maloy ; tipc-
>> discuss...@lists.sourceforge.net;
 Ying Xue 
 Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast 
 as
 transport option for multicast
 
> On 01/13/2017 04:18 PM, Jon Maloy wrote:
> 
> 
>> -Original Message-
>> From: Parthasarathy Bhuvaragan
>> Sent: Friday, 13 January, 2017 04:24
>> To: Jon Maloy ; tipc-
 discuss...@lists.sourceforge.net;
>> Ying Xue 
>> Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce 
>> replicast as
>> transport option for multicast
>> 
>>> On 01/04/2017 06:05 PM, Parthasarathy Bhuvaragan wrote:
>>> Hi Jon,
>>> 
>>> Added some minor comments inline in this patch, apart from that the
>>> major concern is the following:
>>> 
>>> All my tests which passed before this patch fail while sending
>>> multicast to a receiver on my own node.
>>>
>>> With this patch, we increase the likelihood of receive buffer
>>> overflow if the sender & receivers are running on the same host, as
>>> we bypass the link layer completely. I confirmed this with some
>>> traces in filter_rcv().
>>>
>>> If I add another multicast listener running on another node, this
>>> pacifies the sender (puts the sender to sleep at link congestion),
>>> and the relatively slow link layer reduces the buffer overflow.
>>>
>>> We need to find a way to reduce the aggressiveness of the sender.
>>> We want the location of services to be transparent to users, so we
>>> should provide similar characteristics regardless of the service
>>> location.
>>> 
>> Jon, running the PTTS server and client on a standalone node without
>> your updates also failed. So in that aspect, I am OK with this patch.
>>
>> If the Ethernet bearer lacks broadcast ability, then neighbor
>> discovery will not work. So do we intend to introduce support to add
>> Ethernet peers manually as we do for UDP bearers? Otherwise we can
>> never use replicast for non-UDP bearers.
> 
> I believe all Ethernet implementations, even overlay networks, provide
> some form of broadcast or, in lack thereof, an emulated broadcast.
> So, discovery should work, but it will be very inefficient when we do
> link broadcast, because TIPC will think that genuine Ethernet
> broadcast is supported.
> We actually need some way to find out what kind of "Ethernet" we are
> attached to, e.g. VXLAN, so that the "bcast supported" flag can be set
> correctly.
> I wonder if that is possible, or if it has to be configured.
> 
 I assumed that, but thanks for the clarification. I infer from your
 statement that it is the user who shall configure this per socket in
 case TIPC is running over some kind of overlay network. TIPC has no
 knowledge about the tunnel mechanisms used under the exposed bearers.
>>> 
>>> No, I don't think that is a good option. I think it will be only in
>>> very special cases that the user will want to enforce replicast
>>> (e.g., for security, or if he knows there will always be very few
>>> destinations), and those would not be related to the deployment.
>>> 
 
 In that case, is it more logical if we set this as a node attribute
 and let the sockets inherit it?
 Since the applications themselves are usually agnostic of the
 deployment environment, it's not correct to place platform-specific
 logic inside the application.
>>> 
>>> I was thinking more of setting this as a configured bearer
>>> attribute, unless we find a way to deduce it from the interface
>>> attributes. I just didn't see this as part of this series, but it
>>> should be done.
>>> 
 
 I think you need to rephrase the commit message. I read the commit
 message as if we are introducing a complete link-level broadcast
 service replacement. But that's incorrect, as we still use the
 broadcast link for neighbor discovery.

Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast as transport option for multicast

2017-01-17 Thread Jon Maloy
Thanks Partha. Any viewpoints from Ying? Otherwise I'll send it in tomorrow.

///jon


> -Original Message-
> From: Parthasarathy Bhuvaragan
> Sent: Tuesday, 17 January, 2017 09:35
> To: Jon Maloy ; tipc-discussion@lists.sourceforge.net;
> Ying Xue 
> Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast as
> transport option for multicast
> 
> On 01/16/2017 03:13 PM, Jon Maloy wrote:
> >
> >
> >> -Original Message-
> >> From: Parthasarathy Bhuvaragan
> >> Sent: Monday, 16 January, 2017 05:20
> >> To: Jon Maloy ; tipc-
> discuss...@lists.sourceforge.net;
> >> Ying Xue 
> >> Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast 
> >> as
> >> transport option for multicast
> >>
> >> On 01/13/2017 04:18 PM, Jon Maloy wrote:
> >>>
> >>>
>  -Original Message-
>  From: Parthasarathy Bhuvaragan
>  Sent: Friday, 13 January, 2017 04:24
>  To: Jon Maloy ; tipc-
> >> discuss...@lists.sourceforge.net;
>  Ying Xue 
>  Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce 
>  replicast as
>  transport option for multicast
> 
>  On 01/04/2017 06:05 PM, Parthasarathy Bhuvaragan wrote:
> > Hi Jon,
> >
> > Added some minor comments inline in this patch, apart from that the
> > major concern is the following:
> >
> > All my tests which passed before this patch fail while sending
> > multicast to a receiver on my own node.
> >
> > With this patch, we increase the likelihood of receive buffer
> > overflow if the sender & receivers are running on the same host, as
> > we bypass the link layer completely. I confirmed this with some
> > traces in filter_rcv().
> >
> > If I add another multicast listener running on another node, this
> > pacifies the sender (puts the sender to sleep at link congestion),
> > and the relatively slow link layer reduces the buffer overflow.
> >
> > We need to find a way to reduce the aggressiveness of the sender.
> > We want the location of services to be transparent to users, so we
> > should provide similar characteristics regardless of the service
> > location.
> >
> Jon, running the PTTS server and client on a standalone node without
> your updates also failed. So in that aspect, I am OK with this patch.
>
> If the Ethernet bearer lacks broadcast ability, then neighbor
> discovery will not work. So do we intend to introduce support to add
> Ethernet peers manually as we do for UDP bearers? Otherwise we can
> never use replicast for non-UDP bearers.
> >>>
> >>> I believe all Ethernet implementations, even overlay networks,
> >>> provide some form of broadcast or, in lack thereof, an emulated
> >>> broadcast.
> >>> So, discovery should work, but it will be very inefficient when we
> >>> do link broadcast, because TIPC will think that genuine Ethernet
> >>> broadcast is supported.
> >>> We actually need some way to find out what kind of "Ethernet" we
> >>> are attached to, e.g. VXLAN, so that the "bcast supported" flag
> >>> can be set correctly.
> >>> I wonder if that is possible, or if it has to be configured.
> >>>
> >> I assumed that, but thanks for the clarification. I infer from
> >> your statement that it is the user who shall configure this per
> >> socket in case TIPC is running over some kind of overlay network.
> >> TIPC has no knowledge about the tunnel mechanisms used under the
> >> exposed bearers.
> >
> > No, I don't think that is a good option. I think it will be only in
> > very special cases that the user will want to enforce replicast
> > (e.g., for security, or if he knows there will always be very few
> > destinations), and those would not be related to the deployment.
> >
> >>
> >> In that case, is it more logical if we set this as a node
> >> attribute and let the sockets inherit it?
> >> Since the applications themselves are usually agnostic of the
> >> deployment environment, it's not correct to place platform-specific
> >> logic inside the application.
> >
> > I was thinking more of setting this as a configured bearer
> > attribute, unless we find a way to deduce it from the interface
> > attributes. I just didn't see this as part of this series, but it
> > should be done.
> >
> >>
> >> I think you need to rephrase the commit message. I read the commit
> >> message as if we are introducing a complete link-level broadcast
> >> service replacement. But that's incorrect, as we still use the
> >> broadcast link for neighbor discovery.
> >
> > Yes. This is only an optimization for broadcast traffic, not
> > neighbor discovery.
> >
> Thanks Jon for the clarification. You may add my tag for the entire
> series if you want to.
> Reviewed-by: Parthasarathy Bhuvaragan
> 
> 
> /Partha
> > ///jon
> >
> >>
> >> Thanks 

Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast as transport option for multicast

2017-01-17 Thread Parthasarathy Bhuvaragan
On 01/16/2017 03:13 PM, Jon Maloy wrote:
>
>
>> -Original Message-
>> From: Parthasarathy Bhuvaragan
>> Sent: Monday, 16 January, 2017 05:20
>> To: Jon Maloy ; 
>> tipc-discussion@lists.sourceforge.net;
>> Ying Xue 
>> Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast as
>> transport option for multicast
>>
>> On 01/13/2017 04:18 PM, Jon Maloy wrote:
>>>
>>>
 -Original Message-
 From: Parthasarathy Bhuvaragan
 Sent: Friday, 13 January, 2017 04:24
 To: Jon Maloy ; tipc-
>> discuss...@lists.sourceforge.net;
 Ying Xue 
 Subject: Re: [tipc-discussion] [net-next v3 3/4] tipc: introduce replicast 
 as
 transport option for multicast

 On 01/04/2017 06:05 PM, Parthasarathy Bhuvaragan wrote:
> Hi Jon,
>
> Added some minor comments inline in this patch, apart from that the
> major concern is the following:
>
> All my tests which passed before this patch fail while sending
> multicast to a receiver on my own node.
>
> With this patch, we increase the likelihood of receive buffer overflow
> if the sender & receivers are running on the same host, as we bypass
> the link layer completely. I confirmed this with some traces in
> filter_rcv().
>
> If I add another multicast listener running on another node, this
> pacifies the sender (puts the sender to sleep at link congestion), and
> the relatively slow link layer reduces the buffer overflow.
>
> We need to find a way to reduce the aggressiveness of the sender.
> We want the location of services to be transparent to users, so we
> should provide similar characteristics regardless of the service
> location.
>
 Jon, running the PTTS server and client on a standalone node without
 your updates also failed. So in that aspect, I am OK with this patch.

 If the Ethernet bearer lacks broadcast ability, then neighbor
 discovery will not work. So do we intend to introduce support to add
 Ethernet peers manually as we do for UDP bearers? Otherwise we can
 never use replicast for non-UDP bearers.
>>>
>>> I believe all Ethernet implementations, even overlay networks,
>>> provide some form of broadcast or, in lack thereof, an emulated
>>> broadcast.
>>> So, discovery should work, but it will be very inefficient when we
>>> do link broadcast, because TIPC will think that genuine Ethernet
>>> broadcast is supported.
>>> We actually need some way to find out what kind of "Ethernet" we are
>>> attached to, e.g. VXLAN, so that the "bcast supported" flag can be
>>> set correctly.
>>> I wonder if that is possible, or if it has to be configured.
>>>
>> I assumed that, but thanks for the clarification. I infer from your
>> statement that it is the user who shall configure this per socket in
>> case TIPC is running over some kind of overlay network. TIPC has no
>> knowledge about the tunnel mechanisms used under the exposed bearers.
>
> No, I don't think that is a good option. I think it will be only in
> very special cases that the user will want to enforce replicast
> (e.g., for security, or if he knows there will always be very few
> destinations), and those would not be related to the deployment.
>
>>
>> In that case, is it more logical if we set this as a node attribute
>> and let the sockets inherit it?
>> Since the applications themselves are usually agnostic of the
>> deployment environment, it's not correct to place platform-specific
>> logic inside the application.
>
> I was thinking more of setting this as a configured bearer attribute, unless 
> we find a way to deduce it from the interface attributes. I just didn't see 
> this as part of this series, but it should be done.
>
>>
>> I think you need to rephrase the commit message. I read the commit
>> message as if we are introducing a complete link-level broadcast
>> service replacement. But that's incorrect, as we still use the
>> broadcast link for neighbor discovery.
>
> Yes. This is only an optimization for broadcast traffic, not neighbor
> discovery.
>
Thanks Jon for the clarification. You may add my tag for the entire 
series if you want to.
Reviewed-by: Parthasarathy Bhuvaragan 


/Partha
> ///jon
>
>>
>> Thanks for your patience; probably it's just me getting confused
>> with the purpose of the change-sets.
>>
>> /Partha
>>
>>> ///jon
>>>

 /Partha

> /Partha
>
> On 01/02/2017 03:34 PM, Parthasarathy Bhuvaragan wrote:
>> Hi jon,
>>
>> When I include this patch, ptts case 12 (multicast) fails when the
>> client and server are running on the same node.
>>
>> /Partha
>>
>> On 12/22/2016 04:15 PM, Jon Maloy wrote:
>>> TIPC multicast messages are currently carried over a reliable
>>> 'broadcast link', making use of the underlying media's ability to
>>> transport packets as L2