Hi Jon,

 

Thanks for clarification!

I will update the fix and upload again.

 

Thanks,

Hoang

From: Jon Maloy <[email protected]> 
Sent: Monday, November 19, 2018 6:56 PM
To: Hoang Huu Le <[email protected]>; 
[email protected]
Cc: [email protected]; [email protected]
Subject: RE: [tipc-discussion] [net-next] tipc: fix node keep alive interval 
calculation

 

See below

 

From: Hoang Le <[email protected] <mailto:[email protected]> > 
Sent: 18-Nov-18 22:01
To: Jon Maloy <[email protected] <mailto:[email protected]> >; 
[email protected] 
<mailto:[email protected]> 
Cc: [email protected] <mailto:[email protected]> ; [email protected] 
<mailto:[email protected]> 
Subject: RE: [tipc-discussion] [net-next] tipc: fix node keep alive interval 
calculation

 

Hi Jon,

 

"Just set n->keepalive_intv to a high value, e.g., 10000 ms, before you enter 
the loop.

Then, in any iteration where a lower value is calculated (i.e., inclusive the 
first one) n->keepalive_intv will be updated accordingly."

 

Yes, I was thinking about setting this value back to maximum U32_MAX. But there 
are 2 things need to be considered: 

 

1. Need to take another lock_bh before setting the value

Example:

tipc_node_write_lock(n)

n->keepalive_intv = U32_MAX or 10000ms;

tipc_node_write_unlock(n)

 

[jon] You don’t need to take the lock in this case. The node is reference 
counted, and nobody else but this time will touch it anyway. So, no risk for 
any race.

 

2. If node has one link and its being deleted (le->link == NULL),  
n->keepalive_intv will never re-calculate again. 

Then, it will cause to node timeout callback expired in next 10000ms.

Is it reasonable for this case?

 

[jon] Yes, I see no problem with that. On the contrary, - we save cpu cycles 
for an unused link (which will anyway be automatically deleted after 5 minutes).

 

Regards

///jon

 

 

Regards,

Hoang

-----Original Message-----

From: Jon Maloy <[email protected] <mailto:[email protected]> > 

Sent: Sunday, November 18, 2018 3:52 AM

To: Hoang Huu Le <[email protected] <mailto:[email protected]> 
>; [email protected] 
<mailto:[email protected]> ; [email protected] 
<mailto:[email protected]> ; [email protected] 
<mailto:[email protected]> 

Subject: RE: [tipc-discussion] [net-next] tipc: fix node keep alive interval 
calculation

 

Hi Hoang,

See below.

 

 

> -----Original Message-----

> From: Hoang Le <[email protected] <mailto:[email protected]> >

> Sent: 15-Nov-18 04:17

> To: [email protected] 
> <mailto:[email protected]> ; Jon Maloy

> <[email protected] <mailto:[email protected]> >; [email protected] 
> <mailto:[email protected]> ; [email protected] 
> <mailto:[email protected]> 

> Subject: RE: [tipc-discussion] [net-next] tipc: fix node keep alive interval

> calculation

> 

> Hi Jon,

> 

> I think the issue now is not minor as I concluded. Support we have a huge

> cluster 75 VMS node, this issue will caused traffic performance degradation.

> So, please take time to review the patch.

> 

> Thanks,

> Hoang

> -----Original Message-----

> From: Hoang Le <[email protected] <mailto:[email protected]> >

> Sent: Monday, November 5, 2018 3:29 PM

> To: [email protected] 
> <mailto:[email protected]> ; [email protected] 
> <mailto:[email protected]> ;

> [email protected] <mailto:[email protected]> ; [email protected] 
> <mailto:[email protected]> 

> Subject: [tipc-discussion] [net-next] tipc: fix node keep alive interval

> calculation

> 

> When setting LINK tolerance, node timer interval will be calculated base on

> the LINK with lowest tolerance.

> 

> But when calculated, the old node timer interval only updated if current

> setting value (tolerance/4) less than old ones regardless of number of links 
> as

> well as links' lowest tolerance value.

> 

> This caused to two cases missing if tolerance changed as following:

> Case 1:

> 1.1/ There is one link (L1) available in the system 1.2/ Set L1's tolerance 
> from

> 1500ms => lower (i.e 500ms) 1.3/ Then, fallback to default (1500ms) or higher

> (i.e 2000ms)

> 

> Expected:

>     node timer interval is 1500/4=375ms after 1.3

> 

> Result:

> node timer interval will not being updated after changing tolerance at 1.3

> since its value 1500/4=375ms is not less than 500/4=125ms at 1.2.

> 

Yes, this is not good.

 

> Case 2:

> 2.1/ There are two links (L1, L2) available in the system 2.2/ L1 and L2

> tolerance value are 2000ms as initial 2.3/ Set L2's tolerance from 2000ms =>

> lower 1500ms 2.4/ Disable link L2 (bring down its bearer)

 

If we take down the bearer L2 should be deleted, and its value ignored during 
further calculations.

 

> 

> Expected:

>     node timer interval is 2000ms/4=500ms after 2.4

> 

Similar issue as above.

 

> Result:

> node timer interval will not being updated after disabling L2 since its value

> 2000ms/4=500ms is still not less than 1500/4=375ms at 2.3 although L2 is

> already not available in the system.

> 

> To fix this, we will update node timer interval according to link tolerance

> value setting if there is only one link available and recalculate the link 
> with

> lowest tolerance for each iteration checking.

 

I think you can do this much simpler:

Just set n->keepalive_intv to a high value, e.g., 10000 ms, before you enter 
the loop.

Then, in any iteration where a lower value is calculated (i.e., inclusive the 
first one) n->keepalive_intv will be updated accordingly.

No more changes needed 😉

 

Regards

///jon

 

> 

> Signed-off-by: Hoang Le <[email protected] 
> <mailto:[email protected]> >

> ---

>  net/tipc/node.c | 12 ++++++++----

>  1 file changed, 8 insertions(+), 4 deletions(-)

> 

> diff --git a/net/tipc/node.c b/net/tipc/node.c index

> 2afc4f8c37a7..a0e5b2e2720d 100644

> --- a/net/tipc/node.c

> +++ b/net/tipc/node.c

> @@ -437,13 +437,14 @@ static struct tipc_node *tipc_node_create(struct

> net *net, u32 addr,

>             return n;

>  }

> 

> -static void tipc_node_calculate_timer(struct tipc_node *n, struct tipc_link

> *l)

> +static void tipc_node_calculate_timer(struct tipc_node *n, struct tipc_link

> *l,

> +                                                                bool 
> first_link)

>  {

>             unsigned long tol = tipc_link_tolerance(l);

>             unsigned long intv = ((tol / 4) > 500) ? 500 : tol / 4;

> 

>             /* Link with lowest tolerance determines timer interval */

> -           if (intv < n->keepalive_intv)

> +          if (intv < n->keepalive_intv || first_link || n->link_cnt == 1)

>                             n->keepalive_intv = intv;

> 

>             /* Ensure link's abort limit corresponds to current tolerance */ 
> @@ -

> 627,7 +628,9 @@ static void tipc_node_timeout(struct timer_list *t)

>                             if (le->link) {

>                                             spin_lock_bh(&le->lock);

>                                             /* Link tolerance may change 
> asynchronously: */

> -                                           tipc_node_calculate_timer(n, 
> le->link);

> +                                          tipc_node_calculate_timer(n, 
> le->link,

> +                                                                             
>                (bearer_id == 0) ?

> +                                                                             
>                true : false);

>                                             rc = tipc_link_timeout(le->link, 
> &xmitq);

>                                             spin_unlock_bh(&le->lock);

>                                             remains--;

> @@ -1018,7 +1021,8 @@ void tipc_node_check_dest(struct net *net, u32

> addr,

>                                             tipc_link_fsm_evt(l, 
> LINK_FAILOVER_BEGIN_EVT);

>                             le->link = l;

>                             n->link_cnt++;

> -                           tipc_node_calculate_timer(n, l);

> +                          tipc_node_calculate_timer(n, l, n->link_cnt == 1 ?

> +                                                                            
> true : false);

>                             if (n->link_cnt == 1) {

>                                             intv = jiffies + 
> msecs_to_jiffies(n->keepalive_intv);

>                                             if (!mod_timer(&n->timer, intv))

> --

> 2.17.1

> 

> 

> 

> _______________________________________________

> tipc-discussion mailing list

> [email protected] 
> <mailto:[email protected]> 

> https://lists.sourceforge.net/lists/listinfo/tipc-discussion

 


_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to