Hi Jon,

 

"Just set n->keepalive_intv to a high value, e.g., 10000 ms, before you enter 
the loop.

Then, in any iteration where a lower value is calculated (i.e., inclusive the 
first one) n->keepalive_intv will be updated accordingly."

 

Yes, I was thinking about setting this value back to maximum U32_MAX. But there 
are 2 things need to be considered: 

 

1. Need to take another lock_bh before setting the value

Example:

tipc_node_write_lock(n)

n->keepalive_intv = U32_MAX or 10000ms;

tipc_node_write_unlock(n)

 

2. If node has one link and its being deleted (le->link == NULL),  
n->keepalive_intv will never re-calculate again. 

Then, it will cause to node timeout callback expired in next 10000ms.

Is it reasonable for this case?

 

Regards,

Hoang

-----Original Message-----

From: Jon Maloy <[email protected]> 

Sent: Sunday, November 18, 2018 3:52 AM

To: Hoang Huu Le <[email protected]>; 
[email protected]; [email protected]; [email protected]

Subject: RE: [tipc-discussion] [net-next] tipc: fix node keep alive interval 
calculation

 

Hi Hoang,

See below.

 

 

> -----Original Message-----

> From: Hoang Le <[email protected]>

> Sent: 15-Nov-18 04:17

> To: [email protected]; Jon Maloy

> <[email protected]>; [email protected]; [email protected]

> Subject: RE: [tipc-discussion] [net-next] tipc: fix node keep alive interval

> calculation

> 

> Hi Jon,

> 

> I think the issue now is not minor as I concluded. Support we have a huge

> cluster 75 VMS node, this issue will caused traffic performance degradation.

> So, please take time to review the patch.

> 

> Thanks,

> Hoang

> -----Original Message-----

> From: Hoang Le <[email protected]>

> Sent: Monday, November 5, 2018 3:29 PM

> To: [email protected]; [email protected];

> [email protected]; [email protected]

> Subject: [tipc-discussion] [net-next] tipc: fix node keep alive interval

> calculation

> 

> When setting LINK tolerance, node timer interval will be calculated base on

> the LINK with lowest tolerance.

> 

> But when calculated, the old node timer interval only updated if current

> setting value (tolerance/4) less than old ones regardless of number of links 
> as

> well as links' lowest tolerance value.

> 

> This caused to two cases missing if tolerance changed as following:

> Case 1:

> 1.1/ There is one link (L1) available in the system 1.2/ Set L1's tolerance 
> from

> 1500ms => lower (i.e 500ms) 1.3/ Then, fallback to default (1500ms) or higher

> (i.e 2000ms)

> 

> Expected:

>     node timer interval is 1500/4=375ms after 1.3

> 

> Result:

> node timer interval will not being updated after changing tolerance at 1.3

> since its value 1500/4=375ms is not less than 500/4=125ms at 1.2.

> 

Yes, this is not good.

 

> Case 2:

> 2.1/ There are two links (L1, L2) available in the system 2.2/ L1 and L2

> tolerance value are 2000ms as initial 2.3/ Set L2's tolerance from 2000ms =>

> lower 1500ms 2.4/ Disable link L2 (bring down its bearer)

 

If we take down the bearer L2 should be deleted, and its value ignored during 
further calculations.

 

> 

> Expected:

>     node timer interval is 2000ms/4=500ms after 2.4

> 

Similar issue as above.

 

> Result:

> node timer interval will not being updated after disabling L2 since its value

> 2000ms/4=500ms is still not less than 1500/4=375ms at 2.3 although L2 is

> already not available in the system.

> 

> To fix this, we will update node timer interval according to link tolerance

> value setting if there is only one link available and recalculate the link 
> with

> lowest tolerance for each iteration checking.

 

I think you can do this much simpler:

Just set n->keepalive_intv to a high value, e.g., 10000 ms, before you enter 
the loop.

Then, in any iteration where a lower value is calculated (i.e., inclusive the 
first one) n->keepalive_intv will be updated accordingly.

No more changes needed 😉

 

Regards

///jon

 

> 

> Signed-off-by: Hoang Le <[email protected]>

> ---

>  net/tipc/node.c | 12 ++++++++----

>  1 file changed, 8 insertions(+), 4 deletions(-)

> 

> diff --git a/net/tipc/node.c b/net/tipc/node.c index

> 2afc4f8c37a7..a0e5b2e2720d 100644

> --- a/net/tipc/node.c

> +++ b/net/tipc/node.c

> @@ -437,13 +437,14 @@ static struct tipc_node *tipc_node_create(struct

> net *net, u32 addr,

>             return n;

>  }

> 

> -static void tipc_node_calculate_timer(struct tipc_node *n, struct tipc_link

> *l)

> +static void tipc_node_calculate_timer(struct tipc_node *n, struct tipc_link

> *l,

> +                                                                bool 
> first_link)

>  {

>             unsigned long tol = tipc_link_tolerance(l);

>             unsigned long intv = ((tol / 4) > 500) ? 500 : tol / 4;

> 

>             /* Link with lowest tolerance determines timer interval */

> -           if (intv < n->keepalive_intv)

> +          if (intv < n->keepalive_intv || first_link || n->link_cnt == 1)

>                             n->keepalive_intv = intv;

> 

>             /* Ensure link's abort limit corresponds to current tolerance */ 
> @@ -

> 627,7 +628,9 @@ static void tipc_node_timeout(struct timer_list *t)

>                             if (le->link) {

>                                             spin_lock_bh(&le->lock);

>                                             /* Link tolerance may change 
> asynchronously: */

> -                                           tipc_node_calculate_timer(n, 
> le->link);

> +                                          tipc_node_calculate_timer(n, 
> le->link,

> +                                                                             
>                (bearer_id == 0) ?

> +                                                                             
>                true : false);

>                                             rc = tipc_link_timeout(le->link, 
> &xmitq);

>                                             spin_unlock_bh(&le->lock);

>                                             remains--;

> @@ -1018,7 +1021,8 @@ void tipc_node_check_dest(struct net *net, u32

> addr,

>                                             tipc_link_fsm_evt(l, 
> LINK_FAILOVER_BEGIN_EVT);

>                             le->link = l;

>                             n->link_cnt++;

> -                           tipc_node_calculate_timer(n, l);

> +                          tipc_node_calculate_timer(n, l, n->link_cnt == 1 ?

> +                                                                            
> true : false);

>                             if (n->link_cnt == 1) {

>                                             intv = jiffies + 
> msecs_to_jiffies(n->keepalive_intv);

>                                             if (!mod_timer(&n->timer, intv))

> --

> 2.17.1

> 

> 

> 

> _______________________________________________

> tipc-discussion mailing list

> [email protected]

> https://lists.sourceforge.net/lists/listinfo/tipc-discussion

 


_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to