Hi Jon,

I combined the benchmark test (50 connections) with a ping command between two
nodes. You can compare the results from the original code, your fix and Tung's
fix below:

Original code:
node1 ~ # ping -s 1400 10.0.0.2 -c 300
PING 10.0.0.2 (10.0.0.2): 1400 data bytes
1408 bytes from 10.0.0.2: seq=22 ttl=64 time=1.337 ms
1408 bytes from 10.0.0.2: seq=24 ttl=64 time=1.208 ms
1408 bytes from 10.0.0.2: seq=25 ttl=64 time=1.145 ms
1408 bytes from 10.0.0.2: seq=76 ttl=64 time=1.145 ms
1408 bytes from 10.0.0.2: seq=78 ttl=64 time=1.449 ms
1408 bytes from 10.0.0.2: seq=130 ttl=64 time=1.230 ms
1408 bytes from 10.0.0.2: seq=134 ttl=64 time=1.020 ms
1408 bytes from 10.0.0.2: seq=185 ttl=64 time=1.743 ms
1408 bytes from 10.0.0.2: seq=186 ttl=64 time=1.502 ms
1408 bytes from 10.0.0.2: seq=187 ttl=64 time=1.289 ms
1408 bytes from 10.0.0.2: seq=189 ttl=64 time=1.306 ms
1408 bytes from 10.0.0.2: seq=239 ttl=64 time=1.254 ms
1408 bytes from 10.0.0.2: seq=241 ttl=64 time=1.114 ms
1408 bytes from 10.0.0.2: seq=242 ttl=64 time=1.058 ms

--- 10.0.0.2 ping statistics ---
301 packets transmitted, 301 packets received, 0% packet loss
round-trip min/avg/max = 0.077/0.361/1.743 ms

- Jon's fix
node1 ~ # ping -s 1400 10.0.0.2 -c 300
1408 bytes from 10.0.0.2: seq=22 ttl=64 time=1.013 ms
1408 bytes from 10.0.0.2: seq=87 ttl=64 time=2.468 ms

--- 10.0.0.2 ping statistics ---
300 packets transmitted, 300 packets received, 0% packet loss
round-trip min/avg/max = 0.119/0.323/2.468 ms
node1 ~ #

- Tung's fix
node1 ~ # ping -s 1400 10.0.0.2 -c 300
--- 10.0.0.2 ping statistics ---
300 packets transmitted, 300 packets received, 0% packet loss
round-trip min/avg/max = 0.101/0.303/0.864 ms

From the ping statistics I can see that your solution still starved twice, with
a maximum round-trip time of 2.468 ms, so the issue is not completely solved
yet. With Tung's fix, however, I don't see any starvation happening.
So I think we can go ahead with Tung's fix. Please let me know what you think.

Regards,
Hoang
-----Original Message-----
From: tung quang nguyen <[email protected]> 
Sent: Thursday, July 25, 2019 5:50 PM
To: 'Jon Maloy' <[email protected]>; 'Jon Maloy' <[email protected]>; 
[email protected];
[email protected]
Subject: Re: [tipc-discussion] [net-next v2 1/1] tipc: reduce risk of wakeup 
queue starvation

Hi Jon,

Let's go for this way for now.

Thanks.

Best regards,
Tung Nguyen

-----Original Message-----
From: Jon Maloy <[email protected]> 
Sent: Friday, July 19, 2019 10:06 AM
To: Jon Maloy <[email protected]>; Jon Maloy <[email protected]>
Cc: [email protected];
[email protected]; [email protected];
[email protected]; [email protected];
[email protected]; [email protected];
[email protected]; [email protected]
Subject: [net-next v2 1/1] tipc: reduce risk of wakeup queue starvation

In commit 365ad353c256 ("tipc: reduce risk of user starvation during
link congestion") we allowed senders to add exactly one list of extra
buffers to the link backlog queues during link congestion (aka
"oversubscription"). However, the criteria for when to stop adding
wakeup messages to the input queue when the overload abates is
inaccurate, and may cause starvation problems during very high load.

Currently, we stop adding wakeup messages after 10 total failed attempts
where we find that there is no space left in the backlog queue for a
certain importance level. The counter for this is accumulated across all
levels, which may lead the algorithm to leave the loop prematurely,
although there may still be plenty of space available at some levels.
The result is sometimes that messages near the wakeup queue tail are not
added to the input queue as they should be.
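
For clarity, this is the loop being replaced, reproduced from the removed
lines in the diff below, with a comment added on the shared counter:

        skb_queue_walk_safe(&l->wakeupq, skb, tmp) {
                imp = TIPC_SKB_CB(skb)->chain_imp;
                if (l->backlog[imp].len < l->backlog[imp].limit) {
                        skb_unlink(skb, &l->wakeupq);
                        skb_queue_tail(l->inputq, skb);
                } else if (i++ > 10) {
                        /* 'i' is shared across all importance levels, so
                         * ten full-backlog hits at any mix of levels end
                         * the walk, even if other levels still have room
                         */
                        break;
                }
        }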

We now introduce a more exact algorithm, where we keep adding wakeup
messages to a level as long as the backlog queue has free slots for
the corresponding level, and stop at the moment there are no more such
slots or when there are no more wakeup messages to dequeue.

Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Reported-by: Tung Nguyen <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
---
 net/tipc/link.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index 66d3a07..f1d2732 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -853,18 +853,31 @@ static int link_schedule_user(struct tipc_link *l, struct tipc_msg *hdr)
  */
 static void link_prepare_wakeup(struct tipc_link *l)
 {
+       struct sk_buff_head *wakeupq = &l->wakeupq;
+       struct sk_buff_head *inputq = l->inputq;
        struct sk_buff *skb, *tmp;
-       int imp, i = 0;
+       struct sk_buff_head tmpq;
+       int avail[5] = {0,};
+       int imp = 0;
+
+       __skb_queue_head_init(&tmpq);
 
-       skb_queue_walk_safe(&l->wakeupq, skb, tmp) {
+       for (; imp <= TIPC_SYSTEM_IMPORTANCE; imp++)
+               avail[imp] = l->backlog[imp].limit - l->backlog[imp].len;
+
+       skb_queue_walk_safe(wakeupq, skb, tmp) {
                imp = TIPC_SKB_CB(skb)->chain_imp;
-               if (l->backlog[imp].len < l->backlog[imp].limit) {
-                       skb_unlink(skb, &l->wakeupq);
-                       skb_queue_tail(l->inputq, skb);
-               } else if (i++ > 10) {
-                       break;
-               }
+               if (avail[imp] <= 0)
+                       continue;
+               avail[imp]--;
+               __skb_unlink(skb, wakeupq);
+               __skb_queue_tail(&tmpq, skb);
        }
+
+       spin_lock_bh(&inputq->lock);
+       skb_queue_splice_tail(&tmpq, inputq);
+       spin_unlock_bh(&inputq->lock);
+
 }
 
 void tipc_link_reset(struct tipc_link *l)
-- 
2.1.4




_______________________________________________
tipc-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/tipc-discussion


