Re: [PATCH net-next] tcp: force cwnd at least 2 in tcp_cwnd_reduction
Hi Lawrence, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on net-next/master] url: https://github.com/0day-ci/linux/commits/Lawrence-Brakmo/tcp-force-cwnd-at-least-2-in-tcp_cwnd_reduction/20180627-095533 reproduce: # apt-get install sparse make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) net/ipv4/tcp_input.c:168:42: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:168:42: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:213:21: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:213:21: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:329:19: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:329:19: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:336:19: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:337:19: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:337:19: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:347:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:347:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:412:32: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:412:32: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:413:44: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:413:44: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:436:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:436:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:464:44: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:464:44: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:473:36: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:473:36: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:475:28: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:475:28: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:492:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:492:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:496:36: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:496:36: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:509:29: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:509:29: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:511:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:511:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:512:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:513:16: sparse: expression using sizeof(void) include/net/tcp.h:739:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:652:26: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:652:26: sparse: expression using sizeof(void) include/net/tcp.h:739:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:790:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:790:33: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:794:23: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:818:17: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:818:17: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:827:9: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:827:9: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:867:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:867:16: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:902:34: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:902:34: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1654:25: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1848:17: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1849:17: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1849:17: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1869:26: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1869:26: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1896:34: sparse: expression using sizeof(void) include/net/tcp.h:1138:24: sparse: expression using sizeof(void) include/net/tcp.h:1138:24: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1995:34: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:1995:34: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2024:39: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2024:39: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2400:44: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2472:26: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2472:26: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2472:26: sparse: expression using sizeof(void) net/ipv4/tcp_input.c:2472:26:
Re: [PATCH net-next] tcp: force cwnd at least 2 in tcp_cwnd_reduction
Hi Lawrence, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on net-next/master] url: https://github.com/0day-ci/linux/commits/Lawrence-Brakmo/tcp-force-cwnd-at-least-2-in-tcp_cwnd_reduction/20180627-095533 config: x86_64-randconfig-x003-201825 (attached as .config) compiler: gcc-7 (Debian 7.3.0-16) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All warnings (new ones prefixed by >>): In file included from include/asm-generic/bug.h:18:0, from arch/x86/include/asm/bug.h:83, from include/linux/bug.h:5, from include/linux/mmdebug.h:5, from include/linux/mm.h:9, from net/ipv4/tcp_input.c:67: net/ipv4/tcp_input.c: In function 'tcp_cwnd_reduction': include/linux/kernel.h:812:29: warning: comparison of distinct pointer types lacks a cast (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) ^ include/linux/kernel.h:826:4: note: in expansion of macro '__typecheck' (__typecheck(x, y) && __no_side_effects(x, y)) ^~~ include/linux/kernel.h:836:24: note: in expansion of macro '__safe_cmp' __builtin_choose_expr(__safe_cmp(x, y), \ ^~ include/linux/kernel.h:852:19: note: in expansion of macro '__careful_cmp' #define max(x, y) __careful_cmp(x, y, >) ^ >> net/ipv4/tcp_input.c:2480:17: note: in expansion of macro 'max' tp->snd_cwnd = max(tcp_packets_in_flight(tp) + sndcnt, 2); ^~~ vim +/max +2480 net/ipv4/tcp_input.c 2455 2456 void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int flag) 2457 { 2458 struct tcp_sock *tp = tcp_sk(sk); 2459 int sndcnt = 0; 2460 int delta = tp->snd_ssthresh - tcp_packets_in_flight(tp); 2461 2462 if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd)) 2463 return; 2464 2465 tp->prr_delivered += newly_acked_sacked; 2466 if (delta < 0) { 2467 u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered + 2468 tp->prior_cwnd - 1; 2469 sndcnt = div_u64(dividend, tp->prior_cwnd) - tp->prr_out; 2470 } else if ((flag & FLAG_RETRANS_DATA_ACKED) && 2471 !(flag & FLAG_LOST_RETRANS)) { 2472 sndcnt = min_t(int, delta, 2473 max_t(int, tp->prr_delivered - tp->prr_out, 2474 newly_acked_sacked) + 1); 2475 } else { 2476 sndcnt = min(delta, newly_acked_sacked); 2477 } 2478 /* Force a fast retransmit upon entering fast recovery */ 2479 sndcnt = max(sndcnt, (tp->prr_out ? 0 : 1)); > 2480 tp->snd_cwnd = max(tcp_packets_in_flight(tp) + sndcnt, 2); 2481 } 2482 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
[PATCH net-next] tcp: force cwnd at least 2 in tcp_cwnd_reduction
When using dctcp and doing RPCs, if the last packet of a request is ECN marked as having seen congestion (CE), the sender can decrease its cwnd to 1. As a result, it will only send one packet when a new request is sent. In some instances this results in high tail latencies. For example, in one setup there are 3 hosts sending to a 4th one, with each sender having 3 flows (1 stream, 1 1MB back-to-back RPCs and 1 10KB back-to-back RPCs). The following table shows the 99% and 99.9% latencies for both Cubic and dctcp Cubic 99% Cubic 99.9% dctcp 99%dctcp 99.9% 1MB RPCs3.5ms 6.0ms 43ms 208ms 10KB RPCs1.0ms 2.5ms 53ms 212ms On 4.11, pcap traces indicate that in some instances the 1st packet of the RPC is received but no ACK is sent before the packet is retransmitted. On 4.11 netstat shows TCP timeouts, with some of them spurious. On 4.16, we don't see retransmits in netstat but the high tail latencies are still there. Forcing cwnd to be at least 2 in tcp_cwnd_reduction fixes the problem with the high tail latencies. The latencies now look like this: dctcp 99% dctcp 99.9% 1MB RPCs 3.8ms 4.4ms 10KB RPCs 168us 211us Another group working with dctcp saw the same issue with production traffic and it was solved with this patch. The only issue is if it is safe to always use 2 or if it is better to use min(2, snd_ssthresh) (which could still trigger the problem). Signed-off-by: Lawrence Brakmo --- net/ipv4/tcp_input.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 76ca88f63b70..a9255c424761 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2477,7 +2477,7 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int flag) } /* Force a fast retransmit upon entering fast recovery */ sndcnt = max(sndcnt, (tp->prr_out ? 0 : 1)); - tp->snd_cwnd = tcp_packets_in_flight(tp) + sndcnt; + tp->snd_cwnd = max(tcp_packets_in_flight(tp) + sndcnt, 2); } static inline void tcp_end_cwnd_reduction(struct sock *sk) -- 2.17.1