Re: haproxy does not correctly handle MSS on Freebsd
Thank you, Lukas. I will investigate it a bit more.

Simon 20160821
Re: haproxy does not correctly handle MSS on Freebsd
Hi Simon,

> The packets' segment size should be 16344, the advertised value.

Wrong. The negotiated value is a maximum (the M in MSS means maximum), not a guaranteed value. There is nothing wrong with TCP segments below the MSS. Whether the stack segments at MSS size depends on a lot of things and changes from use case to use case, application to application. Haproxy is highly optimized for efficiency; that is why you can reach 40 Gbps and more with haproxy. It will behave differently than other applications, yes, but that doesn't mean its behavior is irregular.

> I saw other applications work as expected.

Of course - each application behaves differently. Other applications may buffer data aggressively, shoving more data into the egress socket at once, so the kernel can segment at a larger TCP segment size. That doesn't mean the behavior of haproxy is incorrect.

>>> 3. The MSS option is invalid on FreeBSD.
>> Again, can you elaborate? What does "invalid" mean?
> I have tested it with MSS 1200 and found haproxy's advertised value has not changed. The value is equal to the client's advertised value, e.g. 1460.

Haproxy doesn't advertise anything; your OS does. If this is true, then it would be a FreeBSD bug, not a haproxy one. I do have my doubts about this, though. Do you have a tcpdump to show this behavior?

Again, what you believe is a misbehavior very likely isn't; you would be better off troubleshooting your actual issue with us.

Lukas
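The point above - that the negotiated MSS is only an upper bound and that segments below it are perfectly legal - can be shown with a minimal loopback sketch. This is an illustration, not anything from the thread: it assumes a Linux-style socket API where `TCP_MAXSEG` is readable on a connected socket (the exact value differs per platform, e.g. 16344 on FreeBSD loopback).

```python
import socket

# Loopback server/client pair; the kernel negotiates the MSS during the
# handshake, the application never segments anything itself.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
conn, _ = srv.accept()

# The negotiated maximum segment size (platform-dependent, large on loopback).
mss = cli.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)

# Writing 48 bytes yields a 48-byte segment -- far below the MSS and
# perfectly valid: the application decides how much data it hands over,
# the MSS only caps how large a single segment may get.
cli.sendall(b"x" * 48)
data = conn.recv(4096)

print(mss, len(data))
conn.close(); cli.close(); srv.close()
```

Whether an application's segments come out near the MSS therefore depends on its write pattern, not on TCP correctness.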
Re: haproxy does not correctly handle MSS on Freebsd
Hi Lukas,

> Hi Simon,
> On 19.08.2016 at 12:41, k simon wrote:
>> Hi, list:
>> Haproxy's throughput is much lower than nginx's or squid's on FreeBSD, and its CPU usage is often high. When I investigated it a bit more, I found haproxy does not correctly handle MSS on FreeBSD.
> Your kernel decides the segment size of a TCP packet. A TCP application such as haproxy can give hints, like limiting the MSS further, but it definitely does not segment TCP payload. I think your investigation goes in the wrong direction ...
>> 1. When haproxy binds to a physical interface and net.inet.tcp.mssdflt is changed to a large value, haproxy will use this value as the effective size of outgoing segments and ignore the advertised value.
> Do you have a tcpdump to show that? If your TCP segments are larger than the negotiated MSS, then it's a FreeBSD kernel bug, not a haproxy one. It's not the application's job to segment TCP packets.
>> 2. When haproxy binds to a loopback interface, the advertised value is 16344, which is correct, but haproxy sends irregular segment sizes.
> What's irregular? In your loopback tcpdump capture, I don't see any packets with a segment size larger than 16344, so no irregularity there.

The packets' segment size should be 16344, the advertised value. I saw other applications work as expected.

>> 3. The MSS option is invalid on FreeBSD.
> Again, can you elaborate? What does "invalid" mean?

I have tested it with MSS 1200 and found haproxy's advertised value has not changed. The value is equal to the client's advertised value, e.g. 1460.

>> When path_mtu_discovery=1, it worked as expected.
> Haproxy is not aware of this parameter. Your kernel is. Is your CPU usage problem gone with this setting, or do you just not see any "MSS irregularities" anymore? Please do elaborate what *exactly* you think is wrong with haproxy's behavior, because just saying "invalid/irregular MSS behavior" without specifying what exactly you mean isn't helpful.
>
> Lukas
Re: haproxy does not correctly handle MSS on Freebsd
Hi Simon,

On 19.08.2016 at 12:41, k simon wrote:
> Hi, list:
> Haproxy's throughput is much lower than nginx's or squid's on FreeBSD, and its CPU usage is often high. When I investigated it a bit more, I found haproxy does not correctly handle MSS on FreeBSD.

Your kernel decides the segment size of a TCP packet. A TCP application such as haproxy can give hints, like limiting the MSS further, but it definitely does not segment TCP payload. I think your investigation goes in the wrong direction ...

> 1. When haproxy binds to a physical interface and net.inet.tcp.mssdflt is changed to a large value, haproxy will use this value as the effective size of outgoing segments and ignore the advertised value.

Do you have a tcpdump to show that? If your TCP segments are larger than the negotiated MSS, then it's a FreeBSD kernel bug, not a haproxy one. It's not the application's job to segment TCP packets.

> 2. When haproxy binds to a loopback interface, the advertised value is 16344, which is correct, but haproxy sends irregular segment sizes.

What's irregular? In your loopback tcpdump capture, I don't see any packets with a segment size larger than 16344, so no irregularity there.

> 3. The MSS option is invalid on FreeBSD.

Again, can you elaborate? What does "invalid" mean?

> When path_mtu_discovery=1, it worked as expected.

Haproxy is not aware of this parameter. Your kernel is. Is your CPU usage problem gone with this setting, or do you just not see any "MSS irregularities" anymore? Please do elaborate what *exactly* you think is wrong with haproxy's behavior, because just saying "invalid/irregular MSS behavior" without specifying what exactly you mean isn't helpful.

Lukas
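The "hints" mentioned above boil down to a socket option: an application can clamp the MSS with `TCP_MAXSEG` on the listening socket before the handshake (this is what haproxy's `mss` bind parameter maps to); the kernel then does all advertising and segmenting. A minimal sketch, assuming Linux-style `TCP_MAXSEG` semantics (inheritance by accepted sockets; the exact effective value is platform-dependent):

```python
import socket

# Clamp the MSS to 1200 on the listener *before* any handshake happens.
# The kernel, not the application, applies this during SYN/SYN-ACK.
lst = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lst.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, 1200)
lst.bind(("127.0.0.1", 0))
lst.listen(1)

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(lst.getsockname())
conn, _ = lst.accept()

# The accepted connection's effective MSS cannot exceed the hint
# (it may be slightly lower once TCP options are accounted for).
eff = conn.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
print(eff)
conn.close(); cli.close(); lst.close()
```

If a sysctl such as net.inet.tcp.mssdflt or path_mtu_discovery changes the observed segment sizes, that is the kernel reacting, which is consistent with the point that the application is not involved in segmentation.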
Re: haproxy does not correctly handle MSS on Freebsd
Hi, list:

Haproxy's throughput is much lower than nginx's or squid's on FreeBSD, and its CPU usage is often high. When I investigated it a bit more, I found haproxy does not correctly handle MSS on FreeBSD.

1. When haproxy binds to a physical interface and net.inet.tcp.mssdflt is changed to a large value, haproxy will use this value as the effective size of outgoing segments and ignore the advertised value. When path_mtu_discovery=1, it worked as expected.

2. When haproxy binds to a loopback interface, the advertised value is 16344, which is correct, but haproxy sends irregular segment sizes. Whenever path_mtu_discovery was set to 0 or 1, it behaved strangely.

3. The MSS option is invalid on FreeBSD.

I'm running the haproxy instance inside a vimage jail, and it should act the same as running on a bare box. It's a really serious problem and easy to reproduce.

Regards
Simon

P.S.
1. FreeBSD ha-l0-j2 10.3-STABLE FreeBSD 10.3-STABLE #0 r303988: Fri Aug 12 16:48:21 CST 2016 root@cache-farm-n2:/usr/obj/usr/src/sys/10-stable-r303988 amd64
2.
HA-Proxy version 1.6.8 2016/08/14
Copyright 2000-2016 Willy Tarreau

Build options :
  TARGET  = freebsd
  CPU     = generic
  CC      = clang37
  CFLAGS  = -O2 -pipe -fno-omit-frame-pointer -march=corei7 -fno-strict-aliasing -DFREEBSD_PORTS
  OPTIONS = USE_TPROXY=1 USE_GETADDRINFO=1 USE_ZLIB=1 USE_CPU_AFFINITY=1 USE_REGPARM=1 USE_OPENSSL=1 USE_STATIC_PCRE=1 USE_PCRE_JIT=1

Default settings :
  maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Encrypted password support via crypt(3): yes
Built with zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with OpenSSL version : OpenSSL 1.0.2h 3 May 2016
Running on OpenSSL version : OpenSSL 1.0.2h 3 May 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports prefer-server-ciphers : yes
Built with PCRE version : 8.39 2016-06-14
PCRE library supports JIT : yes
Built without Lua support
Built with transparent proxy support using: IP_BINDANY IPV6_BINDANY

Available polling systems :
  kqueue : pref=300, test result OK
    poll : pref=200, test result OK
  select : pref=150, test result OK
Total: 3 (3 usable), will use kqueue.

3.
frontend tcp-in
    mode tcp
    bind :1301

frontend virtual-frontend
    mode http
    bind 127.0.0.1:1000 accept-proxy

4.
17:41:36.515924 IP 127.0.0.1.12558 > 127.0.0.1.1000: Flags [S], seq 1769628266, win 65535, options [mss 16344], length 0
17:41:36.515954 IP 127.0.0.1.1000 > 127.0.0.1.12558: Flags [S.], seq 360367860, ack 1769628267, win 65535, options [mss 16344], length 0
17:41:36.515957 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 773322:777418, ack 211, win 65535, length 4096
17:41:36.515985 IP 127.0.0.1.12558 > 127.0.0.1.1000: Flags [.], ack 1, win 65535, length 0
17:41:36.515994 IP 127.0.0.1.12558 > 127.0.0.1.1000: Flags [P.], seq 1:49, ack 1, win 65535, length 48
17:41:36.516001 IP 127.0.0.1.12558 > 127.0.0.1.1000: Flags [P.], seq 49:914, ack 1, win 65535, length 865
17:41:36.516085 IP 127.0.0.1.1000 > 127.0.0.1.12558: Flags [.], ack 914, win 65535, length 0
17:41:36.516095 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 777418:778878, ack 211, win 65535, length 1460
17:41:36.516203 IP 127.0.0.1.12522 > 127.0.0.1.1000: Flags [.], ack 778878, win 65535, length 0
17:41:36.516403 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 778878:784978, ack 211, win 65535, length 6100
17:41:36.516424 IP 127.0.0.1.12556 > 127.0.0.1.1000: Flags [F.], seq 477, ack 274, win 65535, length 0
17:41:36.516435 IP 127.0.0.1.1000 > 127.0.0.1.12556: Flags [.], ack 478, win 65535, length 0
17:41:36.516466 IP 127.0.0.1.1000 > 127.0.0.1.12556: Flags [F.], seq 274, ack 478, win 65535, length 0
17:41:36.516487 IP 127.0.0.1.12556 > 127.0.0.1.1000: Flags [.], ack 275, win 65534, length 0
17:41:36.516515 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 784978:789074, ack 211, win 65535, length 4096
17:41:36.516532 IP 127.0.0.1.12522 > 127.0.0.1.1000: Flags [.], ack 789074, win 65535, length 0
17:41:36.516922 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 789074:790534, ack 211, win 65535, length 1460
17:41:36.516960 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 790534:793170, ack 211, win 65535, length 2636
17:41:36.516971 IP 127.0.0.1.12522 > 127.0.0.1.1000: Flags [.], ack 793170, win 65535, length 0
17:41:36.517270 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 793170:796942, ack 211, win 65535, length 3772
17:41:36.517351 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 796942:798402, ack 211, win 65535, length 1460
17:41:36.517368 IP 127.0.0.1.12522 > 127.0.0.1.1000: Flags [.], ack 798402, win 65535, length 0
17:41:36.517529 IP 127.0.0.1.1000 > 127.0.0.1.12405: Flags [P.], seq 482640:483712, ack 401, win 65535, length 1072
17:41:36.517536 IP 127.0.0.1.12405 > 127.0.0.1.1000: Flags
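One way to check Lukas's observation against the capture above is simply to extract the `length` fields from the tcpdump text and compare them to the advertised loopback MSS of 16344. This sketch retypes the data-carrying lines from the excerpt (the full capture would be fed in the same way):

```python
import re

# Data-carrying packets retyped from the tcpdump excerpt above
# (loopback capture, advertised MSS 16344).
excerpt = """\
17:41:36.515957 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 773322:777418, ack 211, win 65535, length 4096
17:41:36.516095 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 777418:778878, ack 211, win 65535, length 1460
17:41:36.516403 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 778878:784978, ack 211, win 65535, length 6100
17:41:36.516922 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 789074:790534, ack 211, win 65535, length 1460
17:41:36.516960 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 790534:793170, ack 211, win 65535, length 2636
17:41:36.517270 IP 127.0.0.1.1000 > 127.0.0.1.12522: Flags [P.], seq 793170:796942, ack 211, win 65535, length 3772
"""

MSS = 16344
lengths = [int(n) for n in re.findall(r"length (\d+)", excerpt)]
print(max(lengths))  # -> 6100, well below the 16344 MSS
```

No segment exceeds the advertised maximum, so the capture shows segments *below* the MSS (normal TCP), not MSS violations.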