Hey, I noticed, that I am experiencing a strange issue with https requests (both on http/1.1 and http/2):
It seems like around 1 of 500 / 1 of 1000 requests gets delayed by around 60 to 90 Seconds between the Client and HAProxy. All other requests work fine and are blazingly fast. What the client application logs: 17:59:20.488006 Starting REST request 17:59:40.492774 REST request: PUT https://mydomain.com:443/super_fancy_url/_doc/1313409683%3A58505246%3A2023-05-30T15%3A59%3A20Z?refresh=false returned 0 and took 19999.01ms(name_lookup_time: 0.09ms, connect_time: 0.09ms) This error happens, because the client does not receive a http response within the 20 seconds configured timeout. When looking through the HAProxy logs, I find a log line for this request, but for whatever reason, the time logged there does not match the request time: May 30 18:00:40 haproxy[458192]: 10.152.40.11:42054 [30/May/2023:18:00:40.964] HTTP_MYDOMAIN~ HTTP_MYDOMAIN/super-server.mydomain.com 0/0/0/9/9 201 494 - - ---- 32852/116/0/0/0 0/0 "PUT /super_fancy_url/_doc/1313409683%3A58505246%3A2023-05-30T15%3A59%3A20Z?refresh=false HTTP/1.1" I double-checked the system clocks on the client and the HAProxy node and can ensure, that they are in sync. As the traffic is SSL encrypted, I don’t think, I can do a useful tcp-dump here. Interesting is, that the application claims to be able to connect on the TCP-Level to HAProxy in less than a millisecond. (That might be, as we have sub-millisecond latency in our network), but it seems to really be the HTTP-Request that is delayed. But HAProxy reports, that the connection to / response from the backend server took only 9ms. To me, that does not explain an exceeded timeout of >20 seconds. When looking at examples (same source and destination), where the request was not delayed / timing out, then the times logged by HAproxy were correct too. The hopefully important bits from my config: global # Send logs to syslog log 10.12.244.22:1514 local0 notice log 10.12.244.22:1514 local1 info maxconn 100000 ulimit-n 655360 nbthread 64 external-check spread-checks 10 insecure-fork-wanted master-worker # Hard stop old workers after some time to prevent https://github.com/haproxy/haproxy/issues/1701 hard-stop-after 576h # 24 days stats socket /var/run/haproxy/admin.sock mode 666 expose-fd listeners level admin stats socket /var/run/haproxy/haproxy.sock expose-fd listeners mode 666 stats socket /var/run/haproxy.sock expose-fd listeners mode 666 level admin stats timeout 30s user haproxy group haproxy daemon # Generated by https://ssl-config.mozilla.org/#server=haproxy&server-version=1.8.8&config=intermediate ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384 ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets ssl-default-server-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384 ssl-default-server-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets tune.ssl.default-dh-param 2048 tune.bufsize 524288 defaults log global option dontlognull option redispatch option log-health-checks timeout connect 1s timeout client 600s timeout server 600s timeout client-fin 5s timeout server-fin 5s timeout check 250 errorfile 400 /etc/haproxy/errors/400.http errorfile 403 /etc/haproxy/errors/403.http errorfile 408 /etc/haproxy/errors/408.http errorfile 500 /etc/haproxy/errors/500.http errorfile 502 /etc/haproxy/errors/502.http errorfile 503 /etc/haproxy/errors/503.http errorfile 504 /etc/haproxy/errors/504.http default-server maxconn 100000 option allbackups listen HTTP_MYDOMAIN mode http bind 10.200.4.198:80 bind 10.200.4.198:443 ssl crt /etc/ssl/haproxy/ option httpchk GET /_cluster/health?local=true option httplog option forwardfor header X-Real-Ip http-request replace-value Upgrade (.*) websocket # https://bishopfox.com/blog/h2c-smuggling-request http-request del-header X-Forwarded-For http-request set-header X-Forwarded-Port 443 if { ssl_fc } http-request set-header X-Forwarded-Port 80 if !{ ssl_fc } http-request del-header X-Forwarded-Proto http-request set-header X-Forwarded-Proto https if { ssl_fc } http-request set-header X-Forwarded-Proto http if !{ ssl_fc } timeout check 1500ms default-server inter 2s http-response set-header LB-FQDN "haproxy-1.mydomain.com" http-response set-header LB-Backend-Server %s # https://serverfault.com/questions/650588/haproxy-timing-connection-diagram http-response set-header LB-Times "connect: %Th ms; queue: %Tw ms; be-connect: %Tc ms; be-response: %Tr ms" http-request redirect scheme https code 301 unless { ssl_fc } server super-server.mydomain.com 10.16.32.134:9200 check server super-server-2.mydomain.com 10.16.32.135:9200 check server super-server-3.mydomain.com 10.16.32.136:9200 check and my haproxy version: HAProxy version 2.6.13-1ppa1~focal 2023/05/06 - https://haproxy.org/ Status: long-term supported branch - will stop receiving fixes around Q2 2027. Known bugs: http://www.haproxy.org/bugs/bugs-2.6.13.html Running on: Linux 5.4.0-94-generic #106-Ubuntu SMP Thu Jan 6 23:58:14 UTC 2022 x86_64 Build options : TARGET = linux-glibc CPU = generic CC = cc CFLAGS = -O2 -g -O2 -fdebug-prefix-map=/build/haproxy-miXGSs/haproxy-2.6.13=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_OPENSSL=1 USE_LUA=1 USE_SLZ=1 USE_SYSTEMD=1 USE_PROMEX=1 DEBUG = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY +LUA -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL +PROMEX -QUIC +RT +SLZ -STATIC_PCRE -STATIC_PCRE2 +SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB Default settings : bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with multi-threading support (MAX_THREADS=64, default=64). Built with OpenSSL version : OpenSSL 1.1.1f 31 Mar 2020 Running on OpenSSL version : OpenSSL 1.1.1f 31 Mar 2020 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3 Built with Lua version : Lua 5.3.3 Built with the Prometheus exporter as a service Built with network namespace support. Support for malloc_trim() is enabled. Built with libslz for stateless compression. Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND Built with PCRE2 version : 10.34 2019-11-21 PCRE2 library supports JIT : yes Encrypted password support via crypt(3): yes Built with gcc compiler version 9.4.0 Available polling systems : epoll : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use epoll. Available multiplexer protocols : (protocols marked as <default> cannot be specified using 'proto' keyword) h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG <default> : mode=HTTP side=FE|BE mux=H1 flags=HTX h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG <default> : mode=TCP side=FE|BE mux=PASS flags= none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG Available services : prometheus-exporter Available filters : [CACHE] cache [COMP] compression [FCGI] fcgi-app [SPOE] spoe [TRACE] trace I don’t see any errors being logged into systemd. Also no failed health checks (where I would understand broken connections). What I already tried / where I initially come from: Initially, the listener looked like this: “ssl alpn h2,http/1.1 crt /etc/ssl/haproxy/” and I was running HAProxy version “2.6.8-2ppa1~focal”. There I first noticed this behavior. In order to debug, I then disabled http/2 and upgraded to a newer HAProxy version. But this sadly did not have any impact. Ah an in case you are wondering: I am running HAProxy on 4 Bare-Metal servers (that all show this behavior). They all run Ubuntu 20 and openssl version 1.1.1f. I ran “show sess” in an endless loop and it looks like, I only see sessions when a timeout happens. (Might be, because the connections are closed after every request): 0x7f8b38499130: proto=tcpv4 src=10.136.119.140:56940 fe=HTTP_MYDOMAIN be=HTTP_MYDOMAIN srv=super-server.mydomain.com ts=00 epoch=0x177 age=4s calls=1 rate=0 cpu=0 lat=0 rq[f=49c40080h,i=0,an=8000h,rx=,wx=,ax=] rp[f=80400000h,i=0,an=1400000h,rx=9m59s,wx=,ax=] scf=[8,80h,fd=3824] scb=[8,11h,fd=4361] exp=9m59s rc=0 c_exp= But I never see any session being listed for more than 1-2 seconds, which confuses me, as I would expect the Any ideas? Do you need any dumps from my side for more/better insights? Cheers Marno