Hi,

Two days ago, I upgraded my first production system from HAProxy 1.8.19 to 2.2.9. Since then, many HTTP requests have been hitting the server timeout.

Before upgrade:

    root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.5.gz | wc -l
    0
    root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.4.gz | wc -l
    0
    root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.3.gz | wc -l
    0

After upgrade:

    # Day of upgrade
    root@lb0-0:~# zgrep 'sD--' /var/log/haproxy.log.2.gz | wc -l
    3798
    # Yesterday
    root@lb0-0:~# grep 'sD--' /var/log/haproxy.log.1 | wc -l
    127176
    # Today, so far
    root@lb0-0:~# grep 'sD--' /var/log/haproxy.log | wc -l
    85063
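The same per-file counts could be collected in one pass. The following is only a sketch: the function name `count_sd` is made up for illustration, and it assumes `zgrep` is available, which reads both gzip-compressed and plain files.

```shell
# Hypothetical helper: print "<file> <count>" for every log file given.
# zgrep handles compressed and uncompressed files alike, so rotated logs
# can be mixed in a single call.
count_sd() {
    for f in "$@"; do
        printf '%s %s\n' "$f" "$(zgrep -c 'sD--' "$f")"
    done
}
```

Calling it as `count_sd /var/log/haproxy.log*` would cover the current log and all rotated ones.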

For the request below, Ta ("total active time for the HTTP request") is 3, and Tt ("total TCP session duration time, between the moment the proxy accepted it and the moment both ends were closed") is 300004 (5 minutes, the server timeout):

    Aug 3 00:31:05 lb0-0 haproxy[16884]: $ip:62223 [03/Aug/2022:00:26:05.337] fr_other~ bk_http.lyr_http-lyr02.cf.ha.cyberfusion.cloud/http-lyr02.cf.ha.cyberfusion.cloud 0/0/0/3/300004 200 27992 - - sD-- 616/602/226/226/0 0/0 "GET https://$domain/wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/2.0"
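To check whether other affected requests show the same pattern, the timer field can be split with awk. This is a rough sketch: `show_sd_timers` is a hypothetical name, and the field positions assume the exact layout of the sample line above (timers in field 10, termination state in field 15). A different log-format would need other indexes.

```shell
# Hypothetical helper: for requests that ended in state sD--, print the fourth
# and fifth values of the timer field (3 and 300004 in the sample line),
# followed by the backend/server pair from field 9.
show_sd_timers() {
    awk '$15 == "sD--" { split($10, t, "/"); print t[4], t[5], $9 }' "$@"
}
```

It reads the files given as arguments, or standard input when called without any, e.g. `zcat /var/log/haproxy.log.2.gz | show_sd_timers`.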

The backend server indeed served the request within Ta:

    $domain $ip - - [03/Aug/2022:00:26:05 +0200] "GET /wp-content/uploads/2022/07/20220712_155022-300x300.jpg HTTP/1.1" 200 28008 "https://$domain/stoffen/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"

The timeouts only occur with 5 out of 13 backends. There is no clear pattern: the timeouts don't come in bursts, and they aren't tied to specific clients.
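A per-backend breakdown like this can be produced with awk. The sketch below uses a made-up function name, `count_sd_per_backend`, and again assumes the field layout of the sample log line (backend/server in field 9, termination state in field 15).

```shell
# Hypothetical helper: count requests that ended in state sD-- per backend,
# highest count first. The backend name is the part of field 9 before "/".
count_sd_per_backend() {
    awk '$15 == "sD--" { split($9, b, "/"); n[b[1]]++ }
         END { for (name in n) print n[name], name }' "$@" | sort -rn
}
```

For example, `count_sd_per_backend /var/log/haproxy.log.1` would print one line per affected backend.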

Does anyone know why HAProxy keeps the TCP session open and only finishes responding to the client once the server timeout is reached, even though the backend server answered the HTTP request within milliseconds?

--
With kind regards,

William Edwards

