Hi,

Since we don't really know how to track this one, we thought it might be better to reach out here for feedback.
We're using haproxy to deliver streaming files under pressure (80-90 Gbps per machine). When using h1/http, splice-response is a great help to keep the load under control. We're on the 2.9 branch at the moment. However, we've hit a bug with splice-response (GitHub issue created), so we had to run our haproxies without splicing all day.

When we reach a certain load, a "connection refused" alarm starts buzzing like crazy (2-3 times every 30 minutes). The alarm is simply a connect to localhost with a 500ms timeout:

    socat /dev/null tcp4-connect:127.0.0.1:80,connect-timeout=0.5

The log file indicates the port is effectively closed:

    2024/03/27 01:06:04 socat[984480] E read(6, 0xe98000, 8192): Connection refused

The thing is, the haproxy process is very much alive... so we just restart it every time this happens.

What data do you suggest we collect to help track this down? We're not sure whether the stats socket is available, but we can definitely try to get some information from it.

We're not running out of fds, or even of connections with or without backlog (we have a global maxconn of 900000 with ~30,000 streaming sessions active, and tcp_max_syn_backlog is set to 262144); we checked. But it does seem to correlate with heavy traffic.

Thanks!

--
Felipe Damasio
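
P.S. In case it helps: if the stats socket does turn out to be usable, this is roughly what we'd try to capture the next time the alarm fires, plus the kernel listen-queue counters we've been watching. The socket path below is just a guess (it depends on our "stats socket" line), and we're happy to take suggestions on which commands matter most:

    # HAProxy runtime API (socket path is an assumption)
    echo "show info"     | socat stdio unix-connect:/var/run/haproxy.sock
    echo "show activity" | socat stdio unix-connect:/var/run/haproxy.sock
    echo "show fd"       | socat stdio unix-connect:/var/run/haproxy.sock
    echo "show sess"     | socat stdio unix-connect:/var/run/haproxy.sock

    # Kernel side: accept-queue state of the :80 listener and listen drop/overflow counters
    ss -ltn 'sport = :80'
    nstat -az TcpExtListenOverflows TcpExtListenDrops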