Hi Luke,

On Mon, Feb 16, 2026 at 10:56:32PM +0700, Luke Seelenbinder wrote:
> Hi List,
> 
> After upgrading from 3.2.11 to 3.2.12, we're seeing TCP connections from
> HAProxy's DNS resolver to nameservers accumulate in CLOSE_WAIT state
> indefinitely. The nameservers send their FIN, but HAProxy never completes
> the teardown on its side. The leak is continuous -- we observed growth from
> ~1K to ~5K total connections in about 4 hours.
> 
> The issue appeared immediately after upgrading with no configuration
> changes. It does not occur on 3.2.11.
> 
> We use TCP resolvers (tcp6@ prefix) with SRV record resolution via
> server-template. The relevant config:
> 
>   resolvers default
>     nameserver g1       tcp6@[2001:4860:4860::8888]:53 source [<ipv6>]
>     nameserver g2       tcp6@[2001:4860:4860::8844]:53 source [<ipv6>]
>     nameserver opendns  tcp6@[2620:0:ccc::2]:53        source [<ipv6>]
>     accepted_payload_size 8192
>     resolve_retries 4
>     hold valid      60s
>     hold obsolete   30s
>     hold timeout    300s
>     timeout resolve 20s
>     timeout retry   1s
> 
>   defaults
>     default-server resolvers default inter 5s [...]
>     option abortonclose
> 
>   backend example
>     server-template myserver 2 ipv6@_svc._tcp.example.com [...]
> 
> We have several backends using this pattern with SRV records.
> 
> The CLOSE_WAIT connections are exclusively to port 53 on the configured
> nameservers:
> 
>   $ ss -tn state close-wait | awk '{print $5}' | \
>       sed 's/:[0-9]*$//' | sort | uniq -c | sort -rn
>      1100 [2620:0:ccc::2]
>      1097 [2001:4860:4860::8844]
>      1092 [2001:4860:4860::8888]
> 
>   $ ss -tn state close-wait -o | head -5
>   Recv-Q Send-Q  Local Address:Port              Peer Address:Port
>   0      0       [2600:3c03::xxxx]:49570  [2001:4860:4860::8844]:53
>   0      0       [2600:3c03::xxxx]:40414  [2001:4860:4860::8888]:53
>   0      0       [2600:3c03::xxxx]:38996  [2001:4860:4860::8888]:53
>   0      0       [2600:3c03::xxxx]:53882         [2620:0:ccc::2]:53
> 
> Meanwhile actual client and backend connections are healthy without leakage.
> 
> Two commits in 3.2.12 seem like potential candidates since DNS resolver
> TCP connections go through the raw socket path:
> 
>   - rawsock: introduce CO_RFL_TRY_HARDER to detect closures on complete
>     reads (Willy)
>   - ssl: don't always process pending handshakes on closed connections
>     (Willy)
> 
> Note that "option abortonclose" is enabled in our defaults, which the
> second commit explicitly interacts with.
> 
> We've confirmed reverting to 3.2.11 also resolves it,
> though we'd prefer to stay on 3.2.12 for the QUIC CVE fixes.
> 
> Happy to provide haproxy -vv output, full config, or any additional
> debugging if helpful.

Thanks for all the details! In parallel we've received a private report
of peers losing synchronization after some time, and I'm starting to
wonder whether the common point between the two is that both the DNS
resolvers and the peers are implemented as applets, given that we had
two important applet-related fixes between the two versions:

    - BUG/MAJOR: applet: Don't call I/O handler if the applet was shut
    - BUG/MEDIUM: applet: Fix test on shut flags for legacy applets

Since it seems to happen quite fast in your case, would you be willing
to try to bisect, or just revert one or a few patches, to help us spot
the culprit? This would be super helpful, and we could issue another
release relatively quickly once the problem is understood.
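
In case it helps, here is a rough sketch of how each bisect step could be
judged from the same "ss" output you already posted. The helper name
count_dns_close_wait is made up for illustration, and the tag names and
waiting time in the comments are assumptions to adapt to your setup, not
a tested procedure:

```shell
# Hypothetical helper: reads `ss -Htn state close-wait` output on stdin and
# prints how many sockets have a peer port of 53. With a state filter, ss
# omits the State column, so the peer address is the 4th field.
count_dns_close_wait() {
    awk '$4 ~ /:53$/ { n++ } END { print n + 0 }'
}

# Possible use during a bisect (assumes the stable tree carries the tags
# v3.2.11 and v3.2.12, and that the leak shows up within a few minutes):
#
#   git bisect start v3.2.12 v3.2.11      # bad first, then good
#   ... build and run the resulting binary, wait a few minutes ...
#   ss -Htn state close-wait | count_dns_close_wait
#   git bisect bad     # if the count keeps growing
#   git bisect good    # if it stays near zero
```

A count that keeps climbing between two runs a few minutes apart would
mark the step as bad; a count that stays flat would mark it as good.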

Thanks,
Willy

