Hello everyone,
I'm currently troubleshooting intermittent 520 errors returned by
Cloudflare in front of an HAProxy-based load balancing infrastructure.
These errors are rare—affecting about 0.001% of requests—which makes them
especially tricky to diagnose. According to Cloudflare, a 520 indicates the
origin server (HAProxy, in this case) unexpectedly closed the connection or
returned an empty response.
Key Observations
- it happens on several frontends and on different haproxy instances.
- The issue only occurs when the client request protocol is HTTP/2 (http/3
is disabled on our side at the moment).
- Disabling Cloudflare's "HTTP/2 to origin" option completely eliminates
the issue.
This strongly suggests the problem lies in the interaction between
Cloudflare's HTTP/2 implementation and HAProxy's HTTP/2 support.
Cloudflare Logs for 520 Events
"OriginSSLProtocol": "unknown",
"OriginTCPHandshakeDurationMs": 11,
"OriginTLSHandshakeDurationMs": 12,
"OriginRequestHeaderSendDurationMs": 0,
"OriginResponseDurationMs": 0,
"ClientSSLCipher": "AEAD-AES128-GCM-SHA256",
"ClientSSLProtocol": "TLSv1.3"
To me, this implies:
- Cloudflare successfully opens a TCP connection to HAProxy
(OriginTCPHandshakeDurationMs is non-zero),
- But fails during the TLS handshake, as evidenced by "OriginSSLProtocol":
"unknown".
HAProxy version 2.8.15 (issue also present on 2.8.11)
Relevant global settings:
ssl-default-bind-ciphersuites
TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-client-sigalgs
rsa_pkcs1_sha256:rsa_pkcs1_sha384:rsa_pkcs1_sha512:ecdsa_secp256r1_sha256:ecdsa_secp384r1_sha384:ecdsa_secp521r1_sha512:rsa_pss_rsae_sha256:rsa_pss_rsae_sha384:rsa_pss_rsae_sha512:ed25519:ed448:rsa_pss_pss_sha256:rsa_pss_pss_sha384:rsa_pss_pss_sha512
option tcp-smart-accept
option tcp-smart-connect
Frontend configuration (abridged):
mode http
bind xxxxx:443 ssl crt-list /folder/redacted.txt ssl-min-ver TLSv1.2
tls-ticket-keys /var/run/tls_ticket_keys
I've tried enabling debug-style logging using:
option log-separate-errors
option logasap
error-log-format '%ci:%cp -> %fi:%fp %b/%s | Tq=%Tq Tw=%Tw Tc=%Tc
Tr=%Tr Ta=%Ta Tt=%Tt TR=%TR HTTP=%ST (%hrl) B=%B | TLS=%sslv/%sslc
SNI=%[ssl_fc_sni] %HM %HP'
Unfortunately, these logs don't show any detail about the failed TLS
handshakes.
Questions
- have anyone experienced this issue ? If you have a small amount of 520 it
may not been printed on the interface, but it is there. Just go the
following URL :
https://dash.cloudflare.com/[accountID]/mydomain.com/analytics/traffic?status-code=520
- Are there any known issues with HAProxy and HTTP/2 in TLS termination
scenarios ?
- Is there any way to increase logging detail specifically for failed TLS
handshakes? tcpdump is a no go in this case due to the heavy trafic and the
minimal occurrence of the issue.
Because the issue is so rare (under 100 requests/day), every test takes
time to yield results. Any insights or ideas would be greatly appreciated.
Thanks in advance!
Olivier