Hi,

HAProxy 2.7.4 was released on 2023/03/10. It added 110 new commits
after version 2.7.3.

The vast majority of the commits are HTTP3/QUIC updates. However, as
indicated in the 2.8-dev5 announcement, this version fixes a concurrency
bug introduced in 2.5 that may cause freezes and crashes when some HTTP/1
backend connections are closed by the server at exactly the same time
they're about to be reused by another thread. Another bug affecting idle
connections since 2.2 was also fixed; it could cause an occasional crash.
One possible work-around if you've faced such issues recently is to
disable inter-thread connection reuse with this directive in the global
section:

    tune.idle-pool.shared off

But beware that this may increase the total number of connections kept
established with your backend servers, depending on the reuse frequency
and the number of threads.

In master-worker mode, when performing an upgrade from an old version
(before 1.9) to a newer version (>=2.5), the HAPROXY_PROCESSES environment
variable was missing, and this, combined with a missing element in an
internal structure representing old processes, resulted in a null
dereference that crashed the master process after the reload. It's very
unlikely to hit this one, except during migration attempts, where it can
make one think the new version doesn't work and encourage rolling back to
the older one. The reported uptime for processes was also fixed so that
wall clock time is used instead of the internal timer.

A few issues affecting the Lua mapping of the HTTP client were addressed.
One of them is a small memory leak of a few bytes per request, which
could become problematic under heavy use. Another one is a concurrency
issue with Lua's garbage collector, which didn't sufficiently lock other
threads' items while trying to free them.

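To make the placement of the idle-connection work-around concrete, here
is a minimal sketch of a global section carrying it. Only the
tune.idle-pool.shared line comes from this announcement; the nbthread
value is purely an illustrative assumption, shown because the cost of the
work-around depends on the number of threads.

```
global
    # Assumed thread count, for illustration only (not a recommendation).
    nbthread 8

    # Work-around from this announcement: idle backend connections are no
    # longer shared between threads, so each thread only reuses its own.
    tune.idle-pool.shared off
```

Since the underlying bugs are fixed in 2.7.4, the line should no longer
be needed after updating, and removing it restores inter-thread reuse.
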
A bug in the watchdog in 2.7 could occasionally cause the wrong thread to
be measured, which could sometimes trigger it on highly asymmetric loads,
such as when frontends are bound to different thread sets and one
saturates the process while the other one remains fully idle.

It was found that the low-latency scheduling of TLS handshakes can
degenerate during extreme loads and take a long time to recover. The
problem is that in order to prevent TLS handshakes from causing high
latency spikes to the rest of the traffic, they're placed in a dedicated
scheduling class that executes one of them per polling loop. But if there
are too many pending due to a big burst, the extra latency caused to the
pending ones can make clients give up and try again, reaching the point
where none of the processed tasks yields anything useful since they were
already abandoned. Now the number of handshakes per loop will grow as the
number of pending ones grows, and this addresses the problem without
adding extra latency even under extreme loads.

There were as usual a significant number of QUIC updates, aiming at
addressing some issues reported by users and at continuing to improve
reliability and interoperability. Among the visible ones, the client-fin
timeout is now honored and allows connections to be closed faster after a
last response was sent if the client disappeared. This could help reduce
the number of apparent concurrent connections. In addition, some
improvements were made on memory usage. Some failures to connect at high
rates (such as from h2load) were finally fixed. The soft-stop is now fully
functional when "tune.quic.socket-owner" is set to "connection". The old
process will then continue to handle its connections and the new one will
have its own connections. Low-level errors are now better handled, with
some errors such as ICMP port unreachable reported by the stack causing an
immediate termination of the connection since it indicates the client has
closed (e.g. clicked stop in the browser, or Ctrl-C in curl).

The cache failed to cache a response for a request that had the "no-cache"
directive (typically a forced reload). This prevented the cache from being
refreshed this way; this is now fixed.

An infinite loop could happen on limited listeners (rate-limited or
limited by their maxconn value) due to the loss of some volatile casts
during an API cleanup in 2.7.

In some rare cases it was possible to freeze a compressing stream if there
was exactly one byte left at the end of the buffer, which was insufficient
to place a new HTX block and prevented any progress from being made. This
has been the case since 2.5 so it doesn't seem easy to trigger!

Layer7 retries did not work anymore on the "empty-response" condition due
to a change that was made in 2.4.

The dump of the supported config language keywords with -dK incorrectly
attributed some of the crt-list specific keywords to "bind ... ssl", which
could cause confusion for those designing config parsers or generators by
regularly checking for new stuff there. Now an explicit "crt-list"
sub-section is dumped and "bind ssl" only dumps keywords really supported
on "bind" lines.

The global directive "no numa-cpu-mapping", which forces haproxy to bind
to multiple CPU sockets even if this results in lower performance, was
lost across reloads in master-worker mode, because the master in wait mode
doesn't see it and thus applies the restriction to itself, and that
restriction is inherited by subsequent masters that pass it to their
workers.

And a few other minor updates aside, that's about all.

Those with high request rates or who already noticed crashes or strange
errors are strongly encouraged to update and try again. The same goes for
those heavily using QUIC, though I suspect that many of them are often on
2.8-dev. 2.7.4 catches up with 2.8-dev on the QUIC front so if you want
something more stable, that's the way to go.

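For those wondering where the settings touched by the fixes above live in
a configuration, here is a rough sketch. The directive names are the ones
mentioned in this announcement; every timeout and retry value is an
assumption chosen for illustration, not a recommendation.

```
global
    # Each QUIC connection owns its socket; soft-stop is reported as
    # fully functional in this mode as of 2.7.4.
    tune.quic.socket-owner connection

    # Let haproxy bind to multiple CPU sockets; this setting used to be
    # lost across reloads in master-worker mode.
    no numa-cpu-mapping

defaults
    mode http
    timeout connect 5s        # assumed value
    timeout client  30s       # assumed value
    timeout server  30s       # assumed value

    # Now also honored on QUIC connections (assumed value).
    timeout client-fin 5s

    # Layer7 retries on an empty response (assumed retry count).
    retries 3
    retry-on empty-response
```
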
As mentioned in the 2.8-dev5 announcement, we noticed a regression
affecting all 2.7 versions. If you connect to the CLI over a UNIX socket
and the client closes the input channel, the connection will not be
closed. Given that the number of connections on the CLI is limited to 10
by default, the CLI can quickly become unusable. We'll work on it next
week, but in the meantime it can be prudent to increase that limit a
little bit in your global section:

    stats maxconn 100   # for 2.8 <= 2.8-dev5 or 2.7 <= 2.7.4

Just keep this in mind if you're thinking about upgrading from 2.6 to 2.7:
better wait for 2.7.5 for the final deployment in this case. If you're
already on 2.7 and did not notice anything, no need to worry.

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/2.7/src/
   Git repository   : https://git.haproxy.org/git/haproxy-2.7.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy-2.7.git
   Changelog        : https://www.haproxy.org/download/2.7/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (29):
      MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
      BUG/MINOR: mux-quic: transfer FIN on empty STREAM frame
      MINOR: h3: add traces on decode_qcs callback
      MINOR: quic: adjust request reject when MUX is already freed
      BUG/MINOR: quic: also send RESET_STREAM if MUX released
      BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
      BUG/MINOR: h3: prevent hypothetical demux failure on int overflow
      MEDIUM: h3: enforce GOAWAY by resetting higher unhandled stream
      MINOR: mux-quic: define qc_shutdown()
      MINOR: mux-quic: define qc_process()
      MINOR: mux-quic: implement client-fin timeout
      MEDIUM: mux-quic: properly implement soft-stop
      MINOR: quic: mark quic-conn as jobs on socket allocation
      MEDIUM: quic: trigger fast connection closing on process stopping
      MEDIUM: quic: improve fatal error handling on send
      MINOR: quic: consider EBADF as critical on send()
      MINOR: quic: simplify return path in send functions
      MINOR: quic: implement qc_notify_send()
      MINOR: quic: purge txbuf before preparing new packets
      MEDIUM: quic: implement poller subscribe on sendto error
      MINOR: quic: notify on send ready
      BUG/MEDIUM: quic: properly handle duplicated STREAM frames
      BUG/MINOR: cli: fix CLI handler "set anon global-key" call
      BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
      BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated
      MINOR: h3: add traces on h3_init_uni_stream() error paths
      MINOR: quic: create a global list dedicated for closing QUIC conns
      MINOR: quic: handle new closing list in show quic
      MEDIUM: quic: release closing connections on stopping

Aurelien DARRAGON (3):
      BUG/MINOR: lua/httpclient: missing free in hlua_httpclient_send()
      BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy
      BUG/MEDIUM: fd: avoid infinite loops in fd_add_to_fd_list and fd_rm_from_fd_list

Christopher Faulet (13):
      BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was reached
      DOC: config: Fix description of options about HTTP connection modes
      DOC: config: Add the missing tune.fail-alloc option from global listing
      REGTESTS: Fix ssl_errors.vtc script to wait for connections close
      BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during parsing
      DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section
      BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list
      BUG/MINOR: mux-h1: Don't report an error on an early response close
      BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format body
      BUG/MINOR: http-check: Skip C-L header for empty body when it's not mandatory
      BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7 retry
      BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
      BUG/MINOR: mxu-h1: Report a parsing error on abort with pending data

Frédéric Lécaille (38):
      BUG/MINOR: quic: Possible unexpected counter incrementation on send*() errors
      MINOR: quic: Add new traces about by connection RX buffer handling
      MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication deadlock
      BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
      MINOR: quic: Simplication for qc_set_timer()
      MINOR: quic: Kill the connections on ICMP (port unreachable) packet receipt
      MINOR: quic: Add traces to qc_kill_conn()
      MINOR: quic: Make qc_dgrams_retransmit() return a status.
      BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
      MINOR: quic: Add a trace to identify connections which sent Initial packet.
      MINOR: quic: Add <pto_count> to the traces
      BUG/MINOR: quic: Do not probe with too little Initial packets
      BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
      BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
      BUG/MINOR: quic: Missing padding for short packets
      BUG/MEDIUM: quic: Missing TX buffer draining from qc_send_ppkts()
      BUILD: quic: 32-bits compilation issue with %zu in quic_rx_pkts_del()
      BUILD: thead: Fix several 32 bits compilation issues with uint64_t variables
      BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
      MINOR: quic: Add a BUG_ON_HOT() call for too small datagrams
      BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
      BUG/MINOR: quic: v2 Initial packets decryption failed
      MINOR: quic: Add traces about QUIC TLS key update
      BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
      BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting frames
      BUG/MINOR: quic: Do not resend already acked frames
      BUG/MINOR: quic: Missing detections of amplification limit reached
      MINOR: quic: Send PING frames when probing Initial packet number space
      MINOR: quic: Do not accept wrong active_connection_id_limit values
      MINOR: quic: Store the next connection IDs sequence number in the connection
      MINOR: quic: Typo fix for ACK_ECN frame
      MINOR: quic: RETIRE_CONNECTION_ID frame handling (RX)
      MINOR: quic: Useless TLS context allocations in qc_do_rm_hp()
      MINOR: quic: Add spin bit support
      MINOR: quic: Add transport parameters to "show quic"
      BUG/MINOR: quic: Wrong RETIRE_CONNECTION_ID sequence number check
      MINOR: quic: Do not stress the peer during retransmissions of lost packets
      BUG/MINOR: quic: Missing listener accept queue tasklet wakeups

Michael Prokop (1):
      DOC/CLEANUP: fix typos

Remi Tricot-Le Breton (3):
      BUG/MINOR: cache: Cache response even if request has "no-cache" directive
      BUG/MINOR: cache: Check cache entry is complete in case of Vary
      BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback

William Lallemand (8):
      BUG/MINOR: mworker: stop doing strtok directly from the env
      BUG/MEDIUM: mworker: prevent inconsistent reload when upgrading from old versions
      BUG/MEDIUM: mworker: don't register mworker_accept_wrapper() when master FD is wrong
      MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
      BUG/MINOR: mworker: prevent incorrect values in uptime
      MINOR: ssl: rename confusing ssl_bind_kws
      BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
      BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value

Willy Tarreau (15):
      BUG/MEDIUM: wdt: fix wrong thread being checked for sleeping
      BUG/MINOR: sched: properly report long_rq when tasks remain in the queue
      BUG/MEDIUM: sched: allow a bit more TASK_HEAVY to be processed when needed
      MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
      MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in h2s_frt_handle_headers()
      BUG/MINOR: ring: do not realign ring contents on resize
      BUG/MINOR: fd: used the update list from the fd's group instead of tgid
      BUG/MEDIUM: fd: make fd_delete() support being called from a different group
      CLEANUP: listener: only store conn counts for local threads
      BUG/MAJOR: fd/thread: fix race between updates and closing FD
      MINOR: fd/cli: report the polling mask in "show fd"
      BUG/MINOR: init: properly detect NUMA bindings on large systems
      BUG/MINOR: thread: report thread and group counts in the correct order
      BUG/MAJOR: fd/threads: close a race on closing connections after takeover
      MINOR: quic_sock: un-statify quic_conn_sock_fd_iocb()

---