Hi,

HAProxy 2.5.2 was released on 2022/02/16. It added 44 new commits
after version 2.5.1.

This version addresses a few long-term bugs that have kept us quite busy
for far too long, but ultimately it's satisfying to know that these ones
are gone and that they won't be casting doubt over every single bug
report. The main issues fixed in this version are:

  - a tiny race condition in the scheduler affecting the rare
    multi-threaded tasks. In some cases, a task could finish running on
    one thread while expiring on another one, and be requeued at a
    position that the finishing thread was still calculating. The most
    likely case was the peers task disabling the expiration while
    waiting for other peers to be locked, causing such a non-expirable
    task to be queued and to block all other timers from expiring
    (typically health checks, peers and resolvers, but others were
    affected). This could only happen at high peers traffic rate but it
    definitely did. When built with suitable options such as
    DEBUG_STRICT it would immediately crash (which is how it was
    detected). This bug was present since 2.0.

  - a bug in the Set-Cookie2 response parser may result in an infinite
    loop triggering the watchdog if a server sends this header while it
    belongs to a backend configured with cookie persistence. Usually
    cookie-based persistence is not used with untrusted servers, but if
    that is the case, the following rule can be used as a workaround for
    the time it takes to upgrade:

        http-response del-header Set-Cookie2

    It reminded us that 2.5 years ago we were discussing completely
    dropping Set-Cookie2, which never succeeded in the field. Tim has
    opened an issue so that we don't forget to remove it after 2.6. This
    issue was diagnosed, reported and fixed by Andrew McDermott and
    Grant Spence. This bug was there since 1.9.

  - a bug in the SPOE error handling. When a connection to an agent
    dies, there may still be requests pending that are tied to this
    connection.
    The list of such requests is scanned so that they can be aborted,
    except that the condition to scan the list was incorrect, and when
    these requests were finally aborted upon processing timeout, they
    were updating the memory area they used to point to, which could
    have been reused for anything, causing random crashes very commonly
    seen in libc's malloc/free via openssl, or haproxy pools with
    corrupted pointers. In short, anyone using SPOE must absolutely
    update to apply the fix, otherwise any bug they face cannot be
    trusted, as we know there's a rare but real case of memory
    corruption there. This bug was present since 1.8.

  - there was a possible race condition on the listeners where it was
    sometimes possible to wake up a temporarily paused listener just
    after it had failed to rebind upon a failed attempt to reload. This
    would access fdtab[-1], causing memory corruption or crashes. It's
    been there since 2.2 but really started to have an effect with 2.3.

  - the master CLI could remain stuck forever if extra characters
    followed by a shutdown were sent before the end of a response. In
    this case, each such connection would remain unusable, and a script
    doing this would face a connection failure after the 10th attempt
    (master's maxconn). A few related issues could also cause it to loop
    forever (e.g. too long pipelined requests, and empty buffers after
    wrapping).

  - the connection stopping list introduced in 2.4 to deal with idle
    frontend connections on reloads missed a deletion, and could leave
    link elements in the list after their containing structure was
    freed, causing occasional crashes of the old process upon reload.

  - there is an ambiguity in the definition of dynamic table size
    updates between the HTTP/2 spec (RFC7540) and the HPACK spec
    (RFC7541) which can be read two ways. HAProxy and a few servers
    interpret it one way and a few clients and other servers interpret
    it another way (and generally clients win, as usual).
    One client, nghttp, enforces it strictly, causing interoperability
    issues with haproxy and a few other ones when the table size is set
    below 4096. We had a long discussion with other participants of the
    HTTP working group to find the best path forward, which resulted in
    a nice update of the H2 spec that preserves the best
    interoperability with existing components while clarifying all
    points. This update is present in this version and will be
    progressively backported to older ones after some time (I managed to
    mess up the first attempt).

  - the HTTP client might not always start to send requests which were
    ready in advance (before the connection is requested). It used to
    work most of the time thanks to the scheduling of events but was a
    bit fragile and could easily crash if the sequencing of events
    changed a bit.

  - there was an issue with the data transfer in the HTX layer. I'm not
    very clear on the impact, but I think it can sometimes cause data to
    be truncated or just blocked.

  - it was possible to temporarily lose the stats sockets upon reloads
    in master-worker mode in case of early error (e.g. missing config
    file), in which case the socket transfer from the older process
    couldn't happen.

  - the "set server ssl" CLI command introduced in 2.4 had the
    undesirable side effect of modifying the data path and the check
    path at the same time (by mimicking the configuration), which causes
    quite some trouble. The doc has now been updated to clearly state
    that only the data path is updated, and the code does that
    (otherwise it is unusable anyway).

  - there were still a number of other issues of lower importance, such
    as the CLI being extremely slow to parse pipelined requests because
    it was looking for the line feed first, hence the larger the buffer,
    the slower it was with batch updates like ACL/map updates; there
    were a few issues in the JWT code on exit (double free in deinit
    etc), missing headers in the HTTP client, and a possibly truncated
    pidfile in master mode.

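To picture the last item (the slow parsing of pipelined CLI requests), here is a small toy model in Python. It is only a sketch and not the haproxy code: the function names are made up for illustration, and it assumes that commands on one line are separated by ';' and that escaping is ignored. The first function looks for the line feed first, so each call scans up to the end of the line however short the command is; the second stops at whichever delimiter comes first.

```python
import re


def next_command_lf_first(buf):
    """Locate the line feed first, then look for ';' inside that line.

    With many short commands on one long line, every call rescans the
    buffer up to the line feed, so the cost grows with the buffer size.
    """
    eol = buf.find(b"\n")
    if eol < 0:
        return None, buf              # incomplete line, wait for more data
    semi = buf.find(b";", 0, eol)
    if semi < 0:
        return buf[:eol], buf[eol + 1:]
    return buf[:semi], buf[semi + 1:]


_DELIMS = re.compile(rb"[;\n]")


def next_command_any_delim(buf):
    """Stop at the first ';' or line feed, whichever comes first.

    Each call only scans the bytes of the command it returns.
    """
    m = _DELIMS.search(buf)
    if m is None:
        return None, buf              # no delimiter yet, wait for more data
    return buf[:m.start()], buf[m.end():]


def drain(parser, buf):
    """Extract every complete command from buf using the given parser."""
    commands = []
    while True:
        cmd, buf = parser(buf)
        if cmd is None:
            return commands
        commands.append(cmd)
```

Both parsers return the same commands for a complete line; the difference is only in how many bytes each call has to look at, which is what matters with large batches such as ACL/map updates.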
Some debugging options were added and backported. One that recently
helped us is DEBUG_POOL_INTEGRITY combined with the existing
DEBUG_DONT_SHARE_POOLS and DEBUG_STRICT. The first two will provide
sort of an equivalent of the use-after-free debug option that was not
suitable for production, by checking if released memory areas were
tampered with between their last free() and the next malloc(). This
slightly increases CPU usage (1-2% typically) but will catch most memory
corruptions much earlier and much more cleanly than what happened over
the last weeks: instead of crashing at random places that are victims of
a change, the crash happens much closer to the bad actor, and with more
context to figure out what happened. And quite frankly, for all those
who can afford it (i.e. all those not running at more than 98% CPU), I
would kindly ask to add these 4 options to their build command line so
that their future bug reports are much more accurate:

  $ make ... DEBUG="-DDEBUG_STRICT -DDEBUG_MEMORY_POOLS \
                    -DDEBUG_DONT_SHARE_POOLS -DDEBUG_POOL_INTEGRITY"

This also allows developers to quickly rule out many potential causes
and provide responses faster. By the way, we're always running with all
debugging turned full-throttle on haproxy.org and recently switched from
DEBUG_UAF to DEBUG_POOL_INTEGRITY. Maybe it would even be a nice
improvement for distros to provide these by default starting with 2.6 or
maybe even 2.5.

Older stable versions are following with essentially the same set of
fixes. 2.4 is in the blocks already, and 2.3+2.2+2.0 probably next week.

I would like to address sincere and warm thanks to Christian Ruppert,
Yves Lafon and Pierre Cheynier for having provided lots of help and
traces, deployed and rebuilt almost daily with extra debug code over the
last few weeks to address the painful series of issues above. Their
participation was invaluable and their continued efforts finally pay off
for all of us. Keep up the great job guys!

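Coming back to the debugging options above: the idea of checking that a released area was not tampered with between its last free() and the next malloc() can be pictured with a toy model. This is only a sketch in Python and not how haproxy implements it: the `ToyPool` class, the `POISON` fill byte and the error message are all hypothetical names chosen for the illustration.

```python
POISON = 0xA5  # hypothetical fill byte written over released objects


class ToyPool:
    """Fixed-size object pool that poisons objects when they are released
    and verifies the poison before handing them out again."""

    def __init__(self, obj_size):
        self.obj_size = obj_size
        self.free_list = []

    def alloc(self):
        if not self.free_list:
            return bytearray(self.obj_size)      # nothing cached, new object
        obj = self.free_list.pop()
        # Any byte differing from the poison means someone wrote to the
        # object after it was released (a use-after-free).
        if any(b != POISON for b in obj):
            raise RuntimeError("pool object modified after free")
        obj[:] = bytes(self.obj_size)            # hand out a zeroed object
        return obj

    def free(self, obj):
        obj[:] = bytes([POISON]) * self.obj_size  # poison the whole area
        self.free_list.append(obj)
```

A write through a stale reference after `free()` is then reported on the very next `alloc()` of that object, instead of showing up later as a crash in some unrelated victim.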
Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.5/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.5.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.5.git
   Changelog        : http://www.haproxy.org/download/2.5/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Andrew McDermott (1):
      BUG/MAJOR: http/htx: prevent unbounded loop in http_manage_server_side_cookies

Christopher Faulet (4):
      BUG/MEDIUM: htx: Adjust length to add DATA block in an empty HTX buffer
      BUG/MEDIUM: cli: Never wait for more data on client shutdown
      BUG/MINOR: httpclient: Revisit HC request and response buffers allocation
      BUG/MEDIUM: httpclient: Xfer the request when the stream is created

David Carlier (1):
      BUILD/MINOR: fix solaris build with clang.

Remi Tricot-Le Breton (5):
      REGTESTS: ssl: Fix ssl_errors regtest with OpenSSL 1.0.2
      BUG/MINOR: ssl: Remove empty lines from "show ssl ocsp-response <id>" output
      BUG/MINOR: jwt: Double free in deinit function
      BUG/MINOR: jwt: Missing pkey free during cleanup
      BUG/MINOR: jwt: Memory leak if same key is used in multiple jwt_verify calls

William Dauchy (1):
      BUG/MEDIUM: server: avoid changing healthcheck ctx with set server ssl

William Lallemand (7):
      BUG/MINOR: httpclient: don't send an empty body
      BUG/MINOR: httpclient: set default Accept and User-Agent headers
      BUG/MINOR: httpclient/lua: don't pop the lua stack when getting headers
      DOC: management: mark "set server ssl" as deprecated
      BUG/MINOR: mworker: does not add the -sf in wait mode
      BUG/MINOR: mworker: does not erase the pidfile upon reload
      BUG/MINOR: httpclient/cli: display junk characters in vsn

Willy Tarreau (25):
      BUG/MEDIUM: connection: properly leave stopping list on error
      MEDIUM: cli: yield between each pipelined command
      MINOR: channel: add new function co_getdelim() to support multiple delimiters
      BUG/MINOR: cli: avoid O(bufsize) parsing cost on pipelined commands
      MEDIUM: h2/hpack: emit a Dynamic Table Size Update after settings change
      BUG/MEDIUM: mcli: do not try to parse empty buffers
      BUG/MEDIUM: mcli: always realign wrapping buffers before parsing them
      BUG/MINOR: stream: make the call_rate only count the no-progress calls
      DEBUG: cli: add a new "debug dev fd" expert command
      BUILD: debug/cli: condition test of O_ASYNC to its existence
      DEBUG: pools: add new build option DEBUG_POOL_INTEGRITY
      BUG/MEDIUM: mworker: don't lose the stats socket on failed reload
      BUG/MINOR: pools: always flush pools about to be destroyed
      DEBUG: pools: add extra sanity checks when picking objects from a local cache
      DEBUG: pools: let's add reverse mapping from cache heads to thread and pool
      DEBUG: pools: replace the link pointer with the caller's address on pool_free()
      BUG/MAJOR: sched: prevent rare concurrent wakeup of multi-threaded tasks
      DEBUG: fd: make sure we never try to insert/delete an impossible FD number
      MINOR: listener: replace the listener's spinlock with an rwlock
      BUG/MEDIUM: listener: read-lock the listener during accept()
      BUG/MAJOR: spoe: properly detach all agents when releasing the applet
      REGTESTS: server: close an occasional race on dynamic_server_ssl.vtc
      REGTESTS: peers: leave a bit more time to peers to synchronize
      BUG/MEDIUM: h2/hpack: fix emission of HPACK DTSU after settings change
      BUG/MINOR: mux-h2: update the session's idle delay before creating the stream

---