Hi,

HAProxy 2.9-dev4 was released on 2023/08/25. It added 59 new commits
after version 2.9-dev3.

Some interesting new stuff continues to arrive in this version:

  - maps: the set-map action (and equivalent Lua calls) used to update an
    entry in O(N) for N elements in a map, due to the original design
    focusing on limiting memory usage. Nowadays, despite the warnings in
    the doc, it appears that more and more users are relying on set-map on
    the traffic path in Lua or HTTP actions and are sometimes reporting
    hard-to-diagnose CPU usage issues which end up being caused just by
    this. The reference lookup code is now in O(log(N)) and the
    propagation in O(1) so this will be much better for these special use
    cases. The memory increased from 72 to 120 bytes per map entry (to
    which about as much was already added for each instantiation), so
    that's reasonable, and will later be shrunk again. In addition, lookups
    in empty maps used to solicit the LRU cache anyway, which could
    represent up to about 4-5% CPU in tests. That's too bad, considering we
    know the map is empty, hence the cache as well, so now the lookup is
    avoided on empty maps and acls.

  - when idle connections were reworked to support SNI, a side effect of
    replacing the list with a tree was that they were now put back at the
    end of the list instead of at the head, hence they were all used in a
    round-robin fashion, something which is not that great when thinking
    about purging excess connections nor when trying to concentrate most
    of the traffic on few connections (e.g. window sizes will not
    necessarily increase etc). This was changed so that the previous
    behavior is restored and the recently used connections are packed
    together at the head again, so that we have the hottest and most
    reliable ones at the head and the least trusted ones at the tail. This
    allows the purge mechanism to kill from the tail and preserve the most
    recently used ones.

  - limited-quic: now in order to make sure not to fool users, when
    building with an SSL library which does NOT support QUIC (i.e.
    OpenSSL), "quic" bindings will be properly rejected unless
    "limited-quic" is specified in the config (and it's suggested in the
    error message). Previously it would only silently ignore them,
    resulting in a non-working config, causing confusion to those who
    copy-paste configs without being aware of this. Also a warning is now
    emitted in this mode when "allow-0rtt" is specified, as the
    "limited-quic" compatibility layer doesn't support it.

  - reverse HTTP: see below for a complete description. I hope it will
    answer Alex's question :-)

  - xxhash was updated to 0.8.2 (we were on 0.8.1) because it fixes a
    build issue on ppc64le.

  - various doc/regtest/CI updates as usual.

Now, regarding reverse HTTP: that's a feature that we've been repeatedly
asked for over the last decade, constantly responding "not possible yet".
But with the flexibility of the current architecture, it appeared that
there was no more big show-stopper and it was about time to respond to
this demand.

What is this ? The principle is to permit a server to establish a
connection to haproxy, then to switch the connection direction on both
sides, so that haproxy can send requests to that server. There was a trend
around this 20 years ago on HTTP/1 and it didn't work well, to be honest.
And we were counting on H2 to do that because it allows multiplexing
streams over a connection and resetting a stream without breaking the
connection.

There are 4 use cases I'm currently aware of, though others might be
creative:

  - isolation: a server in a purely outgoing DMZ connects to the edge load
    balancer and receives requests from there. There's zero incoming
    connection to that DMZ. Some security environments require this (not
    that I fully agree with this, to be honest).

  - work around painful NAT: mobile developers who want to test their
    applications on their smartphone often have to either push to a public
    dev server, or hack around the local network's wifi to permit their
    phone to connect directly to the dev PC. Here it can be much simpler:
    their PC connects to a public gateway, registers there and instantly
    receives the traffic for the configured host name and delivers it to
    the application running locally in debug mode with traces etc. A
    similar use case consists in working around the difficulty of setting
    up port-forwarding on some home internet accesses: here you can expose
    your internal application directly outside via a public gateway again,
    using exclusively an outgoing connection. Exactly the same can be done
    with containers: instead of having to know what ports to NAT, it can
    be convenient to let the server in the container directly register to
    the external gateway. I'll soon try to set up one on a public server
    so that I can receive incoming requests on my laptop anywhere.

  - multi-path and high availability in complex setups: a server can
    register to public edge gateways via multiple paths (or even multiple
    internet links like one could do at home with a fibre and an xDSL
    backup), and the traffic will arrive via these connections.

  - config-less automatic webserver registration: an application server
    would only have to know the address of the local LB and connect there
    to immediately receive traffic without having to announce itself nor
    to rely on other discovery mechanisms.

How does this work ? It's not easy to describe due to the reversal of the
connection that switches roles and involves confusing terms. I'll use the
term "origin" to describe the target server, "gateway" for the public
node, and "visitor" to describe the person wanting to access the origin.

The origin connects to the gateway over H2+TLS and presents a certificate
whose CN contains the FQDN that will be matched outside. This cert was
signed by the same authority which operates the gateway so it's possible
to know if this FQDN is allowed or not.
The gateway receives the connection, detects it's a reversal attempt, and
places this connection into a backend server's idle connections pool,
associated with the host name presented by the origin. In our case, the
origin also contains an haproxy node. It has a dummy listener responsible
for creating idle connections to the external gateway and waiting for
requests on them.

Then a visitor who wants to visit the site on this FQDN connects to the
gateway which has this IP address, and enters a frontend which can route
the request to the server which has those idle connections. If no
matching connection is found, a 503 is returned, otherwise it's used and
the request is sent over that connection and reaches the origin. In our
case this origin is haproxy and delivers it to the local server, but we
could imagine that once this becomes successful, some servers will
implement it to receive the traffic directly.

As a pure coincidence (really), 2 hours after we finished our first design
meeting, a draft describing almost exactly the same design was sent to the
IETF HTTP working group:

    https://datatracker.ietf.org/doc/draft-bt-httpbis-reverse-http/

There are small differences with our initial design but we're going to
participate with the editors, sharing feedback from our implementation,
adjusting it and/or the draft depending on what we'll all learn there. The
goal will be to see this protocol become a standard with its own RFC, and
as long as it remains a draft, our support will be experimental and
subject to change to adapt to ongoing definitions.

The implementation is very young for now and has quite some limitations
but we preferred to expose it early so as to collect feedback. The
currently known limitations are:

  - idle connections on the gateway will be subject to the server's purge
    and will regularly get killed and instantly recreated by the origin.
    Not dramatic but may cause many outgoing connections per day in a
    firewall's logs. It's possible to significantly increase the client
    and server timeouts on both sides to avoid this.

  - the origin process will not quit on SIGUSR1 (reload) as long as it has
    idle connections since they're seen as idle client conns. Bah, Ctrl-C
    does the job for now :-) Or a client timeout as well.

  - for now the origin will attach all connections to the same thread.
    It's not the place with the most traffic so it's not urgent to address
    but it is on the todo list.

  - some stats counters during the connection reversal are unreliable
    (some steps update the frontend and later the backend, that's a bit
    tricky). If you see negative connection counts or stuff like this,
    we're obviously interested in reports.

  - it has been observed that after a failed memory allocation, the
    listener will fail to create new connections.

We also know that some parts of the syntax will be revisited (e.g. the
server's dummy address, maybe even the protocol name etc). Right now an
example config would look like this on the gateway:

    frontend pub
        mode http
        bind :443 ssl crt pub.pem
        use_backend be

    backend be
        mode http
        server srv @reverse sni req.hdr(host)

    frontend priv
        mode http
        bind :444 ssl crt priv.pem verify required ca-verify-file ca-auth.crt alpn h2
        tcp-request session attach-srv be/srv name ssl_c_s_dn(CN)

Explanation: the origin will connect to frontend "priv" and will present
its certificate. Its name is extracted and the connection is offered to
server "srv" of backend "be" with this name as the SNI. Then a visitor
comes on frontend "pub", their request is routed to backend "be", which
looks for the Host header and uses it to look for a matching idle
connection. If the name matches the one previously fed and the connection
is still there, the request is routed over that connection. It's of
course possible to use the same frontend with verify optional, with
conditions to detect and transfer the connection etc, but it's
complicated enough so I wanted to do something "simple".
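
The matching done on the gateway can be pictured with a small Python model.
This is only a sketch of the behaviour described above, not haproxy's
implementation: the class and function names (ReverseServer, attach, pick,
handle_request) are made up for illustration, connections are plain strings,
and names are compared exactly, with no normalization of case or port.

```python
from collections import defaultdict, deque


class ReverseServer:
    """Toy model of server "srv": idle reversed connections keyed by name."""

    def __init__(self):
        # name taken from the origin's certificate CN -> idle connections
        self._idle = defaultdict(deque)

    def attach(self, cert_cn, conn):
        """Models "attach-srv be/srv name ssl_c_s_dn(CN)" on frontend priv."""
        self._idle[cert_cn].appendleft(conn)

    def detach(self, cert_cn, conn):
        """Forget a connection that was closed or purged."""
        conns = self._idle.get(cert_cn)
        if conns and conn in conns:
            conns.remove(conn)

    def pick(self, name):
        """Return an idle connection registered under this name, or None."""
        conns = self._idle.get(name)
        if not conns:
            return None
        # The connection stays in the pool: with H2, several requests can
        # share it as separate streams.
        return conns[0]


def handle_request(server, host_header):
    """Models "server srv @reverse sni req.hdr(host)" for a visitor request."""
    conn = server.pick(host_header)
    if conn is None:
        return "503 Service Unavailable"
    return f"forwarded over {conn}"


if __name__ == "__main__":
    srv = ReverseServer()
    print(handle_request(srv, "app.example.com"))   # nothing registered yet
    srv.attach("app.example.com", "conn-1")
    print(handle_request(srv, "app.example.com"))
```

The point of keying the pool by name is that a single "srv" line can serve
any number of origins: each one is found through the name it registered
with, and a visitor asking for an unregistered name simply gets the 503.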

Now the config on the origin:

    listen fe
        mode http
        bind rev@be/srv maxconn 10
        server srv 127.0.0.1:30080

    backend be
        mode http
        server srv gateway:444 ssl crt my-origin.pem proto h2

Connections are created by fe's "bind" line which references the server.
It will instruct this server to create and maintain connections until
there are up to maxconn (10) available. This server is used for nothing
else, but it conveys everything needed to establish an authenticated
outgoing connection. Incoming requests arriving on these connections are
seen as arriving in listener fe for the declared bind line, and will take
their normal path (here they will be routed in clear to the local
application server running on port 30080).

That's it for now. If issues are met with this new mechanism (or even
suggestions), please be aware that the main developer (Amaury) will be
away for a few weeks, so we'll have to try to gather elements either here
or in github issues so that he has the elements once he's back. It would
also be interesting to hear about interest from developers to implement
support for this directly inside their applications or web servers.
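
For anyone thinking about the origin side, the "create and maintain
connections until there are up to maxconn available" behaviour of the
"bind rev@be/srv maxconn 10" line can be sketched in Python as below. This
is a simplified model under stated assumptions, not haproxy's code:
OriginListener, refill and on_close are hypothetical names, the connect
callable stands in for the authenticated H2+TLS connection to the gateway,
and connection failures or retry delays are not modelled.

```python
class OriginListener:
    """Toy model of a reverse "bind" line keeping a pool of connections."""

    def __init__(self, connect, maxconn):
        # connect: hypothetical callable opening one connection to the gateway
        self._connect = connect
        self._maxconn = maxconn
        self.idle = []

    def refill(self):
        """Open connections until maxconn are available; return the count opened."""
        opened = 0
        while len(self.idle) < self._maxconn:
            self.idle.append(self._connect())
            opened += 1
        return opened

    def on_close(self, conn):
        """A connection went away (e.g. purged by the gateway): replace it."""
        if conn in self.idle:
            self.idle.remove(conn)
        return self.refill()


if __name__ == "__main__":
    made = []

    def fake_connect():
        # Stand-in for the real outgoing connection; just hands out labels.
        made.append(f"conn-{len(made) + 1}")
        return made[-1]

    listener = OriginListener(fake_connect, maxconn=10)
    print(listener.refill(), "connections opened")
    print(listener.on_close("conn-3"), "connection reopened after a purge")
```

This also shows why the gateway's purge of idle connections shows up as
repeated outgoing connections in firewall logs: every connection that is
killed is immediately replaced.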

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/2.9/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog        : https://www.haproxy.org/download/2.9/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (25):
      BUILD/IMPORT: fix compilation with PLOCK_DISABLE_EBO=1
      MINOR: proxy: simplify parsing 'backend/server'
      MINOR: connection: centralize init/deinit of backend elements
      MEDIUM: connection: implement passive reverse
      MEDIUM: h2: reverse connection after SETTINGS reception
      MINOR: server: define reverse-connect server
      MINOR: backend: only allow reuse for reverse server
      MINOR: tcp-act: parse 'tcp-request attach-srv' session rule
      REGTESTS: provide a reverse-server test
      MINOR: tcp-act: define optional arg name for attach-srv
      MINOR: connection: use attach-srv name as SNI reuse parameter on reverse
      REGTESTS: provide a reverse-server test with name argument
      MINOR: proto: define dedicated protocol for active reverse connect
      MINOR: connection: extend conn_reverse() for active reverse
      MINOR: proto_reverse_connect: parse rev@ addresses for bind
      MINOR: connection: prepare init code paths for active reverse
      MEDIUM: proto_reverse_connect: bootstrap active reverse connection
      MINOR: proto_reverse_connect: handle early error before reversal
      MEDIUM: h2: implement active connection reversal
      MEDIUM: h2: prevent stream opening before connection reverse completed
      REGTESTS: write a full reverse regtest
      BUG/MINOR: h2: fix reverse if no timeout defined
      MINOR: connection: simplify removal of idle conns from their trees
      MINOR: server: move idle tree insert in a dedicated function
      MAJOR: connection: purge idle conn by last usage

Aurelien DARRAGON (6):
      BUG/MINOR: stktable: allow sc-set-gpt(0) from tcp-request connection
      BUG/MINOR: stktable: allow sc-add-gpc from tcp-request connection
      DEV: makefile: fix POSIX compatibility for "range" target
      BUG/MINOR: hlua_fcn: potentially unsafe stktable_data_ptr usage
      DOC: lua: fix Sphinx warning from core.get_var()
      DOC: lua: fix core.register_action typo

Frédéric Lécaille (7):
      MINOR: quic+openssl_compat: Do not start without "limited-quic"
      MINOR: quic+openssl_compat: Emit an alert for "allow-0rtt" option
      MEDIUM: map/acl: Improve pat_ref_set() efficiency (for "set-map", "add-acl" action perfs)
      MEDIUM: map/acl: Improve pat_ref_set_elt() efficiency (for "set-map", "add-acl"action perfs)
      MEDIUM: map/acl: Accelerate several functions using pat_ref_elt struct ->head list
      MEDIUM: map/acl: Replace map/acl spin lock by a read/write lock.
      DOC: map/acl: Remove the comments about map/acl performance issue

Ilya Shipitsin (1):
      CI: fedora: fix "dnf" invocation syntax

Johannes Naab (1):
      DOC: typo: fix sc-set-gpt references

Remi Tricot-Le Breton (1):
      DOC: jwt: Add explicit list of supported algorithms

Sébastien Gross (1):
      DOC: Explanation of be_name and be_id fetches

Tim Duesterhus (1):
      REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (3)

William Lallemand (5):
      BUILD: Makefile: add the USE_QUIC option to make help
      BUILD: Makefile: add USE_QUIC_OPENSSL_COMPAT to make help
      BUILD: Makefile: realigned USE_* options in make help
      BUG/MINOR: quic: allow-0rtt warning must only be emitted with quic bind
      BUG/MINOR: quic: ssl_quic_initial_ctx() uses error count not error code

Willy Tarreau (11):
      DEV: flags/show-sess-to-flags: properly decode fd.state
      SCRIPTS: git-show-backports: automatic ref and base detection with -m
      IMPORT: plock: also support inlining the int code
      IMPORT: plock: always expose the inline version of the lock wait function
      IMPORT: lorw: support inlining the wait call
      MINOR: threads: inline the wait function for pthread_rwlock emulation
      MINOR: atomic: make sure to always relax after a failed CAS
      MINOR: pools: use EBO to wait for unlock during pool_flush()
      MINOR: pattern: do not needlessly lookup the LRU cache for empty lists
      IMPORT: xxhash: update xxHash to version 0.8.2
      BUG/MINOR: ssl_sock: fix possible memory leak on OOM

---