Hi,
HAProxy 3.1-dev9 was released on 2024/10/03. It added 67 new commits
after version 3.1-dev8.
Another batch of old bugs was addressed. The queue saga continues, with
some cases that could trigger an infinite loop when shutting down server
sessions. I'm waiting for confirmation from two distinct deployments to
conclude that it's now behind us. Other interesting issues such as a case
where a CLI connection could remain there forever, and various issues
related to reporting end of stream in certain cases of zero-copy were
addressed as well.
Let's talk about the more interesting stuff that was added:
- as previously discussed, the naming conflicts between various families
of proxies (frontend/listen/backend/defaults/log-forward etc) and
between servers are now properly detected and will report a deprecation
warning, indicating that 3.3 will put an end to that practice. Doing so
allowed us to discover some cases which were already not properly handled,
and revealed that the current situation that tries to address subtle
conditions is a real mess that needs to go.
- thanks to the work above, we figured that duplicated defaults sections
that were kept in memory cannot be explicitly referenced later and can
safely be dropped. This also saves quite a bit of memory in configs which use
a lot of defaults (as custom delimiters to reset settings for example).
Similarly the duplicate server detection that used to be in O(N^2) is
now O(N), so a config with 2000 servers in a backend now loads in 1/4 of
the previous time. Fortunately not everyone uses such configs, but I've
already seen much worse so I'm sure this exists :-)
- a simplification made to ease error reporting and consisting in keeping
a central list of opened file names showed the rewarding benefit of
slightly speeding up parsing (less alloc/free) and reducing memory usage
a little bit for configs containing tons of proxies/servers with long
config file names (each server and proxy used to keep its own copy of
the file name it appears in).
- the warning about deprecated legacy mailers was initially planned for
2.9 with a removal at 3.1, but since we forgot to emit the warning,
the removal is postponed by one year and the warning is now emitted.
- QUIC: the ACK processing was improved to better deal with out-of-order
acks. The problem is that if a buffer keeps even a single unacked packet,
it cannot be released and counts towards the allocation limit, which can
contribute to limiting the send performance in lossy networks. Some
improvements were made in that area to better detect certain out-of-order
ACKs that still allow some buffers to be released and sending to restart earlier.
I'm told that a few more changes are still pending on this topic.
- logs: the log-steps can now be configured for proxies so as to emit
logs at different moments in a session/request lifetime (e.g. after
receipt of request, once the connection to the server is established
and at the end). See the "log-steps" keyword in the doc for more
details. Again on this front, a do-log action (allowing arbitrary logs
to be sent at any moment) is still under review and should be
merged soon, closing issue #401.
- the current and total number of streams is finally available in "show
info". That's something that we've often been missing since we've
started to support multiplexed protocols. Now it's a reality. However
I'm noting that the total counter is also named CurrStreams, so expect
that one to be changed soon ;-)
- a new action "set-retries" was added to dynamically change the desired
number of retries, from tcp-request or http-request rules. Some users
know that certain URLs will require more retries due to a server's
temporary flakiness, and this action allows dealing with the problem
without needlessly raising the number for all requests.
- option httpchk now takes an optional 4th argument for the Host header
to send. Indeed since the dirty "\r\nHost:foobar" approach was dropped
during the health checks rework, some users missed this ability to send
the host name for the request. Now it will simply look like this:
    option httpchk GET / HTTP/1.1 www.example.com
That's something we missed during the changes a while ago, and the
change was made in a way to ease backporting, so it will probably go to
3.0, and maybe 2.9/2.8 if there's some demand and it appears riskless.
- the log-forwarder relies on the co_getline() function to read a TCP log
and it happens that this old function was initially designed to handle
CLI keywords and wasn't optimised for speed at all, which explained that
40% of the CPU was spent there! After a simple change it's now down to
about 5% and the TCP log forwarding is now about 40% faster.
- the ugly "trace" directives in a second "global" section, which required
setting "expose-experimental-directives" and sometimes spat warnings,
are now gone, and for good: we now have a "traces" section to
put them all (without the experimental directive anymore). In addition,
the "trace" keyword finally supports multiple statements on a single
line, both in the config and on the CLI. So now it can be sufficient to
enter this to enable h1+h2 traces to ring buffer "buf1" for example:
    traces
        trace all sink buf1 level developer
        trace h1 verbosity complete start now
        trace h2 verbosity complete start now
Those who need to temporarily support both the older experimental
approach and the new one (e.g. to ease instantly rolling back to the
previous version) may use a .if statement that uses "traces" for the
new version or "global" + experimental for the old one:
    .if version_atleast(3.1-dev9)
      traces
    .else
      global
         expose-experimental-directives
    .endif
         trace h1 verbosity ...
         ...
- finally, it is now possible to build with USE_BACKTRACE=1 without glibc,
and automatically fall back to our own version. It worked for me on
musl on aarch64. If there are other users of musl (e.g. on Alpine), any
feedback is welcome on this, particularly on x86_64. Indeed, if it works
we could enable it by default for 3.1 so that such users can report more
usable backtraces. In order to test, simply try to crash your
process from the CLI by creating an infinite loop in expert mode:
"expert-mode on;debug dev loop 9999"
After ~2 seconds the process will crash. When the backtrace works, you
should see one of the threads with a "call trace" line and a few lines:
call trace(17):
| 0x6a68fa [eb ba 0f 1f 40 00 41 54]: ha_thread_dump+0x8a/0x8c
| 0x6a73e6 [64 48 8b 53 10 64 48 8b]: ha_panic+0x86/0x4e0
| 0x7f005ea6b3a0 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x123a0
| 0x7ffd5736b6ac [48 c1 e8 26 89 c0 48 89]: linux-vdso:__vdso_gettimeofday+0xac/0x2b0
| 0x6a4914 [48 89 e6 48 8d 7c 24 10]: main+0x1f9644
| 0x64a28f [85 c0 75 6f 49 8b 96 80]: main+0x19efbf
| 0x64b960 [48 c7 43 28 00 00 00 00]: main+0x1a0690
| 0x6c23c5 [8b 0d 25 db 5f 00 85 c9]: task_process_applet+0x275/0xa71
If none of this is available, if the process crashes again during the dump
or if there are just one or two useless lines, it means it does not work
and enabling it should remain a case-by-case decision.
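
For those who want to try a few of the new keywords listed above in one go,
here is a minimal configuration sketch, not a recommended setup. The section
names (fe_main, be_app), the addresses, the /reports path and the retry
counts are hypothetical and only there for illustration. The step names
given to "log-steps" and the plain integer given to "set-retries" are
assumptions based on the 3.1 documentation and should be checked against
the doc of the version in use:

```
# Illustrative sketch only: names, addresses and values are hypothetical.
defaults
    mode http
    log stdout format raw local0
    option httplog
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    retries 3

frontend fe_main
    bind :8080
    # Assumed step names: emit a log after receipt of the request, once
    # the connection to the server is established, and at the end.
    log-steps request,connect,close
    default_backend be_app

backend be_app
    # Raise the number of retries only for URLs known to hit a flaky
    # server, instead of raising "retries" for all requests.
    http-request set-retries 6 if { path_beg /reports }
    # The 4th argument is the Host header value sent with the check.
    option httpchk GET / HTTP/1.1 www.example.com
    server app1 192.0.2.10:8080 check
```

Such a file can be validated with "haproxy -c -f <file>" before loading it.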
And I think that's about all. Regarding the pending stuff I'm aware of,
there's the do-log action, some upcoming updates for the send side of QUIC,
and some parts for the reworked master-worker mode. I managed to restart
working on the automatic RX window for H2, and found that we can still be
optimistic about it. It has reached a state where streams can use
multiple buffers if allowed to. A static setting could already
work, and maybe the automatic window sizing algorithm will not take too
long to code. We'll see.
We're now in October, so the release is due in less than two months now.
I think that we should be reasonable and consider that the trickiest parts
(mainly master-worker, maybe a few other things) should be decided on
during the next two weeks, i.e. for dev10. And by the end of the month
we should switch to doc/cleanups etc and continue to develop in -next.
For those interested in certain specific features, please test them
now, do not wait for 3.1.0 to find out that they don't suit your use case or
are broken. It always takes more time to fix after the release than before!
Please find the usual URLs below :
Site index : https://www.haproxy.org/
Documentation : https://docs.haproxy.org/
Wiki : https://github.com/haproxy/wiki/wiki
Discourse : https://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Sources : https://www.haproxy.org/download/3.1/src/
Git repository : https://git.haproxy.org/git/haproxy.git/
Git Web browsing : https://git.haproxy.org/?p=haproxy.git
Changelog : https://www.haproxy.org/download/3.1/src/CHANGELOG
Dataplane API :
https://github.com/haproxytech/dataplaneapi/releases/latest
Pending bugs : https://www.haproxy.org/l/pending-bugs
Reviewed bugs : https://www.haproxy.org/l/reviewed-bugs
Code reports : https://www.haproxy.org/l/code-reports
Latest builds : https://www.haproxy.org/l/dev-packages
Willy
---
Complete changelog :
Amaury Denoyelle (14):
MINOR: mux-quic: complete Tx infos for QCS dump
MINOR: quic: ensure txbuf realloc is only performed on empty buffer
MINOR: mux-quic: strengthen qcs_send_metadata() usage
MINOR: quic: remove unneeded notification of txbuf room
MINOR: quic: refactor MUX send notification
MEDIUM: quic: strengthen MUX send notification
MINOR: quic: refactor STREAM room notification
MINOR: quic: do not remove qc_stream_desc automatically on ACK handling
MINOR: quic: store streambuf in a streamdesc tree
MINOR: quic: move buffered ACK to streambuf
MEDIUM: quic: handle out-of-order ACK at streamdesc layer
MEDIUM: quic: refactor buffered STREAM ACK consuming
BUG/MINOR: mux-quic: fix crash on qcc_init() early return
BUG/MINOR: quic: fix trace on releasing STREAM frame after ack
Aurelien DARRAGON (14):
REGTESTS: log: fix log-profile.vtc
MEDIUM: mailers: warn about deprecated legacy mailers
MINOR: log: fix indent in strm_log()
MINOR: log: introduce extra log profile steps
MINOR: log: handle extra log origins in _process_send_log_override()
MINOR: log: introduce log_orig flags
MINOR: log: explicitly handle extra log origins as error when relevant
MINOR: log: support extra log origins for '%OG' alias
MINOR: proxy: add log_steps struct member
MINOR: log: introduce "log-steps" proxy keyword
MINOR: log: add log_orig_proxy() helper function
MEDIUM: log: consider log-steps proxy setting for existing log origins
DOC: config: document proxy "log-steps" keyword
REGTESTS: add a test for proxy "log-steps"
Christopher Faulet (15):
BUG/MEDIUM: cli: Be sure to catch immediate client abort
DEV: flags/applet: decode appctx flags
OPTIM: stconn: Don't pretend mux have more data to deliver on
EOI/EOS/ERROR
BUG/MINOR: mcli: Pretend the mux have more data to deliver between two
commands
MINOR: action: Export release_expr_int_action() release function
MINOR: stream: Rely on a per-stream max connection retries value
MINOR: stream: Support dynamic changes of the number of connection retries
MINOR: stream/stats: Expose the current number of streams in stats
MINOR: stream/stats: Expose the total number of streams ever created in
stats
MINOR: config/trace: Add a 'traces' section to declare debug traces
MINOR: trace: Be able to chain commands for a source in one line
MINOR: tcpcheck: Add support for an option host header value for httpchk
option
BUG/MINOR: mux-h1: Fix condition to set EOI on SE during zero-copy
forwarding
MINOR: mux-h1: Use a dedicated function to conditionnaly set EOI flag on
SE
BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for
upgrade
Oliver Dala (1):
BUG/MEDIUM: cli: Deadlock when setting frontend maxconn
Valentine Krasnobaeva (2):
BUG/MINOR: cfgparse-global: fix allowed args number for setenv
MINOR: cfgparse-global: add dedicated parser for *env keywords
Willy Tarreau (21):
MINOR: tools: add minimal file name management
CLEANUP: stick-table: make the file location point to a global file name
MINOR: proxy: use the global file names for conf->file
CLEANUP: cfgparse: factor proxy vs log-forward collisions
BUG/MINOR: cfgparse: detect another uncaught case of duplicate defaults
MINOR: proxy: add a list of orphaned defaults sections
MEDIUM: cfgparse: drop duplicate named defaults sections after use
OPTIM: cfgparse: speed up duplicate server detection
MEDIUM: cfgparse: warn about deprecated use of duplicate server names
BUG/MINOR: server: shut down streams under thread isolation
BUG/MINOR: proxy: also make the cli and resolvers use the global name
Revert "BUG/MINOR: server: shut down streams under thread isolation"
MINOR: task: define two new one-shot events for use with WOKEN_OTHER or
MSG
BUG/MEDIUM: stream: make stream_shutdown() async-safe
BUG/MINOR: server: make sure the HMAINT state is part of MAINT
BUG/MINOR: queue: make sure that maintenance redispatches server queue
MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute()
BUILD: tools: only include execinfo.h for the real backtrace() function
MINOR: tools: do not attempt to use backtrace() on linux without glibc
OPTIM: channel: speed up co_getline()'s search of the end of line
BUG/MEDIUM: queue: always dequeue the backend when redistributing the
last server
---