Hi,

HAProxy 2.0-dev2 was released on 2019/03/26. It added 176 new commits after version 2.0-dev1.
This version starts to bring more important changes. One of the most visible ones is that haproxy will now automatically start with threads enabled if neither "nbthread" nor "nbproc" is configured. It will check the number of CPUs it's running on and will start as many threads. This means that it will no longer be necessary to edit the configuration to adjust the number of threads and the CPUs they are bound to: by just setting the affinity in the service's configuration, haproxy will automatically adapt and use the same number of threads. On systems where haproxy cannot retrieve the affinity information it will still default to a single thread.

A small byproduct of this change is that nbproc and nbthread are now mutually exclusive. We experimented with both at the same time in 1.8 and it was totally pointless since it maintains all the problems caused by processes and causes many internal difficulties as one can imagine. I'm still thinking about how we could further simplify the "process" and "cpu-map" directives based on this, though keeping "1/X" in the values is not a big deal.

Some of you might be wondering "but how am I supposed to bind the listeners now?". The answer is that there is now some thread load balancing in the listener's accept queue. Thus by default, you can have a single "bind" line with no "process" setting and multiple threads, and the accept() code will distribute the connection load to the threads based on their respective number of connections. I found this to address an issue I've been facing with h2load from the very beginning, by which the traffic was never evenly spread, and the test would start fast and slow down at the end, because some threads used to have more connections than others and at the end of the test only one or two threads were finishing alone. Now this issue is gone because all threads get the same number of connections, and the performance is extremely stable across tests.
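
To make the two ideas above more concrete, here is a minimal Python sketch. It is not haproxy's code: default_thread_count(), pick_thread() and distribute() are hypothetical names, and the plain "least connections" choice is only an assumption used to illustrate "based on their respective number of connections" (the real accept code is written in C and its algorithm may differ).

```python
import os


def default_thread_count() -> int:
    """Return one thread per CPU the process is allowed to run on.

    Falls back to a single thread when the affinity cannot be retrieved,
    mirroring the fallback described in the text.
    """
    try:
        # os.sched_getaffinity() is only available on some platforms (e.g. Linux).
        return max(1, len(os.sched_getaffinity(0)))
    except (AttributeError, OSError):
        return 1


def pick_thread(conn_counts):
    """Return the index of the thread currently holding the fewest connections.

    Ties are resolved in favour of the lowest index (illustrative assumption).
    """
    return min(range(len(conn_counts)), key=conn_counts.__getitem__)


def distribute(new_conns, conn_counts):
    """Assign new_conns incoming connections one by one, least loaded first."""
    counts = list(conn_counts)
    for _ in range(new_conns):
        counts[pick_thread(counts)] += 1
    return counts


if __name__ == "__main__":
    threads = default_thread_count()
    print("threads:", threads)
    print("load after 1000 accepts:", distribute(1000, [0] * threads))
```

With such a policy an already loaded thread simply receives fewer new connections until the others catch up, which is the behaviour the text describes.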
Performance is now stable enough that I managed to get more than one million requests per second out of the cache on my laptop ;-)

One nice effect of this automatic traffic distribution is that haproxy can now share its CPUs much better with the network stack. In the past, either you had one single socket and the traffic was not evenly spread, or you had multiple sockets with a "process" directive and the traffic was distributed in round-robin by the system. But when some cores are highly loaded and others less so (e.g. due to SSL traffic), round robin gives quite bad results and overloads already loaded threads. Here instead the traffic remains very smooth since highly loaded threads will get fewer new connections.

Of course it is still possible to continue to bind the sockets by hand, and it still gives slightly higher raw performance since it skips the incoming load balancing step. But when we're talking about hundreds of thousands of connections per second, most people don't care about a difference that sets limits 100 to 1000 times higher than their needs and I expect to see trivial configs re-appear over time.

Another visible change in -dev2 is that the frontend's and global maxconn values are now automatically set. Far too often we see people not set the global maxconn value, keeping an inappropriately low limit, while those like us who develop are used to seeing warnings all the time that their maxconn is too high for their ulimit. So the default value of 2000 is suitable for nobody. Now, when there's no maxconn, the default value will be automatically calculated based on the number of FDs allocated to the process by "ulimit -n". This can be set in service settings on many systems so this is another resource limit that will not require a configuration change anymore. For example on systemd it seems to be LimitNOFILE.
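
As a rough illustration of deriving a connection limit from "ulimit -n", here is a small Python sketch. The exact formula haproxy uses is not given here, so everything in it is an assumption: it supposes that each connection needs two FDs (one on each side) and that a hypothetical reserved_fds amount is set aside for listeners, pollers, pipes and the like. estimate_maxconn() and current_soft_fd_limit() are made-up names.

```python
import resource


def estimate_maxconn(soft_fd_limit: int, reserved_fds: int = 100) -> int:
    """Estimate how many connections fit within a file descriptor limit.

    Assumption (not haproxy's real formula): every connection consumes two
    FDs and reserved_fds descriptors are kept aside for everything else.
    """
    usable = max(soft_fd_limit - reserved_fds, 0)
    return usable // 2


def current_soft_fd_limit() -> int:
    """Return the soft RLIMIT_NOFILE of this process (what "ulimit -n" shows)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    soft = current_soft_fd_limit()
    if soft == resource.RLIM_INFINITY:
        print("no FD limit set, nothing to derive")
    else:
        print(f"ulimit -n = {soft}, estimated maxconn = {estimate_maxconn(soft)}")
```

Raising the limit in the service settings and restarting would then raise the estimate without touching the configuration file.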
In addition, the frontend's default maxconn, which most people don't set because they believe it's the same as the global maxconn, will be the global maxconn (configured or calculated). This means that a config which doesn't set a maxconn will now have the maximum number of possible connections set correctly by default. I've heard that maxconn was one of the most difficult settings to get right in Docker images, so let's hope that it will be much more turn-key now :-)

This version also contains a significant number of fixes, part of which were already merged into 1.9.5 and others which I expect to see soon in 1.9.6. Among these fixes, we managed to address the trouble caused to the abortonclose option in 1.8 when H2 was introduced. In short, we now have a separate flag and no longer pretend that an input stream is closed at the end of the request: we distinguish it from an end of message on input. We intend to backport this to the next 1.9 if no issue is reported, which I'm now confident about given the time we spent chasing various issues that this could address.

There were quite a number of other issues in H2. One of them was a fairness problem which is very likely the cause of the uneven response times that were reported here by Ashwin. Due to the mux buffer having to take traffic from many streams, the list of pending streams was moved back and forth and failed streams were postponed to the end of the list until they possibly expired. This was addressed so that streams don't starve anymore and keep the fairness they had in 1.8. Now we don't see timeouts anymore on long transfers and the standard deviation on request time seems way lower. Another issue used to affect outgoing connections where an abort of the connection would not wake up the streams which didn't yet have an ID assigned, causing some apparent connect timeouts there.

The HTX fast-forwarding (especially with H2) was re-enabled thanks to all the fixes that touched H2.
The transfer performance will be much better, especially from servers which don't announce a content-length since we won't need to pass each frame through the analysers.

Overall this release mostly touches the lower layers and serves as a preview for the upcoming 1.9 fixes. I'm now seeing the pieces of the puzzle fit together much better than they used to, and I'm seeing fewer brown-paper-bag bugs in sensitive areas. This generally is a good indication that things are getting better and more solid!

So if you're facing issues with H2 or HTX in 1.9 or 2.0-dev, please give this one a try (and keep in mind that it's development code so if you try it on a production server, make sure it's closely watched and that you keep enough servers on a stable version). I intend to backport the H2 and HTX fixes to 1.9 by the end of the week if everything goes well.

Have fun!

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : http://www.haproxy.org/download/2.0/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.0/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Christopher Faulet (31):
      BUG/MINOR: mux-h1: Don't report an error on EOS if no message was received
      BUG/MINOR: stats/htx: Call channel_add_input() when response headers are sent
      BUG/MINOR: lua/htx: Use channel_add_input() when response data are added
      BUG/MINOR: lua/htx: Don't forget to call htx_to_buf() when appropriate
      MINOR: stats: Add the status code STAT_STATUS_IVAL to handle invalid requests
      MINOR: stats: Move stuff about the stats status codes in stats files
      BUG/MINOR: stats: Be more strict on what is a valid request to the stats applet
      BUG/MAJOR: spoe: Fix initialization of thread-dependent fields
      BUG/MAJOR: stats: Fix how huge POST data are read from the channel
      BUG/MEDIUM: mux-h2: Always wakeup streams with no id to avoid frozen streams
      MINOR: mux-h2: Set REFUSED_STREAM error to reset a stream if no data was never sent
      MINOR: muxes: Report the Last read with a dedicated flag
      MINOR: proto-http/proto-htx: Make error handling clearer during data forwarding
      MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter
      BUG/MINOR: cache: Fully consume large requests in the cache applet
      BUG/MINOR: stats: Fully consume large requests in the stats applet
      BUG/MEDIUM: lua: Fully consume large requests when an HTTP applet ends
      MINOR: proto_http: Add function to handle the header "Expect: 100-continue"
      MINOR: proto_htx: Add function to handle the header "Expect: 100-continue"
      MINOR: stats/cache: Handle the header Expect when applets are registered
      MINOR: http/applets: Handle all applets intercepting HTTP requests the same way
      MINOR: lua: Don't handle the header Expect in lua HTTP applets anymore
      BUG/MINOR: proto-http: Don't forward request body anymore on error
      MINOR: mux-h2: Remove useless test on ES flag in h2_frt_transfer_data()
      MINOR: connection: and new flag to mark end of input (EOI)
      MINOR: channel: Report EOI on the input channel if it was reached in the mux
      MEDIUM: mux-h2: Don't mix the end of the message with the end of stream
      MINOR: mux-h1: Set CS_FL_EOI the end of the message is reached
      BUG/MEDIUM: http/htx: Fix handling of the option abortonclose
      CLEANUP: muxes/stream-int: Remove flags CS_FL_READ_NULL and SI_FL_READ_NULL
      MEDIUM: proto_htx: Reintroduce the infinite forwarding on data

Dragan Dosen (1):
      BUG/MEDIUM: 51d: fix possible segfault on deinit_51degrees()

Frédéric Lécaille (11):
      BUG/MEDIUM: standard: Wrong reallocation size.
      MINOR: peers: Add a message for heartbeat.
      MINOR: sample: Replace "req.ungrpc" smp fetch by a "ungrpc" converter.
      MINOR: sample: Code factorization "ungrpc" converter.
      MINOR: sample: Rework gRPC converter code.
      MINOR: sample: Extract some protocol buffers specific code.
      DOC: Remove tabs and fixed punctuation.
      MINOR: sample: Add a protocol buffers specific converter.
      REGTEST: Peers reg tests.
      REGTEST: Enable reg tests with HEAD HTTP method usage.
      BUG/MAJOR: config: Wrong maxconn adjustment.

Lukas Tribus (1):
      BUG/MINOR: ssl: fix warning about ssl-min/max-ver support

Olivier Houchard (55):
      MINOR: lists: Implement locked variations.
      MEDIUM: servers: Used a locked list for idle_orphan_conns.
      MEDIUM: servers: Reorganize the way idle connections are cleaned.
      BUG/MEDIUM: lists: Properly handle the case we're removing the first elt.
      MINOR: cfgparse: Add a cast to make gcc happier.
      BUG/MEDIUM: logs: Only attempt to free startup_logs once.
      MINOR: fd: Remove debugging code.
      BUG/MEDIUM: listeners: Don't call fd_stop_recv() if fd_updt is NULL.
      MINOR: threads: Implement __ha_barrier_atomic*.
      MEDIUM: threads: Use __ATOMIC_SEQ_CST when using the newer atomic API.
      MINOR: threads: Add macros to do atomic operation with no memory barrier.
      MEDIUM: various: Use __ha_barrier_atomic* when relevant.
      MEDIUM: applets: Use the new _HA_ATOMIC_* macros.
      MEDIUM: xref: Use the new _HA_ATOMIC_* macros.
      MEDIUM: fd: Use the new _HA_ATOMIC_* macros.
      MEDIUM: freq_ctr: Use the new _HA_ATOMIC_* macros.
      MEDIUM: proxy: Use the new _HA_ATOMIC_* macros.
      MEDIUM: server: Use the new _HA_ATOMIC_* macros.
      MEDIUM: task: Use the new _HA_ATOMIC_* macros.
      MEDIUM: activity: Use the new _HA_ATOMIC_* macros.
      MEDIUM: backend: Use the new _HA_ATOMIC_* macros.
      MEDIUM: cache: Use the new _HA_ATOMIC_* macros.
      MEDIUM: checks: Use the new _HA_ATOMIC_* macros.
      MEDIUM: pollers: Use the new _HA_ATOMIC_* macros.
      MEDIUM: compression: Use the new _HA_ATOMIC_* macros.
      MEDIUM: spoe: Use the new _HA_ATOMIC_* macros.
      MEDIUM: threads: Use the new _HA_ATOMIC_* macros.
      MEDIUM: http: Use the new _HA_ATOMIC_* macros.
      MEDIUM: lb/threads: Use the new _HA_ATOMIC_* macros.
      MEDIUM: listeners: Use the new _HA_ATOMIC_* macros.
      MEDIUM: logs: Use the new _HA_ATOMIC_* macros.
      MEDIUM: memory: Use the new _HA_ATOMIC_* macros.
      MEDIUM: peers: Use the new _HA_ATOMIC_* macros.
      MEDIUM: proto_tcp: Use the new _HA_ATOMIC_* macros.
      MEDIUM: queues: Use the new _HA_ATOMIC_* macros.
      MEDIUM: sessions: Use the new _HA_ATOMIC_* macros.
      MEDIUM: ssl: Use the new _HA_ATOMIC_* macros.
      MEDIUM: stream: Use the new _HA_ATOMIC_* macros.
      MEDIUM: tcp_rules: Use the new _HA_ATOMIC_* macros.
      MEDIUM: time: Use the new _HA_ATOMIC_* macros.
      MEDIUM: vars: Use the new _HA_ATOMIC_* macros.
      MEDIUM: list: Remove useless barriers.
      MEDIUM: list: Use _HA_ATOMIC_*
      MEDIUM: connections: Use _HA_ATOMIC_*
      BUG/MAJOR: tasks: Use the TASK_GLOBAL flag to know if we're in the global rq.
      BUG/MEDIUM: tasks: Make sure we wake sleeping threads if needed.
      BUG/MINOR: doc: Be accurate on the behavior on pool-purge-delay.
      BUG/MEDIUM: mux-h2: Make sure we destroyed the h2s once shutr/shutw is done.
      BUG/MEDIUM: mux-h2: Don't bother keeping the h2s if detaching and nothing to send.
      BUG/MEDIUM: mux-h2: Use the right list in h2_stop_senders().
      BUG/MEDIUM: h2: Try to be fair when sending data.
      BUG/MEDIUM: h2: only destroy the h2s if h2s->cs is NULL.
      BUG/MEDIUM: h2: Use the new sending_list in h2s_notify_send().
      BUG/MEDIUM: h2: Follow the same logic in h2_deferred_shut than in h2_snd_buf.
      BUG/MEDIUM: h2: Remove the tasklet from the task list if unsubscribing.

Pierre Cheynier (1):
      BUG/MEDIUM: ssl: ability to set TLS 1.3 ciphers using ssl-default-server-ciphersuites

Radek Zajic (1):
      BUG/MINOR: log: properly format IPv6 address when LOG_OPT_HEXA modifier is used.

Tim Duesterhus (2):
      CLEANUP: http: Remove unreachable code in parse_http_req_capture
      CLEANUP: stream: Remove bogus loop in conn_si_send_proxy

Willy Tarreau (73):
      BUG/MINOR: listener: keep accept rate counters accurate under saturation
      DOC: fix alphabetic ordering for "tune.fail-alloc" setting
      MAJOR: config: disable support for nbproc and nbthread in parallel
      MEDIUM: listener: keep a single thread-mask and warn on "process" misuse
      MAJOR: listener: do not hold the listener lock in listener_accept()
      MINOR: listener: maintain a per-thread count of the number of connections on a listener
      MINOR: tools: implement functions to look up the nth bit set in a mask
      MINOR: listener: pre-compute some thread counts per bind_conf
      MINOR: listener: implement multi-queue accept for threads
      MAJOR: listener: use the multi-queue for multi-thread listeners
      MINOR: activity: add accept queue counters for pushed and overflows
      MINOR: config: add global tune.listener.multi-queue setting
      MAJOR: threads: enable one thread per CPU by default
      DOC: update management.txt to reflect that threads are used by default
      BUG/MINOR: config: don't over-count the global maxsock value
      BUG/MEDIUM: list: fix the rollback on addq in the locked liss
      BUG/MEDIUM: list: fix LIST_POP_LOCKED's removal of the last pointer
      BUG/MEDIUM: list: add missing store barriers when updating elements and head
      MINOR: list: make the delete and pop operations idempotent
      MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
      BUG/MEDIUM: listener: use a self-locked list for the dequeue lists
      BUG/MEDIUM: listener: make sure the listener never accepts too many conns
      BUG/MEDIUM: list: correct fix for LIST_POP_LOCKED's removal of last element
      MINOR: listener: introduce listener_backlog() to report the backlog value
      MINOR: listener: do not needlessly set l->maxconn
      MINOR: proxy: do not change the listeners' maxconn when updating the frontend's
      MEDIUM: config: don't enforce a low frontend maxconn value anymore
      MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max values
      BUG/MINOR: init: never lower rlim_fd_max
      BUG/MINOR: checks: make external-checks restore the original rlim_fd_cur/max
      BUG/MINOR: mworker: be careful to restore the original rlim_fd_cur/max on reload
      MINOR: init: make the maxpipe computation more accurate
      MINOR: init: move some maxsock updates earlier
      MEDIUM: init: make the global maxconn default to what rlim_fd_cur permits
      REGTEST: fix a spurious "nbthread 4" in the connection test
      DOC: update the text related to the global maxconn value
      BUG/MAJOR: mux-h2: fix race condition between close on both ends
      BUG/MEDIUM: list: fix again LIST_ADDQ_LOCKED
      MINOR: htx: unconditionally handle parsing errors in requests or responses
      MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR between h2s and buf on RX
      BUG/MEDIUM: h2/htx: verify that :path doesn't contain invalid chars
      CLEANUP: wurfl: remove dead, broken and unmaintained code
      MINOR: config: relax the range checks on cpu-map
      MINOR: lists: add a LIST_DEL_INIT() macro
      MINOR: task: use LIST_DEL_INIT() to remove a task from the queue
      MINOR: listener: improve incoming traffic distribution
      MINOR: tools: implement my_flsl()
      MEDIUM: listener: change the LB algorithm again to use two round robins instead
      CLEANUP: listener: remove old thread bit mapping
      MINOR: listener: move thr_idx from the bind_conf to the listener
      OPTIM: task: limit the impact of memory barriers in taks_remove_from_task_list()
      MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
      MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
      BUG/MEDIUM: list: fix incorrect pointer unlocking in LIST_DEL_LOCKED()
      BUG/MEDIUM: listener: make sure we don't pick stopped threads
      BUG/MEDIUM: threads/fd: do not forget to take into account epoll_fd/pipes
      BUG/MEDIUM: init/threads: consider epoll_fd/pipes for automatic maxconn calculation
      Revert "REGTEST: Enable reg tests with HEAD HTTP method usage."
      BUILD: listener: shut up a build warning when threads are disabled
      BUILD: Makefile: allow the reg-tests target to be verbose
      BUILD: Makefile: resolve LEVEL before calling run-regtests
      BUG/MINOR: http/counters: fix missing increment of fe->srv_aborts
      BUILD: tools: fix a build warning on some 32-bit archs
      MINOR: init: report the list of optionally available services
      CLEANUP: cache: don't export http_cache_applet anymore
      Revert "MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter"
      MINOR: mux-h2: copy small data blocks more often and reduce the number of pauses
      CLEANUP: mux-h2: add some comments to help understand the code
      BUG/MEDIUM: task/h2: add an idempotent task removal fucntion
      CLEANUP: task: only perform a LIST_DEL() when the list is not empty
      BUG/MEDIUM: mux-h2: make sure to always notify streams of EOS condition
      CONTRIB: debug: report the CS and CF's EOI flags
      MINOR: channel: don't unset CF_SHUTR_NOW after shutting down.

---