Hi,

HAProxy 2.0-dev2 was released on 2019/03/26. It added 176 new commits
after version 2.0-dev1.

This version starts to bring more significant changes. One of the most
visible ones is that haproxy will now automatically start with threads
enabled if neither "nbthread" nor "nbproc" is configured. It will check
the number of CPUs it's running on and will start as many threads. This
means that it will no longer be necessary to edit the configuration to
adjust the number of threads and the CPUs they are bound to: by just
setting the affinity in the service's configuration, haproxy will
automatically adapt and use the same number of threads. On systems where
haproxy cannot retrieve the affinity information it will still default
to a single thread.
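
To illustrate the idea only, here is a minimal Python sketch of that
startup decision. It is not haproxy's code (which is written in C); the
name default_thread_count() is made up for this example, and it assumes
os.sched_getaffinity(), which only exists on some platforms such as
Linux:

```python
import os

def default_thread_count():
    """Return one thread per CPU the process may run on, else 1.

    Hypothetical helper for illustration, not part of haproxy.
    """
    try:
        # Set of CPU ids the current process (pid 0 = self) may run on.
        allowed = os.sched_getaffinity(0)
    except (AttributeError, OSError):
        # Affinity information is unavailable: default to one thread.
        return 1
    return max(1, len(allowed))
```

Started under something like "taskset -c 0-3", such a function would
return 4, which is the "set the affinity in the service and let the
process adapt" behaviour described above.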

A small byproduct of this change is that nbproc and nbthread are now
mutually exclusive. We experimented with both at the same time in 1.8
and it was totally pointless since it keeps all the problems caused by
processes and causes many internal difficulties as one can imagine.
I'm still thinking about how we could further simplify the "process" and
"cpu-map" directives based on this, though keeping "1/X" in the values
is not a big deal.

Some of you might be wondering "but how am I supposed to bind the
listeners now?". The answer is that there is now some thread load
balancing in the listener's accept queue. Thus by default, you can have
a single "bind" line with no "process" setting and multiple threads, and
the accept() code will distribute the connection load to the threads
based on their respective number of connections. I found this to address
an issue I've been facing with h2load from the very beginning, where the
traffic was never evenly spread: the test would start fast and slow down
at the end, because some threads had more connections than others and at
the end of the test only one or two threads were left finishing alone.
Now this issue is gone because all threads get the same number of
connections, and the performance is extremely stable across tests. It's
so stable that I managed to get more than one million requests per
second out of the cache on my laptop ;-)
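
As a rough illustration of "distribute based on the number of
connections", here is a deliberately simplified Python sketch. It is an
assumption-laden toy, not the real accept() code: the changelog below
mentions that the actual algorithm relies on two round robins, and the
names pick_thread() and distribute() are invented for this example:

```python
def pick_thread(conn_counts):
    """Return the index of the thread with the fewest connections.

    conn_counts[i] is thread i's current number of connections.
    Ties go to the lowest index. Hypothetical helper.
    """
    return min(range(len(conn_counts)), key=lambda i: conn_counts[i])

def distribute(new_conns, nb_threads):
    """Assign new_conns incoming connections, one at a time."""
    counts = [0] * nb_threads
    for _ in range(new_conns):
        counts[pick_thread(counts)] += 1
    return counts
```

With 10 connections and 4 threads, distribute(10, 4) gives [3, 3, 2, 2]:
no thread ever holds more than one connection above any other.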

One nice effect of this automatic traffic distribution is that haproxy
can now much better share its CPUs with the network stack. In the past,
either you had one single socket and the traffic was not evenly spread,
or you had multiple sockets with a "process" directive and the traffic
was distributed in round-robin by the system. But when some cores are
highly loaded and others less (e.g. due to SSL traffic), the round
robin gives quite bad results and overloads already loaded threads.
Here instead the traffic remains very smooth since highly loaded
threads will get fewer new connections.
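
The difference between the two behaviours can be shown with a small
Python toy model. It assumes that a thread's load is simply its number
of connections, which is a simplification, and both function names are
invented for this example:

```python
def round_robin(counts, new_conns):
    """Hand out connections in turn, ignoring how busy each thread is."""
    counts = list(counts)
    for n in range(new_conns):
        counts[n % len(counts)] += 1
    return counts

def least_loaded(counts, new_conns):
    """Give each connection to the thread with the fewest connections."""
    counts = list(counts)
    for _ in range(new_conns):
        idx = counts.index(min(counts))  # lowest index wins on ties
        counts[idx] += 1
    return counts

busy = [6, 0, 0, 0]  # thread 0 is already busy, e.g. with SSL traffic
print(round_robin(busy, 6))   # [8, 2, 1, 1]
print(least_loaded(busy, 6))  # [6, 2, 2, 2]
```

Round robin keeps feeding the already busy thread, while the load-aware
choice leaves it alone until the others have caught up.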

Of course it is still possible to continue to bind the sockets by hand,
and it still gives slightly higher raw performance since it skips the
incoming load balancing step. But when we're talking about hundreds of
thousands of connections per second, most people don't care about such a
difference when the limits are already 100 to 1000 times higher than
their needs, and I expect to see trivial configs re-appear over time.

Another visible change in -dev2 is that the frontend's and global
maxconn values are now automatically set. We're indeed seeing far too
often people not setting the global maxconn value, keeping an
inappropriately low limit, while at the same time those like us who
develop are used to seeing warnings all the time that their maxconn is
too high for their ulimit. So this means that the default value of 2000
is suitable for nobody. Now when there's no maxconn, the default value
will be automatically calculated based on the number of FDs allocated to
the process by "ulimit -n". This can be set in service settings on many
systems so this is another resource limit that will not require a
configuration change anymore. For example on systemd it seems to be
LimitNOFILE.
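
The exact formula is not given here, so the following Python sketch only
shows the general shape of such a calculation. Everything in it is an
assumption made for illustration: the function names, the idea that one
proxied connection costs two FDs (client side and server side), and the
arbitrary reserve of 100 FDs for listeners, pollers and the like:

```python
import resource

def estimate_maxconn(fd_limit, reserved_fds=100, fds_per_conn=2):
    """Estimate how many connections fit within fd_limit descriptors.

    reserved_fds and fds_per_conn are illustrative guesses, not
    haproxy's real numbers.
    """
    usable = fd_limit - reserved_fds
    return max(usable // fds_per_conn, 0)

def current_fd_limit():
    """Return the soft limit reported by "ulimit -n" for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```

With these assumed numbers, a limit of 1024 FDs gives
estimate_maxconn(1024) == 462, and raising the limit in the service
settings raises the result without touching the configuration.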

And now the frontend's default maxconn, which most people don't set
because they believe it's the same as the global maxconn, will be the
global maxconn (configured or calculated). So this means that now, a
config which doesn't set a maxconn will have the maximum number of
possible conns set correctly by default. I've heard that maxconn was
one of the most difficult settings to get right in Docker images, so
let's hope that it will be much more turn-key now :-)

This version also contains a significant number of fixes, part of which
were already merged into 1.9.5 and others which I expect to see soon
in 1.9.6.

Among these fixes, we managed to address the trouble caused to the
abortonclose option in 1.8 when H2 was introduced. In short, we now
have a separate flag and no longer pretend that an input stream is
closed at the end of the request: we distinguish the end of the message
from the end of the stream. We intend to backport this to the next 1.9
if no issue is reported, which I'm now confident about given the time we
spent chasing various issues that this could address.

There were quite a number of other issues in H2. One of them was a
fairness problem which very likely is the cause of the uneven response
times that were reported here by Ashwin. Due to the mux buffer having to
take traffic from many streams, the list of pending streams was moved
back and forth and failed streams were postponed to the end of the list
until they possibly expired. This was addressed so that streams don't
starve anymore and keep the fairness they had in 1.8. Now we don't see
timeouts anymore on long transfers and the standard deviation on request
time seems way lower. Another issue used to affect outgoing connections,
where an abort of the connection would not wake up the streams which
didn't yet have an ID assigned, causing some apparent connect timeouts
there.

The HTX fast-forwarding (especially with H2) was re-enabled thanks to
all the fixes that touched H2. The transfer performance will be much
better, especially from servers which don't announce a content-length
since we won't need to pass each frame through the analysers.

Overall this release mostly touches the lower layers and serves as a
preview for the upcoming 1.9 fixes. I'm now seeing the pieces of the
puzzle assemble much better than they used to, and I'm seeing fewer
brown-paper-bag bugs in sensitive areas. This generally is a good
indication that things are getting better and more solid!

So if you're facing issues with H2 or HTX in 1.9 or 2.0-dev, please give
this one a try (and keep in mind that it's development code so if you
try it on a production server, make sure it's closely watched and that
you keep enough servers on a stable version). I intend to backport the
H2 and HTX fixes to 1.9 by the end of the week if everything goes well.

Have fun!

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : http://www.haproxy.org/download/2.0/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.0/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Christopher Faulet (31):
      BUG/MINOR: mux-h1: Don't report an error on EOS if no message was received
      BUG/MINOR: stats/htx: Call channel_add_input() when response headers are sent
      BUG/MINOR: lua/htx: Use channel_add_input() when response data are added
      BUG/MINOR: lua/htx: Don't forget to call htx_to_buf() when appropriate
      MINOR: stats: Add the status code STAT_STATUS_IVAL to handle invalid requests
      MINOR: stats: Move stuff about the stats status codes in stats files
      BUG/MINOR: stats: Be more strict on what is a valid request to the stats applet
      BUG/MAJOR: spoe: Fix initialization of thread-dependent fields
      BUG/MAJOR: stats: Fix how huge POST data are read from the channel
      BUG/MEDIUM: mux-h2: Always wakeup streams with no id to avoid frozen streams
      MINOR: mux-h2: Set REFUSED_STREAM error to reset a stream if no data was never sent
      MINOR: muxes: Report the Last read with a dedicated flag
      MINOR: proto-http/proto-htx: Make error handling clearer during data forwarding
      MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter
      BUG/MINOR: cache: Fully consume large requests in the cache applet
      BUG/MINOR: stats: Fully consume large requests in the stats applet
      BUG/MEDIUM: lua: Fully consume large requests when an HTTP applet ends
      MINOR: proto_http: Add function to handle the header "Expect: 100-continue"
      MINOR: proto_htx: Add function to handle the header "Expect: 100-continue"
      MINOR: stats/cache: Handle the header Expect when applets are registered
      MINOR: http/applets: Handle all applets intercepting HTTP requests the same way
      MINOR: lua: Don't handle the header Expect in lua HTTP applets anymore
      BUG/MINOR: proto-http: Don't forward request body anymore on error
      MINOR: mux-h2: Remove useless test on ES flag in h2_frt_transfer_data()
      MINOR: connection: and new flag to mark end of input (EOI)
      MINOR: channel: Report EOI on the input channel if it was reached in the mux
      MEDIUM: mux-h2: Don't mix the end of the message with the end of stream
      MINOR: mux-h1: Set CS_FL_EOI the end of the message is reached
      BUG/MEDIUM: http/htx: Fix handling of the option abortonclose
      CLEANUP: muxes/stream-int: Remove flags CS_FL_READ_NULL and SI_FL_READ_NULL
      MEDIUM: proto_htx: Reintroduce the infinite forwarding on data

Dragan Dosen (1):
      BUG/MEDIUM: 51d: fix possible segfault on deinit_51degrees()

Frédéric Lécaille (11):
      BUG/MEDIUM: standard: Wrong reallocation size.
      MINOR: peers: Add a message for heartbeat.
      MINOR: sample: Replace "req.ungrpc" smp fetch by a "ungrpc" converter.
      MINOR: sample: Code factorization "ungrpc" converter.
      MINOR: sample: Rework gRPC converter code.
      MINOR: sample: Extract some protocol buffers specific code.
      DOC: Remove tabs and fixed punctuation.
      MINOR: sample: Add a protocol buffers specific converter.
      REGTEST: Peers reg tests.
      REGTEST: Enable reg tests with HEAD HTTP method usage.
      BUG/MAJOR: config: Wrong maxconn adjustment.

Lukas Tribus (1):
      BUG/MINOR: ssl: fix warning about ssl-min/max-ver support

Olivier Houchard (55):
      MINOR: lists: Implement locked variations.
      MEDIUM: servers: Used a locked list for idle_orphan_conns.
      MEDIUM: servers: Reorganize the way idle connections are cleaned.
      BUG/MEDIUM: lists: Properly handle the case we're removing the first elt.
      MINOR: cfgparse: Add a cast to make gcc happier.
      BUG/MEDIUM: logs: Only attempt to free startup_logs once.
      MINOR: fd: Remove debugging code.
      BUG/MEDIUM: listeners: Don't call fd_stop_recv() if fd_updt is NULL.
      MINOR: threads: Implement __ha_barrier_atomic*.
      MEDIUM: threads: Use __ATOMIC_SEQ_CST when using the newer atomic API.
      MINOR: threads: Add macros to do atomic operation with no memory barrier.
      MEDIUM: various: Use __ha_barrier_atomic* when relevant.
      MEDIUM: applets: Use the new _HA_ATOMIC_* macros.
      MEDIUM: xref: Use the new _HA_ATOMIC_* macros.
      MEDIUM: fd: Use the new _HA_ATOMIC_* macros.
      MEDIUM: freq_ctr: Use the new _HA_ATOMIC_* macros.
      MEDIUM: proxy: Use the new _HA_ATOMIC_* macros.
      MEDIUM: server: Use the new _HA_ATOMIC_* macros.
      MEDIUM: task: Use the new _HA_ATOMIC_* macros.
      MEDIUM: activity: Use the new _HA_ATOMIC_* macros.
      MEDIUM: backend: Use the new _HA_ATOMIC_* macros.
      MEDIUM: cache: Use the new _HA_ATOMIC_* macros.
      MEDIUM: checks: Use the new _HA_ATOMIC_* macros.
      MEDIUM: pollers: Use the new _HA_ATOMIC_* macros.
      MEDIUM: compression: Use the new _HA_ATOMIC_* macros.
      MEDIUM: spoe: Use the new _HA_ATOMIC_* macros.
      MEDIUM: threads: Use the new _HA_ATOMIC_* macros.
      MEDIUM: http: Use the new _HA_ATOMIC_* macros.
      MEDIUM: lb/threads: Use the new _HA_ATOMIC_* macros.
      MEDIUM: listeners: Use the new _HA_ATOMIC_* macros.
      MEDIUM: logs: Use the new _HA_ATOMIC_* macros.
      MEDIUM: memory: Use the new _HA_ATOMIC_* macros.
      MEDIUM: peers: Use the new _HA_ATOMIC_* macros.
      MEDIUM: proto_tcp: Use the new _HA_ATOMIC_* macros.
      MEDIUM: queues: Use the new _HA_ATOMIC_* macros.
      MEDIUM: sessions: Use the new _HA_ATOMIC_* macros.
      MEDIUM: ssl: Use the new _HA_ATOMIC_* macros.
      MEDIUM: stream: Use the new _HA_ATOMIC_* macros.
      MEDIUM: tcp_rules: Use the new _HA_ATOMIC_* macros.
      MEDIUM: time: Use the new _HA_ATOMIC_* macros.
      MEDIUM: vars: Use the new _HA_ATOMIC_* macros.
      MEDIUM: list: Remove useless barriers.
      MEDIUM: list: Use _HA_ATOMIC_*
      MEDIUM: connections: Use _HA_ATOMIC_*
      BUG/MAJOR: tasks: Use the TASK_GLOBAL flag to know if we're in the global rq.
      BUG/MEDIUM: tasks: Make sure we wake sleeping threads if needed.
      BUG/MINOR: doc: Be accurate on the behavior on pool-purge-delay.
      BUG/MEDIUM: mux-h2: Make sure we destroyed the h2s once shutr/shutw is done.
      BUG/MEDIUM: mux-h2: Don't bother keeping the h2s if detaching and nothing to send.
      BUG/MEDIUM: mux-h2: Use the right list in h2_stop_senders().
      BUG/MEDIUM: h2: Try to be fair when sending data.
      BUG/MEDIUM: h2: only destroy the h2s if h2s->cs is NULL.
      BUG/MEDIUM: h2: Use the new sending_list in h2s_notify_send().
      BUG/MEDIUM: h2: Follow the same logic in h2_deferred_shut than in h2_snd_buf.
      BUG/MEDIUM: h2: Remove the tasklet from the task list if unsubscribing.

Pierre Cheynier (1):
      BUG/MEDIUM: ssl: ability to set TLS 1.3 ciphers using ssl-default-server-ciphersuites

Radek Zajic (1):
      BUG/MINOR: log: properly format IPv6 address when LOG_OPT_HEXA modifier is used.

Tim Duesterhus (2):
      CLEANUP: http: Remove unreachable code in parse_http_req_capture
      CLEANUP: stream: Remove bogus loop in conn_si_send_proxy

Willy Tarreau (73):
      BUG/MINOR: listener: keep accept rate counters accurate under saturation
      DOC: fix alphabetic ordering for "tune.fail-alloc" setting
      MAJOR: config: disable support for nbproc and nbthread in parallel
      MEDIUM: listener: keep a single thread-mask and warn on "process" misuse
      MAJOR: listener: do not hold the listener lock in listener_accept()
      MINOR: listener: maintain a per-thread count of the number of connections on a listener
      MINOR: tools: implement functions to look up the nth bit set in a mask
      MINOR: listener: pre-compute some thread counts per bind_conf
      MINOR: listener: implement multi-queue accept for threads
      MAJOR: listener: use the multi-queue for multi-thread listeners
      MINOR: activity: add accept queue counters for pushed and overflows
      MINOR: config: add global tune.listener.multi-queue setting
      MAJOR: threads: enable one thread per CPU by default
      DOC: update management.txt to reflect that threads are used by default
      BUG/MINOR: config: don't over-count the global maxsock value
      BUG/MEDIUM: list: fix the rollback on addq in the locked liss
      BUG/MEDIUM: list: fix LIST_POP_LOCKED's removal of the last pointer
      BUG/MEDIUM: list: add missing store barriers when updating elements and head
      MINOR: list: make the delete and pop operations idempotent
      MINOR: server: remove a few unneeded LIST_INIT calls after LIST_DEL_LOCKED
      BUG/MEDIUM: listener: use a self-locked list for the dequeue lists
      BUG/MEDIUM: listener: make sure the listener never accepts too many conns
      BUG/MEDIUM: list: correct fix for LIST_POP_LOCKED's removal of last element
      MINOR: listener: introduce listener_backlog() to report the backlog value
      MINOR: listener: do not needlessly set l->maxconn
      MINOR: proxy: do not change the listeners' maxconn when updating the frontend's
      MEDIUM: config: don't enforce a low frontend maxconn value anymore
      MINOR: global: keep a copy of the initial rlim_fd_cur and rlim_fd_max values
      BUG/MINOR: init: never lower rlim_fd_max
      BUG/MINOR: checks: make external-checks restore the original rlim_fd_cur/max
      BUG/MINOR: mworker: be careful to restore the original rlim_fd_cur/max on reload
      MINOR: init: make the maxpipe computation more accurate
      MINOR: init: move some maxsock updates earlier
      MEDIUM: init: make the global maxconn default to what rlim_fd_cur permits
      REGTEST: fix a spurious "nbthread 4" in the connection test
      DOC: update the text related to the global maxconn value
      BUG/MAJOR: mux-h2: fix race condition between close on both ends
      BUG/MEDIUM: list: fix again LIST_ADDQ_LOCKED
      MINOR: htx: unconditionally handle parsing errors in requests or responses
      MINOR: mux-h2: always pass HTX_FL_PARSING_ERROR between h2s and buf on RX
      BUG/MEDIUM: h2/htx: verify that :path doesn't contain invalid chars
      CLEANUP: wurfl: remove dead, broken and unmaintained code
      MINOR: config: relax the range checks on cpu-map
      MINOR: lists: add a LIST_DEL_INIT() macro
      MINOR: task: use LIST_DEL_INIT() to remove a task from the queue
      MINOR: listener: improve incoming traffic distribution
      MINOR: tools: implement my_flsl()
      MEDIUM: listener: change the LB algorithm again to use two round robins instead
      CLEANUP: listener: remove old thread bit mapping
      MINOR: listener: move thr_idx from the bind_conf to the listener
      OPTIM: task: limit the impact of memory barriers in taks_remove_from_task_list()
      MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
      MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
      BUG/MEDIUM: list: fix incorrect pointer unlocking in LIST_DEL_LOCKED()
      BUG/MEDIUM: listener: make sure we don't pick stopped threads
      BUG/MEDIUM: threads/fd: do not forget to take into account epoll_fd/pipes
      BUG/MEDIUM: init/threads: consider epoll_fd/pipes for automatic maxconn calculation
      Revert "REGTEST: Enable reg tests with HEAD HTTP method usage."
      BUILD: listener: shut up a build warning when threads are disabled
      BUILD: Makefile: allow the reg-tests target to be verbose
      BUILD: Makefile: resolve LEVEL before calling run-regtests
      BUG/MINOR: http/counters: fix missing increment of fe->srv_aborts
      BUILD: tools: fix a build warning on some 32-bit archs
      MINOR: init: report the list of optionally available services
      CLEANUP: cache: don't export http_cache_applet anymore
      Revert "MEDIUM: proto_htx: Switch to infinite forwarding if there is no data filter"
      MINOR: mux-h2: copy small data blocks more often and reduce the number of pauses
      CLEANUP: mux-h2: add some comments to help understand the code
      BUG/MEDIUM: task/h2: add an idempotent task removal fucntion
      CLEANUP: task: only perform a LIST_DEL() when the list is not empty
      BUG/MEDIUM: mux-h2: make sure to always notify streams of EOS condition
      CONTRIB: debug: report the CS and CF's EOI flags
      MINOR: channel: don't unset CF_SHUTR_NOW after shutting down.
