[ANNOUNCE] haproxy-2.1-dev5
Hi,

HAProxy 2.1-dev5 was released on 2019/11/15. It added 44 new commits after version 2.1-dev4.

So far so good, things are calming down. It has been a while without a 2.1-only bug report, which is very encouraging. We're in good shape for a release at the end of next week it seems. There is still some internal stuff I'd like to document better (or at all), a little bit of cleanup to perform in the myriad of #ifdef of the memory allocator, and the link to the bugs page to add in the output of "haproxy -v". So depending on how progress is made on this, it could be done late next week, or we'll be lazy and emit a final -dev next week and a release the week after.

Bah, I wanted to install it on haproxy.org to replace the freshly upgraded 2.0.9 and just found that I've been happily ignoring the warning about deprecation of rspirep/reqirep for many months... do what I say not what I do, as they say... At least I could see that the error message is clear and the instructions are helpful. So even just for this, you should give it a try with "-c" on your existing setups to see if you should expect any late surprise.
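For a first pass before running the real check, here is a minimal sketch (not part of HAProxy) that scans a configuration text for the two directives named above. The function name `find_deprecated` and the `DEPRECATED` tuple are illustrative assumptions; only rspirep and reqirep are covered because those are the only ones mentioned here, and "haproxy -c -f <file>" remains the authoritative check.

```python
# Hypothetical helper, not shipped with HAProxy: a plain text scan only.
# Only the two directives named in the announcement are listed; pass a
# different tuple as `keywords` to look for others.
DEPRECATED = ("reqirep", "rspirep")


def find_deprecated(config_text, keywords=DEPRECATED):
    """Return (line_number, keyword, line) for each configuration line
    whose first word is one of `keywords`.

    Blank lines and comment lines (starting with '#') are skipped.
    """
    hits = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        first_word = line.split(None, 1)[0]
        if first_word in keywords:
            hits.append((number, first_word, line))
    return hits
```

Feeding it the contents of an existing configuration file gives the line numbers to look at first; anything it misses will still be caught by the "-c" run.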
Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : http://www.haproxy.org/download/2.1/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.1/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Baptiste Assmann (2):
      BUG/MINOR: action: do-resolve now use cached response
      BUG: dns: timeout resolve not applied for valid resolutions

Christopher Faulet (21):
      BUG/MEDIUM: mux-h1: Disable splicing for chunked messages
      BUG/MEDIUM: stream: Be sure to support splicing at the mux level to enable it
      MINOR: flt_trace: Rename macros to print trace messages
      MINOR: trace: Add a set of macros to trace events if HA is compiled with debug
      MEDIUM: stream/trace: Register a new trace source with its events
      BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
      MINOR: http-ana: Remove the unused function http_reset_txn()
      BUG/MEDIUM: filters: Don't call TCP callbacks for HTX streams
      MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams
      MINOR: freq_ctr: Make the sliding window sums thread-safe
      MINOR: stream: Remove the lock on the proxy to update time stats
      MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time
      MINOR: stats: Report max times in addition of the averages for sessions
      MINOR: contrib/prometheus-exporter: Report metrics about max times for sessions
      BUG/MINOR: contrib/prometheus-exporter: Rename some metrics
      MINOR: contrib/prometheus-exporter: report the number of idle conns per server
      DOC: Add missing stats fields in the management manual
      BUG/MINOR: mux-h1: Properly catch parsing errors on payload and trailers
      BUG/MINOR: mux-h1: Don't set CS_FL_EOS on a read0 when receiving data to pipe
      MINOR: mux-h1: Set EOI on the conn-stream when EOS is reported in TUNNEL state
      MINOR: sink: Set the default max length for a message to BUFSIZE

Cyril Bonté (1):
      DOC: fix date and http_date keywords syntax

Cédric Dufour (1):
      MINOR: stick-table: allow sc-set-gpt0 to set value from an expression

Frédéric Lécaille (1):
      MINOR: peers: Add "log" directive to "peers" section.

Jerome Magnin (1):
      BUG/MINOR: stream: init variables when the list is empty

Lukas Tribus (1):
      MINOR: doc: http-reuse connection pool fix

Olivier Houchard (2):
      BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet.
      BUG/MEDIUM: Make sure we leave the session list in session_free().

William Lallemand (4):
      BUG/MEDIUM: ssl/cli: don't alloc path when cert not found
      BUG/MINOR: ssl/cli: unable to update a certificate without bundle extension
      BUG/MINOR: ssl/cli: fix an error when a file is not found
      MINOR: ssl/cli: replace the default_ctx during 'commit ssl cert'

Willy Tarreau (10):
      DOC: management: fix typo on "cache_lookups" stats output
      BUG/MINOR: queue/threads: make the queue unlinking atomic
      CLEANUP: session: slightly simplify idle connection cleanup logic
      MINOR: memory: also poison the area on freeing
      CLEANUP: cli: use srv_shutdown_streams() instead of open-coding it
      CLEANUP: stats: use srv_shutdown_streams() instead of open-coding it
      BUG/MEDIUM: listeners: always pause a listener on out-of-resource condition
      BUILD: contrib/da: remove an "unused" warning
      MINOR: ring: make the parse function automatically set the
Re: [PATCH] BUG/MINOR: ssl: fix crt-list neg filter for openssl < 1.1.1
On Wed, Nov 06, 2019 at 06:47:50PM +0100, Emmanuel Hocdet wrote:
> Hi,
>
> Very difficult to trigger the bug, except with specific test configuration
> like:
> crt-list:
> cert.pem !www.dom.tld
> cert.pem *.dom.tld
>
> If you can consider the patch.

Guys, I know that everyone has been very busy lately but at least giving me indications like "yes", "no", "let me check", "do as you want" or whatever could help. Letting candidate fixes rot for 9 days with no response is not cool, and while it will always happen once in a while anywhere, it systematically happens in the SSL subsystem. We definitely need to improve this situation :-(

Now it's too late for 2.0.9 and 2.1-dev5 anyway.

Thanks,
Willy
Re: [PATCH v3] MINOR: stick-table: allow sc-set-gpt0 to set value from an expression
On Fri, Nov 08, 2019 at 10:06:17AM +0100, Cédric Dufour wrote:
> You can go ahead with PATCH v3.

OK thanks, now merged!

> I triple-checked it against our use-case (along haproxy 2.0.5, the latest
> Ubuntu-packaged version which we base our re-packaging on) and all seems well.

You should definitely switch to haproxy.debian.net which is provided by the same maintainers, but with *really* updated packages, unless of course you like to live dangerously with bugs that only you and a few other users of these packages experience :-)

In case you want to feel a shiver down your spine, here is the list of the 100 known bugs affecting your currently packaged version:

   http://www.haproxy.org/bugs/bugs-2.0.5.html

> Thank you very much for your help and merging.
>
> Have a very nice day over there ;-)

You're welcome,
Willy
[ANNOUNCE] haproxy-2.0.9
Hi,

HAProxy 2.0.9 was released on 2019/11/15. It added 33 new commits after version 2.0.8.

Several problematic bugs still affecting 2.0 were found since 2.0.8, thus it's better to get rid of them now before everyone has already updated and has to do it again.

The main one affects the way outgoing connections are validated. We used to face several issues when dealing with retries during the development of 2.0, which have stacked upon each other until we figured they were wrong. Indeed, Christopher found a case where haproxy could enter an endless loop while trying to perform a connection retry after a protocol failure (typically trying to speak H2 to a server responding in H1). Well, the dog was watching, quickly biting that offending loop, but still...

Another one concerns idle connections with threads. There is a very hard to hit but definitely present race in the code closing a session and releasing the last connection to a server from this session. We managed to reproduce it by mixing queues, random server errors and server session terminations, all at maximum rate. The result is a double free of a struct srv_list which crashes haproxy.

It was also reported that splicing was broken with chunked encoding, and this revealed that we have a bit more complex work to do for 2.2 to fix it. For now it's simply disabled for chunked encoding, which is rarely noticeable in practice since most often, chunks do not come large enough to enable dynamic splicing.

Once in a while, someone reports that one (or a few) threads eat 100% CPU, mostly in system, showing an strace output in which it's visible that epoll_wait() reports activity for a listener but nothing is done. This bug was finally identified: it could happen when at least two distinct listeners are used to fill the process' connection limit. In this case, the one which has reached saturation last would return without disabling itself, and be called again immediately.
Note that in such a case, the CPU usage is just a byproduct of some limit already being reached, but it would definitely make the troubleshooting harder.

Connection retries over H2 connections experiencing a failed handshake or a GOAWAY frame were not possible because the data had already left. This is now fixed.

The rest is a bit less important and has less impact.

For those running on 2.1-dev, no need to downgrade, I'm going to issue another 2.1-dev ASAP.

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : http://www.haproxy.org/download/2.0/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.0.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.0.git
   Changelog        : http://www.haproxy.org/download/2.0/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Last minute note, for those tracking the git repo: I messed up my initial git push and had to force-push it again. Sorry about this. So do not worry if an automated git-pull script reports an error, the error was on my side.
Willy
---
Complete changelog :
Baptiste Assmann (2):
      BUG/MINOR: action: do-resolve now use cached response
      BUG: dns: timeout resolve not applied for valid resolutions

Christopher Faulet (7):
      BUG/MINOR: mux-h2: Don't pretend mux buffers aren't full anymore if nothing sent
      BUG/MAJOR: stream-int: Don't receive data from mux until SI_ST_EST is reached
      BUG/MEDIUM: mux-h1: Disable splicing for chunked messages
      BUG/MEDIUM: stream: Be sure to support splicing at the mux level to enable it
      BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
      BUG/MEDIUM: filters: Don't call TCP callbacks for HTX streams
      BUG/MINOR: mux-h1: Don't set CS_FL_EOS on a read0 when receiving data to pipe

Joao Morais (1):
      BUG/MINOR: config: Update cookie domain warn to RFC6265

Jérôme Magnin (2):
      DOC: management: document reuse and connect counters in the CSV format
      DOC: management: document cache_hits and cache_lookups in the CSV format

Lukas Tribus (1):
      MINOR: doc: http-reuse connection pool fix

Olivier Houchard (4):
      MINOR: mux: Add a new method to get informations about a mux.
      BUG/MEDIUM: stream_interface: Only use SI_ST_RDY when the mux is ready.
      BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.
      BUG/MEDIUM: Make sure we leave the session list in session_free().

William Dauchy (1):
      MINOR: tcp: avoid confusion in time parsing init

William Lallemand (1):
      BUG/MINOR: cli: don't call the kw->io_release if kw->parse failed

Willy Tarreau (14):
      MINOR: config: warn on presence of "\n" in header values/replacements
      BUG/MINOR: mux-h2: do not emit logs on backend connections
      BUG/MINOR: spoe:
native prometheus exporter: retrieving check_status
Hi list,

We've recently tried to switch to the native prometheus exporter, but were quickly stopped in our initiative given the output on one of our preprod servers:

$ wc -l metrics.out
1478543 metrics.out
$ ls -lh metrics.out
-rw-r--r-- 1 pierre pierre 130M nov. 15 15:33 metrics.out

This is not only due to a large setup, but essentially related to server lines, since we extensively use server-templates for server addition/deletion at runtime.

# backend & servers number
$ echo "show stat -1 2 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
1309
$ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
36360
# But a lot of them are actually "waiting to be provisioned" (especially on this preprod environment)
$ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | grep MAINT | wc -l
34113

We'll filter out the server metrics as a quick fix, and will hopefully submit something to do it natively, but we would also like to get your feedback about some use-cases we expected to solve with this native exporter.

Ultimately, one of them would add great value for us: being able to count check_status types (and their values in the L7STS case) per backend.

So, there are 3 associated points:

* it's great to have new metrics (such as `haproxy_process_current_zlib_memory`), but we also noticed that some very useful ones were not present due to their type, for example:

      [ST_F_CHECK_STATUS] = IST("untyped"),

  What could be done to be able to retrieve them? (I thought about something similar to `HRSP_[1-5]XX`, where the different check statuses could be defined and counted).

* also for `check_status`, there is the case of L7STS and its associated values that are present in another field. Most probably it could benefit from a better representation in a prometheus output (thanks to labels)?

* what about getting some backend-level aggregation of server metrics, such as the one that was previously mentioned, to avoid retrieving all the server metrics but still be able to get some insights? I'm thinking about an aggregation of some fields at backend level, which was not previously done with the CSV output.

Thanks for your feedback,
Pierre
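
Until something like this exists natively, the per-backend count of check_status described above could be computed outside haproxy from the CSV output of "show stat". The following is a minimal sketch under stated assumptions, not part of the exporter: the function name `count_check_status` and the "L7STS/<code>" label format are made up for illustration, and it relies on the `pxname`, `type`, `check_status` and `check_code` columns of the CSV format.

```python
import csv
import io
from collections import Counter, defaultdict

# In the "show stat" CSV, the "type" column is 0 for a frontend,
# 1 for a backend, 2 for a server and 3 for a listener.
SERVER_TYPE = "2"


def count_check_status(csv_text):
    """Map each backend name to a Counter of its servers' check statuses.

    For L7STS, the code found in check_code is appended (e.g. "L7STS/503").
    Servers with an empty check_status are counted under "none".
    """
    # The CSV header line starts with "# "; drop it so the first column
    # is named "pxname".
    text = csv_text.lstrip()
    if text.startswith("# "):
        text = text[2:]

    per_backend = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(text)):
        if row.get("type") != SERVER_TYPE:
            continue
        # A check currently running is reported as "* <last status>".
        status = (row.get("check_status") or "").lstrip("* ").strip() or "none"
        if status == "L7STS" and row.get("check_code"):
            status = "L7STS/" + row["check_code"]
        per_backend[row["pxname"]][status] += 1
    return per_backend
```

The input would be the text returned by the stats socket for "show stat"; the result is small enough to be exposed as one labelled metric per backend and status, which avoids scraping every server line.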