[ANNOUNCE] haproxy-2.1-dev5

2019-11-15 Thread Willy Tarreau
Hi,

HAProxy 2.1-dev5 was released on 2019/11/15. It added 44 new commits
after version 2.1-dev4.

So far so good, things are calming down. It has been a while without a
2.1-only bug report, which is very encouraging. We're in good shape for a
release at the end of next week it seems. There is still some internal
stuff I'd like to document better (or at all), a little bit of cleanup to
perform in the myriad of #ifdef of the memory allocator, and the link to
the bugs page to add in the output of "haproxy -v". So depending on how
progress is made on this, it could be done late next week, or we'll be
lazy and emit a final -dev next week and a release the week after.

Bah I wanted to install it on haproxy.org to replace the freshly upgraded
2.0.9 and just found that I've been happily ignoring the warning about
deprecation of rspirep/reqirep for many months... do what I say not what I
do as they say... At least I could see that the error message is clear and
the instructions are helpful. So even just for this, you should give it a
try with "-c" on your existing setups to see if you should expect any late
surprise.
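
A minimal sketch of such a check follows. The path /etc/haproxy/haproxy.cfg and the helper name check_config are assumptions for illustration, so adjust them to your setup. It relies only on the "-c" (check mode) and "-f" (config file) options of the haproxy binary.

```shell
# Hypothetical default path; point CFG at your own configuration file.
CFG="${CFG:-/etc/haproxy/haproxy.cfg}"

# check_config (hypothetical helper): parse the given configuration file in
# check mode without binding any listener. The exit status is zero when the
# configuration is accepted; diagnostics about rejected or deprecated
# directives are printed by haproxy itself.
check_config() {
    haproxy -c -f "$1"
}

# Example (not run here): check_config "$CFG" && echo "config OK"
```

Running it with the new binary against the current production file is enough to see whether leftover reqirep/rspirep lines would block the upgrade.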

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : http://www.haproxy.org/download/2.1/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.1/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Baptiste Assmann (2):
  BUG/MINOR: action: do-resolve now use cached response
  BUG: dns: timeout resolve not applied for valid resolutions

Christopher Faulet (21):
  BUG/MEDIUM: mux-h1: Disable splicing for chunked messages
  BUG/MEDIUM: stream: Be sure to support splicing at the mux level to enable it
  MINOR: flt_trace: Rename macros to print trace messages
  MINOR: trace: Add a set of macros to trace events if HA is compiled with debug
  MEDIUM: stream/trace: Register a new trace source with its events
  BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
  MINOR: http-ana: Remove the unused function http_reset_txn()
  BUG/MEDIUM: filters: Don't call TCP callbacks for HTX streams
  MEDIUM: filters: Adapt filters API to allow again TCP filtering on HTX streams
  MINOR: freq_ctr: Make the sliding window sums thread-safe
  MINOR: stream: Remove the lock on the proxy to update time stats
  MINOR: counters: Add fields to store the max observed for {q,c,d,t}_time
  MINOR: stats: Report max times in addition of the averages for sessions
  MINOR: contrib/prometheus-exporter: Report metrics about max times for sessions
  BUG/MINOR: contrib/prometheus-exporter: Rename some metrics
  MINOR: contrib/prometheus-exporter: report the number of idle conns per server
  DOC: Add missing stats fields in the management manual
  BUG/MINOR: mux-h1: Properly catch parsing errors on payload and trailers
  BUG/MINOR: mux-h1: Don't set CS_FL_EOS on a read0 when receiving data to pipe
  MINOR: mux-h1: Set EOI on the conn-stream when EOS is reported in TUNNEL state
  MINOR: sink: Set the default max length for a message to BUFSIZE

Cyril Bonté (1):
  DOC: fix date and http_date keywords syntax

Cédric Dufour (1):
  MINOR: stick-table: allow sc-set-gpt0 to set value from an expression

Frédéric Lécaille (1):
  MINOR: peers: Add "log" directive to "peers" section.

Jerome Magnin (1):
  BUG/MINOR: stream: init variables when the list is empty

Lukas Tribus (1):
  MINOR: doc: http-reuse connection pool fix

Olivier Houchard (2):
  BUG/MEDIUM: tasks: Make tasklet_remove_from_tasklet_list() no matter the tasklet.
  BUG/MEDIUM: Make sure we leave the session list in session_free().

William Lallemand (4):
  BUG/MEDIUM: ssl/cli: don't alloc path when cert not found
  BUG/MINOR: ssl/cli: unable to update a certificate without bundle extension
  BUG/MINOR: ssl/cli: fix an error when a file is not found
  MINOR: ssl/cli: replace the default_ctx during 'commit ssl cert'

Willy Tarreau (10):
  DOC: management: fix typo on "cache_lookups" stats output
  BUG/MINOR: queue/threads: make the queue unlinking atomic
  CLEANUP: session: slightly simplify idle connection cleanup logic
  MINOR: memory: also poison the area on freeing
  CLEANUP: cli: use srv_shutdown_streams() instead of open-coding it
  CLEANUP: stats: use srv_shutdown_streams() instead of open-coding it
  BUG/MEDIUM: listeners: always pause a listener on out-of-resource condition
  BUILD: contrib/da: remove an "unused" warning
  MINOR: ring: make the parse function automatically set the 

Re: [PATCH] BUG/MINOR: ssl: fix crt-list neg filter for openssl < 1.1.1

2019-11-15 Thread Willy Tarreau
On Wed, Nov 06, 2019 at 06:47:50PM +0100, Emmanuel Hocdet wrote:
> Hi,
> 
> Very difficult to trigger the bug, except with a specific test configuration
> like:
> crt-list:
> cert.pem !www.dom.tld
> cert.pem *.dom.tld
> 
> If you can consider the patch.

Guys, I know that everyone has been very busy lately but at least giving
me indications like "yes", "no", "let me check", "do as you want" or
whatever could help. Letting candidate fixes rot for 9 days with no
response is not cool, and while it will always happen once in a while
anywhere, it systematically happens in the SSL subsystem. We definitely
need to improve this situation :-(

Now it's too late for 2.0.9 and 2.1-dev5 anyway.

Thanks,
Willy



Re: [PATCH v3] MINOR: stick-table: allow sc-set-gpt0 to set value from an expression

2019-11-15 Thread Willy Tarreau
On Fri, Nov 08, 2019 at 10:06:17AM +0100, Cédric Dufour wrote:
> You can go ahead with PATCH v3.

OK thanks, now merged!

> I triple-checked it against our use-case (along haproxy 2.0.5, the latest
> Ubuntu-packaged version which we base our re-packaging on) and all seems well.

You should definitely switch to haproxy.debian.net which is provided
by the same maintainers, but with *really* updated packages, unless of
course you like to live dangerously with bugs that only you and a few
other users of these packages experience :-)

In case you want to feel a shiver down your spine, here is the list
of the 100 known bugs affecting your currently packaged version:

http://www.haproxy.org/bugs/bugs-2.0.5.html

> Thank you very much for your help and merging.
> 
> Have a very good day over there ;-)

You're welcome,
Willy



[ANNOUNCE] haproxy-2.0.9

2019-11-15 Thread Willy Tarreau
Hi,

HAProxy 2.0.9 was released on 2019/11/15. It added 33 new commits
after version 2.0.8.

Several problematic bugs still affecting 2.0 were found since 2.0.8 thus
it's better to get rid of them now before everyone has already updated and
has to do it again.

The main one affects the way outgoing connections are validated. We used to
face several issues when dealing with retries during the development of 2.0,
which have stacked upon each other until we figured they were wrong. Indeed,
Christopher found a case where haproxy could enter an endless loop while
trying to perform a connection retry after a protocol failure (typically
try to speak H2 to a server responding in H1). Well, the dog was watching,
quickly biting that offending loop, but still...

Another one concerns idle connections with threads. There is a very difficult
to meet but definitely present race in the code closing a session and releasing
the last connection to a server from this session. We managed to reproduce it
by mixing queues, random server errors and server session terminations, all at
maximum rate. The result is a double free of a struct srv_list which crashes
haproxy.

It was also reported that splicing was broken with chunked encoding, and this
revealed that we have a bit more complex work to do for 2.2 to fix it. For now
it's simply disabled for chunked encoding, which is rarely noticeable in
practice since most often, chunks do not come large enough to enable dynamic
splicing.

Once in a while, someone reports that one (or a few) thread eats 100% CPU
mostly in system, showing an strace output in which it's visible that
epoll_wait() reports activity for a listener but nothing is done. This bug
was finally identified: it could happen when at least two distinct listeners
are used to fill the process' connection limit. In this case, the one which
has reached saturation last would return without disabling itself, and be
called again immediately. Note that in such a case, the CPU usage is just a
byproduct of some limit already being reached, but it would definitely make
the troubleshooting harder.

Connection retries over H2 connections experiencing a failed handshake or a
GOAWAY frame were not possible because the data had already left. This is
now fixed.

The rest is a bit less important and has less impact. For those running on
2.1-dev, no need to downgrade, I'm going to issue another 2.1-dev ASAP.

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : http://www.haproxy.org/download/2.0/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.0.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.0.git
   Changelog        : http://www.haproxy.org/download/2.0/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Last minute note, for those tracking the git repo, I messed up my initial git
push and had to do it again in force. Sorry about this. Thus do not worry
in case one automated git-pull script reports an error, the error was on my
side.

Willy
---
Complete changelog :
Baptiste Assmann (2):
  BUG/MINOR: action: do-resolve now use cached response
  BUG: dns: timeout resolve not applied for valid resolutions

Christopher Faulet (7):
  BUG/MINOR: mux-h2: Don't pretend mux buffers aren't full anymore if nothing sent
  BUG/MAJOR: stream-int: Don't receive data from mux until SI_ST_EST is reached
  BUG/MEDIUM: mux-h1: Disable splicing for chunked messages
  BUG/MEDIUM: stream: Be sure to support splicing at the mux level to enable it
  BUG/MEDIUM: stream: Be sure to release allocated captures for TCP streams
  BUG/MEDIUM: filters: Don't call TCP callbacks for HTX streams
  BUG/MINOR: mux-h1: Don't set CS_FL_EOS on a read0 when receiving data to pipe

Joao Morais (1):
  BUG/MINOR: config: Update cookie domain warn to RFC6265

Jérôme Magnin (2):
  DOC: management: document reuse and connect counters in the CSV format
  DOC: management: document cache_hits and cache_lookups in the CSV format

Lukas Tribus (1):
  MINOR: doc: http-reuse connection pool fix

Olivier Houchard (4):
  MINOR: mux: Add a new method to get informations about a mux.
  BUG/MEDIUM: stream_interface: Only use SI_ST_RDY when the mux is ready.
  BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready.
  BUG/MEDIUM: Make sure we leave the session list in session_free().

William Dauchy (1):
  MINOR: tcp: avoid confusion in time parsing init

William Lallemand (1):
  BUG/MINOR: cli: don't call the kw->io_release if kw->parse failed

Willy Tarreau (14):
  MINOR: config: warn on presence of "\n" in header values/replacements
  BUG/MINOR: mux-h2: do not emit logs on backend connections
  BUG/MINOR: spoe: 

native prometheus exporter: retrieving check_status

2019-11-15 Thread Pierre Cheynier
Hi list,

We've recently tried to switch to the native prometheus exporter, but were 
quickly stopped in our initiative given the output on one of our preprod servers:

$ wc -l metrics.out 
1478543 metrics.out
$ ls -lh metrics.out 
-rw-r--r-- 1 pierre pierre 130M nov.  15 15:33 metrics.out

This is not only due to a large setup, but essentially related to server lines, 
since we extensively use server-templates for server addition/deletion at 
runtime.

# backend & servers number
$ echo "show stat -1 2 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
1309
$ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | wc -l
36360
# But a lot of them are actually "waiting to be provisioned" (especially on this preprod environment)
$ echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | grep MAINT | wc -l
34113

We'll filter out the server metrics as a quick fix, and will hopefully submit 
something to do it natively, but we would also like to get your feedback about 
some use-cases we expected to solve with this native exporter.
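
The quick fix could look like the sketch below. It assumes that every per-server metric name in the exporter output starts with `haproxy_server_` (in the same way as `haproxy_process_...` for process metrics); the function name drop_server_metrics is hypothetical.

```shell
# drop_server_metrics (hypothetical helper): read Prometheus text output on
# stdin and drop every sample line, "# HELP" line and "# TYPE" line that
# belongs to a metric whose name starts with haproxy_server_ (assumed prefix).
drop_server_metrics() {
    grep -v -E '^(# (HELP|TYPE) )?haproxy_server_'
}

# Example (not run here): drop_server_metrics < metrics.out > metrics.small
```

This only shrinks what gets stored or scraped downstream; haproxy still generates the full output, which is why doing it natively would be preferable.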

Ultimately, one of them would be of great added value for us: being able to 
count check_status types (and their values in the L7STS case) per backend.

So, there are 3 associated points:
* it's great to have new metrics (such as 
`haproxy_process_current_zlib_memory`), but we also noticed that some very 
useful ones were not present due to their type, for example:
[ST_F_CHECK_STATUS]   = IST("untyped"),
What could be done to be able to retrieve them? (I thought about something 
similar to `HRSP_[1-5]XX`, where the different check statuses could be defined 
and counted).

* also for `check_status`, there is the case of L7STS and its associated values 
that are present in another field. Most probably it could benefit from a better 
representation in a prometheus output (thanks to labels)?

* what about getting some backend-level aggregation of server metrics, such as 
the one that was previously mentioned, to avoid retrieving all the server 
metrics but still be able to get some insights?
I'm thinking about an aggregation of some fields at backend level, which was 
not previously done with the CSV output.
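
To make the expected aggregation concrete, here is a rough sketch computing it outside haproxy from the CSV output. It assumes the header line starts with "# pxname," and that the `check_status` and `check_code` columns are present; columns are looked up by name rather than by position. The function name count_check_status is hypothetical.

```shell
# count_check_status (hypothetical helper): read "show stat" CSV on stdin and
# print "<backend> <check_status> <count>" lines, one per distinct status.
# For L7STS the value of the check_code column is appended (e.g. L7STS/503).
count_check_status() {
    awk -F, '
        # Header line: strip the leading "# " and map column names to indexes.
        /^# / {
            sub(/^# /, "", $1)
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data line: skip rows that carry no check status at all.
        ("check_status" in col) && NF > 1 && $col["check_status"] != "" {
            status = $col["check_status"]
            if (status == "L7STS" && ("check_code" in col))
                status = status "/" $col["check_code"]
            count[$col["pxname"] " " status]++
        }
        END {
            for (k in count) print k, count[k]
        }
    ' | LC_ALL=C sort
}
```

It can be fed with the same command as above, for example `echo "show stat -1 4 -1" | sudo socat stdio /var/lib/haproxy/stats | count_check_status`. The output is the kind of per-backend counter we would like to see exposed directly, with the status (and the code) as labels.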

Thanks for your feedback,

Pierre