Re: HAproxy load balancing query

2024-10-04 Thread Willy Tarreau
Hi Shehan,

On Fri, Oct 04, 2024 at 06:41:23AM +, Shehan Jayawardane wrote:
> Hi Josh,
> 
> Nice. Thank you for the information.
> And we are going to deploy this for one of our critical production
> environment. Where there will be around 7000 TPS. And we hope to have virtual
> machine as the HA proxy. Can we know how ,much of server resources we need
> for such deployment?

That's always very hard to say, especially since TPS don't mean much these
days in terms of sizing:
  - if that's just HTTP requests per second over persistent connections,
the smallest RaspberryPi is sufficient as long as the network link
is not saturated

  - if you need to renew TCP connections for each request ("close mode"),
then you may need a more robust system (RPi4B or above). But VMs are
notoriously not great for creating/tearing down TCP connections, and
you may observe significant differences between hypervisors and other
VMs running on them. You *will* need to test.

  - if these are SSL and you set up / tear down SSL connections with each
request, then 7k/s can be quite significant especially if rekeying is
needed. In this case, for a modern x86 CPU, you can count on no more
than 10k conn/s/core for resume or 2k conn/s/core for rekeying. VMs
have little to no impact on SSL processing costs, but that doesn't
remove the TCP setup/teardown costs. If you use SSL on both sides,
just double your estimates (even if that's normally not exact), and
stay away from OpenSSL 3.0.x.

And the next point is that usually with a load balancer, you need to take
into account not the initial load but the long-term target. We almost
never see LBs whose load is fading away over time, quite the opposite. I
tend to consider that the system load at deployment time shouldn't be
higher than 25%. That leaves enough margin for traffic progression, and
will leave you some room to try new features, apply some rules to work
around temporary security issues etc.

Finally, never under-estimate the total network traffic. Object sizes
tend to increase over time, especially on well-working sites where it's
common for people to feel at ease with large images and videos. Sometimes
you could just monitor your TPS without realizing that your LB has a
physical 1Gbps connection to the local network or ISP and that it fixes a
hard limit (in practise you'll start to get negative feedback above 70%
average usage due to short peaks). You don't want to upgrade in emergency
once it starts failing, because that means you'll have to do it after
failing in front of the largest possible number of witnesses, which is
never desirable! Just like for the rest, make sure that your initial
deployment doesn't use more than 25% of your absolute maximum capacity,
or make sure to have short-term plans to smoothly improve the situation.

Good luck!
Willy




Re: [PR] Fix dequeue proxy listeners deadlock master

2024-09-30 Thread Willy Tarreau
On Mon, Sep 30, 2024 at 09:52:17AM +0200, Oliver Dala wrote:
> Hi,
> i created a proper patch and sent it to the mailing list already. I
> created the PR before reading the collaboration guidelines. You can
> disregard this thread.

OK, thank you!
Willy




Re: [PR] Fix dequeue proxy listeners deadlock master

2024-09-30 Thread Willy Tarreau
Hi!

On Tue, Sep 24, 2024 at 02:23:03PM +, PR Bot wrote:
> From 16e8060fbdcf21beeb762096879f58b55afadc74 Mon Sep 17 00:00:00 2001
> From: Oliver Dala 
> Date: Tue, 24 Sep 2024 15:27:42 +0200
> Subject: [PATCH] pass PROXY_LOCK status through dequeue_proxy_listeners
> 
> ---
>  include/haproxy/listener.h | 2 +-
>  src/listener.c | 8 
>  src/proxy.c| 4 ++--
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/include/haproxy/listener.h b/include/haproxy/listener.h
> index 3627a791e85e..fb32f2e89b16 100644
> --- a/include/haproxy/listener.h
> +++ b/include/haproxy/listener.h
> @@ -94,7 +94,7 @@ void enable_listener(struct listener *listener);
>  void dequeue_all_listeners(void);
>  
>  /* Dequeues all listeners waiting for a resource in proxy 's queue */
> -void dequeue_proxy_listeners(struct proxy *px);
> +void dequeue_proxy_listeners(struct proxy *px, int lpx);
>  
>  /* This function closes the listening socket for the specified listener,
>   * provided that it's already in a listening state. The listener enters the
> diff --git a/src/listener.c b/src/listener.c
> index d32551a24d01..f9002cc7ea9c 100644
> --- a/src/listener.c
> +++ b/src/listener.c
> @@ -708,7 +708,7 @@ void dequeue_all_listeners()
>  }
>  
>  /* Dequeues all listeners waiting for a resource in proxy 's queue */
> -void dequeue_proxy_listeners(struct proxy *px)
> +void dequeue_proxy_listeners(struct proxy *px, int lpx)
>  {
>   struct listener *listener;
>  
> @@ -716,7 +716,7 @@ void dequeue_proxy_listeners(struct proxy *px)
>   /* This cannot fail because the listeners are by definition in
>* the LI_LIMITED state.
>*/
> - relax_listener(listener, 0, 0);
> + relax_listener(listener, lpx, 0);
>   }
>  }
>  
> @@ -1558,7 +1558,7 @@ void listener_accept(struct listener *l)
>  
>   if (p && !MT_LIST_ISEMPTY(&p->listener_queue) &&
>   (!p->fe_sps_lim || 
> freq_ctr_remain(&p->fe_counters.sess_per_sec, p->fe_sps_lim, 0) > 0))
> - dequeue_proxy_listeners(p);
> + dequeue_proxy_listeners(p, 0);
>   }
>   return;
>  
> @@ -1617,7 +1617,7 @@ void listener_release(struct listener *l)
>  
>   if (fe && !MT_LIST_ISEMPTY(&fe->listener_queue) &&
>   (!fe->fe_sps_lim || freq_ctr_remain(&fe->fe_counters.sess_per_sec, 
> fe->fe_sps_lim, 0) > 0))
> - dequeue_proxy_listeners(fe);
> + dequeue_proxy_listeners(fe, 0);
>   else {
>   unsigned int wait;
>   int expire = TICK_ETERNITY;
> diff --git a/src/proxy.c b/src/proxy.c
> index 25dd136dadd4..562bb827cc00 100644
> --- a/src/proxy.c
> +++ b/src/proxy.c
> @@ -1999,7 +1999,7 @@ struct task *manage_proxy(struct task *t, void 
> *context, unsigned int state)
>   }
>  
>   /* The proxy is not limited so we can re-enable any waiting listener */
> - dequeue_proxy_listeners(p);
> + dequeue_proxy_listeners(p, 0);
>   out:
>   t->expire = next;
>   task_queue(t);
> @@ -3010,7 +3010,7 @@ static int cli_parse_set_maxconn_frontend(char **args, 
> char *payload, struct app
>   }
>  
>   if (px->maxconn > px->feconn)
> - dequeue_proxy_listeners(px);
> + dequeue_proxy_listeners(px, 1);
>  
>   HA_RWLOCK_WRUNLOCK(PROXY_LOCK, &px->lock);
>  

Nice catch! Apparently it was introduced in 2.9 with commit ff1c803279
("BUG/MEDIUM: listener: Acquire proxy's lock in relax_listener() if
necessary") which was backported to 2.4, so it will need to be backported
there.

Do you have a commit message to propose that explains how the bug
manifests or should I do it myself ?

Thanks,
Willy




Re: Advice needed for investigating a potential bug

2024-09-27 Thread Willy Tarreau
On Fri, Sep 27, 2024 at 06:31:46PM +0200, Patrick Zwahlen wrote:
> Thanks Valentine for all that precious information.
> 
> I'll need some time and will probably start trying all the 2.9 versions to
> maybe identify the one that brought this conflict.
> 
> This is running in Docker and I only tested with the '2.9' tag, so it will
> be interesting to bisect the 2.9 releases first.

It should not be needed to test all 2.9 versions. 2.9.0 is sufficient to
know if the problem was in the original 2.9 or not. The possibility is
that it was the result of a later fix backported to 2.9 (hence after
2.9.0), though it's not very likely. Some of the changes that appeared
between 2.8 and 2.9 caused a few regressions (all identified ones fixed
by now), and some also uncovered old sleeping bugs. Yet it's still
possible you're facing one such bug. I suggest you take a look at HTTP
headers between harpoxy and elastic for the two versions of elastic and
see if they are 100% identical. Maybe you'll spot a difference that will
put us on a track.

Regards,
Willy




[ANNOUNCE] haproxy-3.1-dev8

2024-09-18 Thread Willy Tarreau
org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (1):
  BUG/MINOR: mux-quic: report glitches to session

Aurelien DARRAGON (7):
  BUG/MINOR: pattern: prevent const sample from being tampered in 
pat_match_beg()
  BUG/MEDIUM: pattern: prevent uninitialized reads in pat_match_{str,beg}
  BUG/MEDIUM: pattern: prevent UAF on reused pattern expr
  BUG/MINOR: peers: local entries updates may not be advertised after resync
  BUG/MINOR: fix missing "log-format overrides previous 'option tcplog 
clf'..." detection
  BUG/MINOR: fix missing "'option httpslog' overrides previous 'option 
tcplog clf'..." detection
  BUG/MINOR: cfgparse-listen: fix option httpslog override warning message

Christopher Faulet (12):
  MINOR: mux-h1: Set EOI on SE during demux when both side are in DONE state
  BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only
  REGTESTS: h1/h2: Update script testing H1/H2 protocol upgrades
  BUG/MAJOR: mux-h1: Wake SC to perform 0-copy forwarding in CLOSING state
  BUG/MINOR: h1-htx: Don't flag response as bodyless when a tunnel is 
established
  MEDIUM: h1: Accept invalid T-E values with accept-invalid-http-response 
option
  DOC: config: Explicitly list relaxing rules for accept-invalid-http-* 
options
  MINOR: proxy: Rename accept-invalid-http-* options
  DOC: configuration: Remove dangerous directives from the proxy matrix
  BUG/MEDIUM: sc_strm/applet: Wake applet after a successfull synchronous 
send
  BUG/MEDIUM: cache/stats: Wait to have the request before sending the 
response
  BUG/MEDIUM: promex: Wait to have the request before sending the response

Damien Claisse (2):
  MINOR: server: allow init-state for dynamic servers
      DOC: management: add init-state to add server keywords

William Lallemand (1):
  MEDIUM: ssl/cli: "dump ssl cert" allow to dump a certificate in PEM format

Willy Tarreau (27):
  DOC: configuration: place the HAPROXY_HTTP_LOG_FMT example on the correct 
line
  BUG/MEDIUM: clock: detect and cover jumps during execution
  REGTESTS: fix random failures with wrong_ip_port_logging.vtc under load
  BUG/MINOR: pattern: do not leave a leading comma on "set" error messages
  REGTESTS: shorten a bit the delay for the h1/h2 upgrade test
  DOC: server: document what to check for when adding new server keywords
  BUG/MINOR: polling: fix time reporting when using busy polling
  BUG/MINOR: clock: make time jump corrections a bit more accurate
  BUG/MINOR: clock: validate that now_offset still applies to the current 
date
  BUG/MEDIUM: queue: implement a flag to check for the dequeuing
  OPTIM: sample: don't check casts for samples of same type
  OPTIM: vars: remove the unneeded lock in vars_prune_*
  OPTIM: vars: inline vars_prune() to avoid many calls
  MINOR: vars: remove the emptiness tests in callers before pruning
  IMPORT: import cebtree (compact elastic binary trees)
  OPTIM: vars: use a cebtree instead of a list for variable names
  OPTIM: vars: use multiple name heads in the vars struct
  MINOR: clock: test all clock_gettime() return values
  MEDIUM: clock: collect the monotonic time in clock_local_update_date()
  MEDIUM: clock: opportunistically use CLOCK_MONOTONIC for the internal time
  MEDIUM: clock: use the monotonic clock for idle time calculation
  MEDIUM: clock: don't compute before_poll when using monotonic clock
  BUG/MINOR: cfgparse: detect incorrect overlap of same backend names
  MEDIUM: cfgparse: warn about proxies having the same names
  BUILD: cebtree: silence a bogus gcc warning on impossible code paths
  MEDIUM: cfgparse: warn about colliding names between defaults and proxies
  MEDIUM: cfgparse: detect collisions between defaults and log-forward

---




Re: [PATCH] DOC: management: add init-state to add server keywords

2024-09-17 Thread Willy Tarreau
On Tue, Sep 17, 2024 at 02:54:24PM +, Damien Claisse wrote:
> Commit ce6a621ae allowed init-state to be used for dynamic servers but I
> forgot to update management doc.

Applied, thank you Damien!
Willy




Re: haproxy-3.x.x - Ubuntu Focal

2024-09-13 Thread Willy Tarreau
On Wed, Sep 11, 2024 at 10:14:13AM +0200,  ??? wrote:
> ??, 11 . 2024 ?. ? 08:44, Alexis Vachette :
> 
> > Hi,
> >
> > Just wanted to know if you had a plan to release package for Ubuntu 20.04
> > Focal.
> >
> > Mostly because of OpenSSL 3.0 regression performance.
> >
> > The question is more for Vincent Bernat.
> >
> 
> I wonder what are your expectation of SSL lib for that package. I do not
> see good choice
> 
> 1) OpenSSL-1.1.1 (only limited QUIC unfortunately)
> 2) OpenSSL-3.X (bad perf)
> 3) QuicTLS (development frozen)
> 4) AWS-LC, WolfSSL, LibreSSL (requires efforts from packaging)
> 
> or, if you do not plan to use QUIC, OpenSSL-1.1.1 would b just nice

FWIW, at HaproxyTech, we decided to stick to OpenSSL-1.1.1 for now
and are providing packages built with it. We've already got several
reports of outages with 3.0 (not surprising) and are systematically
directing customers to the 1.1.1 packages to avoid any future problem.

I've read on the Ubuntu blog that they're going to maintain their 1.1.1
package up to 2030. I don't know if it's possible to install a package
of an older distro on a newer one, but that could be convenient.

Otherwise if you only want packaged stuff for Ubuntu 20, Vincent still
provides haproxy up to 2.9 (that includes 2.8 which is LTS). But at some
point you might have to build packages yourself if you need an extended
support on an older distro, or to update to a newer distro with a
different lib or version. It's sad but that the result of OpenSSL having
irresponsibly tagged 3.0 LTS before even testing it... It ended up in
distros while being totally unfit for a server, and distros can't easily
switch to an unsupported version, as security issues are even worse for
them than performance and stability issues.

Willy




Re: [PATCH] MINOR: server: allow init-state for dynamic servers

2024-09-10 Thread Willy Tarreau
On Tue, Sep 10, 2024 at 02:52:42PM +, Damien Claisse wrote:
> Commit 50322df introduced the init-state keyword, but it didn't enable
> it for dynamic servers. However, this feature is perfectly desirable
> for virtual servers too, where someone would like a server inlived
> through "set server be1/srv1 state ready" to be put out of maintenance
> in down state until the next health check succeeds.
> At reading the code, it seems that it's only a matter of allowing this
> keyword for dynamic servers, as current code path calls
> srv_adm_set_ready() which incidentally triggers a call to
> _srv_update_status_adm().

Merged, many thanks Damien, and sorry for not thinking more often about
checking dynamic servers support during reviews. It will come over time,
I just need to change 20 years-long habits :-)

Willy




Re: [PR] BUG fix in stream.c where counters will zero because of failed updates

2024-09-06 Thread Willy Tarreau
Hello!

On Fri, Sep 06, 2024 at 11:23:03AM +, PR Bot wrote:
> From d9d994eb1c6615b1f7ce7a0493be669ad8bb4ab0 Mon Sep 17 00:00:00 2001
> From: shakedm 
> Date: Fri, 6 Sep 2024 13:38:25 +0300
> Subject: [PATCH] fix a BUG in stream.c where counters will zero because of
>  failed updates
> 
> The current code only checked if bytes is initialized but during monitoring I
> found many instances in which the metrics will randomly drop to zero.
> this fix handles that scenario. I tested it on compiling and running the new
> binary and seems like it fixed the problem.

I disagree, it tests if bytes is non-zero, which is basically what you're
doing as well, look below:

> @@ -792,7 +792,7 @@ void stream_process_counters(struct stream *s)
>  
>   bytes = s->req.total - s->logs.bytes_in;
>   s->logs.bytes_in = s->req.total;
> - if (bytes) {
> + if (((signed long long) bytes) > 0) {
>   _HA_ATOMIC_ADD(&sess->fe->fe_counters.bytes_in, bytes);
>   _HA_ATOMIC_ADD(&s->be->be_counters.bytes_in,bytes);

The code checks if s->req.total changed since last time it was copied
into s->logs.bytes_in, and copies it there each time.

Your change istesting that it changed positively, but given that these
are 64-bit numbers, it's practically impossible to get negative numbers
here in one call. And I would even argue that the current test is better
since it could actually detect any change (even those large impractical
ones). So there's no problem.

You said you found instances where metrics will randomly drop to zero.
Did you find them when reading the code, or did you observe them yourself?
If the latter, then you've observed something else. The vast majority of
us are already monitoring stats, so if you've encountered a case where
such a problem happens, we'll need your config, version, and to know all
the specificities of your environment to figure how this could happen at
all.

Just thinking loud, are you sure that you're not experiencing reloads?
A reload starts a new process, and stats will restart from zero. In 3.1
there is a new option to restart the process from the current stats
counters, you might be interested in using this if that's what you're
facing.

Thanks!
Willy




Re: [PATCH v2] MEDIUM: server: add init-state

2024-09-06 Thread Willy Tarreau
On Wed, Sep 04, 2024 at 06:27:35PM -0400, Aaron Kuehler wrote:
> Allow the user to set the "initial state" of a server.
> 
> Context:
> 
> Servers are always set in an UP status by default. In
> some cases, further checks are required to determine if the server is
> ready to receive client traffic.
> 
> This introduces the "init-state {up|down}" configuration parameter to
> the server.
(...)

The patch looks good, the doc looks good and the commit message is clear
enough, that's fine, I've merged it. I'm going to mark the issue as fixed
now.

Many thanks!
Willy




[ANNOUNCE] haproxy-3.1-dev7

2024-09-06 Thread Willy Tarreau
rasnobaeva (3):
  MINOR: tools: add helpers to backup/clean/restore env
  MINOR: mworker: restore initial env before wait mode
  BUG/MINOR: haproxy: free init_env in deinit only if allocated

William Lallemand (6):
  BUILD: tools: environ is not defined in OS X and BSD
  CLEANUP: ssl: cleanup the clienthello capture
  MEDIUM: ssl: capture the supported_versions extension from Client Hello
  MEDIUM: ssl/sample: add ssl_fc_supported_versions_bin sample fetch
  MEDIUM: ssl: capture the signature_algorithms extension from Client Hello
  MEDIUM: ssl/sample: add ssl_fc_sigalgs_bin sample fetch

Willy Tarreau (8):
  BUILD: quic: fix build errors on FreeBSD since recent GSO changes
  MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places
  BUG/MAJOR: mux-h2: always clear MUX_MFULL and DEM_MROOM when clearing the 
mbuf
  BUG/MINOR: mux-spop: always clear MUX_MFULL and DEM_MROOM when clearing 
the mbuf
  DEV: patchbot: count the number of backported/non-backported patches
  DEV: patchbot: add direct links to show only specific categories
  DEV: patchbot: detect commit IDs starting with 7 chars
  BUG/MEDIUM: clock: also update the date offset on time jumps

---




Re: [PATCH] MEDIUM: server: add init-state

2024-09-04 Thread Willy Tarreau
Hi Pavlos,

On Wed, Sep 04, 2024 at 10:27:29AM +0200, Pavlos Parissis wrote:
> On Tue, 3 Sept 2024 at 22:26, Aaron Kuehler  
> wrote:
> >
> > Allow the user to set the "initial state" of a server.
> >
> > Context:
> >
> > Servers are always set in an UP status by default. In
> > some cases, further checks are required to determine if the server is
> > ready to receive client traffic.
> >
> > This introduces the "init-state {up|down}" configuration parameter to
> > the server.
> >
> > - when not set, the server is considered available as soon as a connection
> >   can be established.
> > - when set to 'up', the server is considered available as soon as a
> >   connection can be established.
> > - when set to 'down', the server is initially considered unavailable until
> >   it successfully completes its health checks.
> >
> > The server's init-state is considered when the HAProxy instance
> > is (re)started, a new server is detected (for example via service
> > discovery / DNS resolution), a server exits maintenance, etc.
> 
> Will this honour the functionality coming from load-server-state-from-file?
> I assume if users utilize the aforementioned setting this new functionality,
> which is great and thanks a lot for delivering it, will not kick in.

Logiclaly it should not change anything there since the state file already
contains the last state (srv_op_state), so the server whose state is loaded
from the file will be restored to its previous state. What is changed here
is the ability to set the server up/down when leaving maintenance. E.g. you
had put a server in maintenance state, you reinstalled it, then you enable
it from the CLI. Before it would instantly take traffic and perform a check
to possibly turn it off again, now you'll have the option to let it run its
checks before being used.

Hoping this helps,
Willy




Re: [ANNOUNCE] haproxy-3.0.4

2024-09-04 Thread Willy Tarreau
Hi,

On Tue, Sep 03, 2024 at 03:52:43PM +0200, Willy Tarreau wrote:
> HAProxy 3.0.4 was released on 2024/09/03. It added 42 new commits
> after version 3.0.3.
(...)
> Note that at this point this flushes the queue of pending bugs for 3.0,
> which is a good news. There remains one exception, a recently introduced
> QUIC patchset into 3.1 to implement NEW_TOKEN on 0-RTT that we'd like to
> backport since it addresses some bad corner cases. But the backport is
> non-trivial and the patches need to be exposed a bit longer in 3.1 first,
> this might come in 3.0.5.

I shouldn't have claimed that the queue was flushed, because in the
middle of the patches marked "backport after an observation period",
Christopher found a small cluster that could have been backported as
well with this release. Nothing critical, that doesn't deserve a new
release, but we'll probably make another one once the QUIC backports
above are done, in order to *really* flush the pipe this time. It will
likely be at the same time as the next 2.8 and lower then.

Sorry about that, it's always difficult to properly spot what's eligible
yet missing (we've improved the tooling again to help detect such misses
next time). All of these have been queued into the 3.0-maint and
2.9-maint branches.

In short, no worries, but don't be surprised if you were waiting for
3.0.4 regarding a specific fix and only find it queued for next release.

Willy




Re: [PATCH] MEDIUM: server: add init-state

2024-09-03 Thread Willy Tarreau
Hello Aaron,

On Tue, Sep 03, 2024 at 04:24:57PM -0400, Aaron Kuehler wrote:
> Allow the user to set the "initial state" of a server.

Thank you very much for working on this one!

> Context:
> 
> Servers are always set in an UP status by default. In
> some cases, further checks are required to determine if the server is
> ready to receive client traffic.
> 
> This introduces the "init-state {up|down}" configuration parameter to
> the server.
> 
> - when not set, the server is considered available as soon as a connection
>   can be established.
> - when set to 'up', the server is considered available as soon as a
>   connection can be established.
> - when set to 'down', the server is initially considered unavailable until
>   it successfully completes its health checks.

I'm having a difficulty understanding the first two cases, and it seems
from the patch that they are in fact the same. I think that "as soon as
a connection can be established" is confusing as it makes one think that
it still needs to perform one check (otherwise it's not clear what this
"connection" refers to).

Thus I suggest that the first two lines are merged into one with an
explanation like this one:

  - when set to 'up' (the default), the server is considered immediately
available and will initiate a health check that can turn it to the DOWN
state immediately if it fails.

And the same could be done in the doc, which contains the same text.

> The server's init-state is considered when the HAProxy instance
> is (re)started, a new server is detected (for example via service
> discovery / DNS resolution), a server exits maintenance, etc.

You're right, it's important to mention this because it's far from being
obvious.

By analogy with the "up" state, the "down" should be at check_rise-1 
instead of zero I think, so that it's sufficient to succeed one check
to turn it on. Otherwise it can take a lot of time to introduce some
servers which are slowly checked.

I'm just thinking about something, since I've read that complaint as well
a few times: in some environments, some servers are having difficulties
responding positively to initial health checks, and due to the way we're
starting at the check_rise value, the first failure is sufficient to turn
it down. A few users already asked for a way to turn the server completely
up. I think there could be value in having 3 different init states:
  - 'down': needs to validate one check before turning up
  - 'probe': the current situation, where it's up until the first test
makes it fail
  - 'up': it starts fully up (check_rise+check_fall-1 IIRC).

Or maybe even 4 states:
  - "fully-down": 0, requires that all checks are valid before it turns up
  - "down": check_rise-1, requires one successful check to turn up
  - "up": check_rise, requires one faulty check to turn down
  - "fully-up": check_rise+check_fall-1, requires that all checks fail to
turn down.

And we'd default to "up" like in your patch.

What do you think ?

BTW your patch is super clean, I've only found one tiny thing (just to
say that I found something):

> @@ -6504,7 +6526,11 @@ static int _srv_update_status_adm(struct server *s, 
> enum srv_adm_st_chg_cause ca
>*/
>   if (s->check.state & CHK_ST_ENABLED) {
>   s->check.state &= ~CHK_ST_PAUSED;
> - s->check.health = s->check.rise; /* start OK but check 
> immediately */
> + if(s->init_state == SRV_INIT_STATE_DOWN) {
 ^^
missing space here. :-)

> + s->check.health = 0; /* checks must pass before 
> the server is considered UP */
> + } else {
> + s->check.health = s->check.rise; /* start OK 
> but check immediately */
> + }
>   }

Thanks!
Willy




Re: [PATCH] CLEANUP: assorted typo fixes in the code and comments

2024-09-03 Thread Willy Tarreau
On Mon, Aug 26, 2024 at 11:40:15PM +0200, Ilya Shipitsin wrote:
> This is 43rd iteration of typo fixes

Merged, thank you Ilya!
Willy




[ANNOUNCE] haproxy-3.0.4

2024-09-03 Thread Willy Tarreau
traces, dumps and backtraces,
because many of the issues above were particularly hard to figure, let
alone reproduce, but the overall increasing level of details provided in
bug reports helps a lot! If some have suggestions about what can make
their lives easier when providing detailed traces, feel free to share
them!

Note that at this point this flushes the queue of pending bugs for 3.0,
which is a good news. There remains one exception, a recently introduced
QUIC patchset into 3.1 to implement NEW_TOKEN on 0-RTT that we'd like to
backport since it addresses some bad corner cases. But the backport is
non-trivial and the patches need to be exposed a bit longer in 3.1 first,
this might come in 3.0.5.

Please find the usual URLs below :
   Site index   : https://www.haproxy.org/
   Documentation: https://docs.haproxy.org/
   Wiki : https://github.com/haproxy/wiki/wiki
   Discourse: https://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Sources  : https://www.haproxy.org/download/3.0/src/
   Git repository   : https://git.haproxy.org/git/haproxy-3.0.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy-3.0.git
   Changelog: https://www.haproxy.org/download/3.0/src/CHANGELOG
   Dataplane API: 
https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs: https://www.haproxy.org/l/reviewed-bugs
   Code reports : https://www.haproxy.org/l/code-reports
   Latest builds: https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (6):
  MINOR: proto: extend connection thread rebind API
  BUG/MEDIUM: quic: prevent crash on accept queue full
  CLEANUP: proto: rename TID affinity callbacks
  CLEANUP: quic: rename TID affinity elements
  BUG/MINOR: stick-table: fix crash for src_inc_gpc() without stkcounter
  DOC: quic: fix default minimal value for max window size

Aurelien DARRAGON (2):
  MEDIUM: sink: don't set NOLINGER flag on the outgoing stream interface
  MEDIUM: log: relax some checks and emit diag warnings instead in 
lf_expr_postcheck()

Christopher Faulet (12):
  BUG/MINOR: session: Eval L4/L5 rules defined in the default section
  BUG/MINOR: server: Don't warn fallback IP is used during init-addr 
resolution
  BUG/MINOR: cli: Atomically inc the global request counter between CLI 
commands
  BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the 
signature
  MINOR: proxy: Add support of 429-Too-Many-Requests in retry-on status
  BUG/MEDIUM: mux-h2: Set ES flag when necessary on 0-copy data forwarding
  BUG/MEDIUM: stream: Prevent mux upgrades if client connection is no 
longer ready
  BUG/MINIR: proxy: Match on 429 status when trying to perform a L7 retry
  BUG/MEDIUM: mux-pt: Never fully close the connection on shutdown
  BUG/MEDIUM: cli: Always release back endpoint between two commands on the 
mcli
  BUG/MEDIUM: mux-h1: Properly handle empty message when an error is 
triggered
  BUG/MEDIUM: mux-pt: Fix condition to perform a shutdown for writes in 
mux_pt_shut()

Frederic Lecaille (7):
  BUG/MINOR: quic: Non optimal first datagram.
  BUG/MINOR: quic: Lack of precision when computing K (cubic only cc)
  MINOR: quic: Dump TX in flight bytes vs window values ratio.
  MINOR: quic: Add information to "show quic" for CUBIC cc.
  BUG/MINOR: quic: unexploited retransmission cases for Initial pktns.
  BUG/MINOR: quic: Too shord datagram during O-RTT handshakes (aws-lc only)
  BUG/MINOR: Crash on O-RTT RX packet after dropping Initial pktns

Lukas Tribus (1):
  DOC: install: don't reference removed CPU arg

Valentine Krasnobaeva (3):
  BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path
  MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD (take #2)
  BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn

William Lallemand (1):
  DOC: configuration: issuers-chain-path not compatible with OCSP

Willy Tarreau (10):
  BUILD: listener: silence a build warning about unused value without 
threads
  BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts
  BUG/MAJOR: mux-h2: force a hard error upon short read with pending error
  DOC: config: improve the http-keep-alive section
  MEDIUM: h1: allow to preserve keep-alive on T-E + C-L
  MINOR: queue: add a function to check for TOCTOU after queueing
  BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()
  Revert "MEDIUM: sink: don't set NOLINGER flag on the outgoing stream 
interface"
  MINOR: mux-h2: try to clear DEM_MROOM and MUX_MFULL at more places
  BUG/MAJOR: mux-h2: always clear MUX_MFULL and DEM_MROOM when clearing the 
mbuf

---




Re: Fwd: [PATCH v2 4/4] FEATURE: add MPTCP per address support

2024-08-30 Thread Willy Tarreau
On Fri, Aug 30, 2024 at 03:32:54PM +0200, Anthony Doeraene wrote:
> > Otherwise let me know if I can apply it as-is.
> 
> Seems good to me ! Thank you again for your reviews.

OK now pushed, thank you!

> > I'm very happy that after so long this topic has been alive, with your,
> > Matt's and Dorian's help we're finally reaching the end of the tunnel!
> 
> Thank you also for being open to these changes ! The fact that haproxy
> should support soon MPTCP could encourage even more application
> developers to use it, as it becomes easier for them to enable it on the
> server side :D. For the anecdote, we had a debate with some application
> developers that lacked support for MPTCP in haproxy just 3 days ago, that
> should help them too !

Ah! so if there are any bugs left, we should be aware soon ;-)

Have a nice week-end!
Willy




Re: [PATCH v2 4/4] FEATURE: add MPTCP per address support

2024-08-30 Thread Willy Tarreau
On Mon, Aug 26, 2024 at 11:50:27AM +0200, Aperence wrote:
> diff --git a/src/backend.c b/src/backend.c
> index 6956d9bfe..e4bd465e9 100644
> --- a/src/backend.c
> +++ b/src/backend.c
> @@ -1690,8 +1690,9 @@ int connect_server(struct stream *s)
>  
>   if (!srv_conn->xprt) {
>   /* set the correct protocol on the output stream connector */
> +
>   if (srv) {
> - if (conn_prepare(srv_conn, 
> protocol_lookup(srv_conn->dst->ss_family, PROTO_TYPE_STREAM, 0), srv->xprt)) {
> + if (conn_prepare(srv_conn, 
> protocol_lookup(srv_conn->dst->ss_family, PROTO_TYPE_STREAM, srv->alt_proto), 
> srv->xprt)) {
>   conn_free(srv_conn);
>   return SF_ERR_INTERNAL;
>   }

This one could have been left in a separate patch (like the 3/4) but as
we know that no other backend proto uses alt_proto right now, it really
has no impact.

Nothing else to say for the rest of the series. It's clean and looks
sane. I'm ready to merge it. I'd just adjust the severity subject tags
(MINOR for the first 2 and MEDIUM for the last two I think), and change
one word in the previous patch's commit message as mentioned.

If you have any last-minute change pending, feel free to suggest them
(I can apply them manually if they're trivial) or to resend the series.
Otherwise let me know if I can apply it as-is.

I'm very happy that after so long this topic has been alive, with your,
Matt's and Dorian's help we're finally reaching the end of the tunnel!

Thanks!
Willy




Re: [PATCH v2 3/4] REORG: use protocol when creating socket

2024-08-30 Thread Willy Tarreau
On Mon, Aug 26, 2024 at 11:50:26AM +0200, Aperence wrote:
> Use the protocol configured for a connection when creating the socket,
> instead of always using 0.
> 
> This change is needed to allow new protocol to be used when creating
> the sockets, such as MPTCP. Note however that this patch won't change
> anything for now, as the only value that proto->sock_prot could hold
> is IPPROTO_TCP, which has the same behavior as 0 when passed to socket.

To be precise, it's the only *other* value that can be IPPROTO_TCP,
because for unix sockets and socket pairs we're passing zero there,
and sock_prot is properly zero for them ;-)

I'll adjust that in the commit message myself if I take the series
as-is.

Willy




Re: [PATCH v2 4/4] FEATURE: add MPTCP per address support

2024-08-30 Thread Willy Tarreau
Hi Anthony,

On Mon, Aug 26, 2024 at 12:03:50PM +0200, Anthony Doeraene wrote:
> Hello,
> 
> Changelog compared to the previous patch:
> 
> - split some code to patches 1-3/4
> - use MPTCP with the backend if set in the backend config thanks to
> patches X/4
> - remove ".receivers" in the new MPTCP protocol structures.
> - split too large lines
> - fix family for 'mptcp6@' address

Thanks for the changes, that will help me, I'm back to reviewing the
changes again.

> Another question: Is it still needed to duplicate the mptcp protocol 
> structures
> from the tcp ones ? Could we not simply make a copy of the tcp ones and simply
> setting the name and sock_proto ?

It's better to avoid the copy. The reason is that it's too easy to
overlook something. For example there is a linked list element in the
struct which I used to mistakenly copy-paste, and that I recently
changed so that it's initialized by the register code. But if we later
add something similar, it will be very hard to spot the lack of special
case in the memcpy() code. The structs are not *that* large and remain
reasonably readable so let's have one per variant for now.

I also hesitated to set some defaults for many entries to avoid having
to declare all of them, but figured it would be hard not to set some
functions when we really want a NULL, and it would also make it harder
to figure all users of this or that function. I think what we have now
is far from being perfect but is a reasonably acceptable compromise in
fact.

Thanks!
Willy




Re: [PATCH] CLEANUP: haproxy: fix typos in code comment

2024-08-30 Thread Willy Tarreau
Hi Nicolas,

On Tue, Aug 27, 2024 at 10:18:51PM +0200, Nicolas CARPi wrote:
> Found these two typos while browsing the code :)
(...)
> Found this typo in macro name :)

Thank you, both patches applied.

> BTW, in mqtt.c mqtt_read_varint(), "off" is initialized to 0, but 
> initialized again in the for loop, which results in a dead assignment.
> 
> I believe this is done on purpose, as some sort of defensive
> programming, so I didn't touch it, but wanted to bring attention to it, 
> in case it is not something wanted.

Indeed it's not the prettiest but not a problem either. Sometimes
this can happen as the result of removing a check and an error path
for example. Let's leave it as is, it has no effect anyway :-)

Thank you!
Willy




Re: [PATCH 3/3] CI: QUIC Interop AWS-LC: enable ngtcp2 client

2024-08-24 Thread Willy Tarreau
On Sat, Aug 24, 2024 at 07:13:09PM +0200, Willy Tarreau wrote:
> On Sat, Aug 24, 2024 at 07:02:30PM +0200,  ??? wrote:
> > the reason of adding ngtcp2 is
> > 
> > "let's add and see how it goes".
> > 
> > and that is what we agreed in GH issue :)
> 
> That's fine by me, thank you!

Apparently it doesn't like it much:

  https://github.com/haproxy/haproxy/actions/runs/10540196449/workflow

Willy




Re: [PATCH 3/3] CI: QUIC Interop AWS-LC: enable ngtcp2 client

2024-08-24 Thread Willy Tarreau
On Sat, Aug 24, 2024 at 07:02:30PM +0200,  ??? wrote:
> the reason of adding ngtcp2 is
> 
> "let's add and see how it goes".
> 
> and that is what we agreed in GH issue :)

That's fine by me, thank you!
Willy




Re: [PATCH 3/3] CI: QUIC Interop AWS-LC: enable ngtcp2 client

2024-08-24 Thread Willy Tarreau
On Sat, Aug 24, 2024 at 06:15:59PM +0200,  ??? wrote:
> it is pretty much about it.
> short commit message "CI: QUIC Interop AWS-LC: enable ngtcp2 client"
> describes what is done, GH issue is optional.

But it does not explain why (or what the purpose is). As I often say
it, commit messages must explain the intent, much more than the what.
I know it's less important for CI which is a bit of a side project,
but regardless, I'm wondering things like "what's the impact, will
this regularly break some tests or take more time to build?" then
"but if so, what's the purpose first?".

I'm perfectly fine with simple explanations like "we have no other
QUIC client and we need one", or "it's the simplest QUIC client we've
found for now", or "ngtcp2's client provides stats others do not", or
"let's start with this one for now, we may change it later" etc.
See ?

Thanks!
Willy




Re: [PATCH 3/3] CI: QUIC Interop AWS-LC: enable ngtcp2 client

2024-08-24 Thread Willy Tarreau
Hi Ilya,

On Sat, Aug 24, 2024 at 03:55:45PM +0200, Ilya Shipitsin wrote:
> GH issue: https://github.com/haproxy/haproxy/issues/2688

While it's fine to put references to GH issues to ease tracking, please
always leave a bit of information in the commit message about what the
intent of the patch is. Here I really have no idea, the issue above is
long and discusses various things. In addition, if one day GH issues
become inaccessible for any reason, the history is lost.

Thanks!
Willy




Re: [PATCH 0/2] Add MPTCP to haproxy

2024-08-23 Thread Willy Tarreau
On Fri, Aug 23, 2024 at 05:11:11PM +0200, Matthieu Baerts wrote:
> >>> With that said, from an implementation perspective, it would seem right
> >>> to make sure that most TCP tunables also work with MPTCP.
> >>
> >> That's what we tried to do. All "common" ones are supported, but it is
> >> not always easy to define what is "common" and what is not! There are
> >> many obscure/specific socket options out there :)
> > 
> > I agree :-) For example we're using other horrors such as TCP_REPAIR,
> > which was initially defined for CRIU, and that we're using to silently
> > destroy a connection without responding. It's extremely effective against
> > HTTP botnets as they keep their connection pending and do not consume
> > anything locally. I don't know if you have it, but if you're interested
> > in giving it a try, I can help you set it up in haproxy for a test.
> 
> Wow, nice bazooka :-)

:-)

> Here is what we can currently find in the MPTCP code in the kernel:
> 
>   /* TCP_REPAIR, TCP_REPAIR_QUEUE, TCP_QUEUE_SEQ, TCP_REPAIR_OPTIONS,
>* TCP_REPAIR_WINDOW are not supported, better avoid this mess
>*/
> 
> Maybe a new socket option would be better if the idea is only to
> silently drop connections? :)

Yes, probably. Right now it's done directly in the action itself
(tcp_exec_action_silent_drop()), and we already detect if the option
is supported, so that would just be a matter of checking more
explicitly for conn->ctrl->proto == mptcp and using the other option.

In any case the code *tries* to make use of this, and will fall back
to other methods. One of them is to try to change the TTL of the
socket before closing (so that the RST doesn't reach the client).
And when all of this fails, the connection is closed anyway, so it's
just not silent. As such, there's no harm at all in not supporting this
at all, it's just less optimal.

> (If these botnets are using "plain" TCP, the TCP_REPAIR socket option
> will work!)

Ah, good to know! I guess we still have quite some time ahead before
they start playing with mptcp by default, so for most use cases it will
be fine!

Willy




Re: [PATCH 2/2] BUG/MINOR: fix warning when setting MSS with MPTCP

2024-08-23 Thread Willy Tarreau
On Fri, Aug 23, 2024 at 04:50:33PM +0200, Willy Tarreau wrote:
> > @@ -494,6 +498,30 @@ static void sock_inet_prepare()
> >  #endif
> > close(fd);
> > }
> > +
> > +#ifdef __linux__
> 
> Here I think a short comment is deserved to explain why __linux__, because
> it's the same choice that was made in proto_tcp.c.
> 
> Or better, you could probably do that in compat.h:
> 
> #ifdef __linux__
> # define HA_HAVE_MPTCP 1
> # ifndef IPPROTO_TCP
> #  define IPPROTO_TCP ...
> # endif
> #endif
> 
> and use #ifdef HA_HAVE_MPTCP here and around the variables above. That
> would make it quite clear about what we're really checking for here.

Oh and BTW adding a link in a comment to the issue that Matthieu just
opened regarding TCP_MSS support would help the future us rediscovering
that code decide whether to remerge that with TCP or not once fully
supported everywhere.

Willy




Re: [PATCH 2/2] BUG/MINOR: fix warning when setting MSS with MPTCP

2024-08-23 Thread Willy Tarreau
On Fri, Aug 23, 2024 at 03:34:10PM +0200, Aperence wrote:
> Currently, the TCP_MAXSEG socket option doesn't seem to be supported
> with MPTCP. This results in a warning when trying to set the MSS of
> sockets in proto_tcp:tcp_bind_listener.
> 
> This can be resolved by adding two new variables:
> sock_inet(6)_mptcp_maxseg_default that will hold the default
> value of the TCP_MAXSEG option. Note that for the moment, this
> will always be -1 as the option isn't supported. However, in the
> future, when the support for this option will be added, it should
> contain the correct value for the MSS, allowing to correctly
> set the TCP_MAXSEG option.

I'm perfectly fine with the whole patch except that it's not logical
to add it as a bug fix after the first one which raises the API
incompatibility. I think it should instead just be part of the first
one (unless you figure a way to introduce it before, which I doubt)
and you can simply append the commit message to the first one explaining
this specific case.

A few nits below:

> diff --git a/src/sock_inet.c b/src/sock_inet.c
> index 028ffaa68..9f10c5654 100644
> --- a/src/sock_inet.c
> +++ b/src/sock_inet.c
> @@ -77,6 +77,10 @@ int sock_inet6_v6only_default = 0;
>  int sock_inet_tcp_maxseg_default = -1;
>  int sock_inet6_tcp_maxseg_default = -1;
>  
> +/* Default MPTCPv4/MPTCPv6 MSS settings. -1=unknown. */
> +int sock_inet_mptcp_maxseg_default = -1;
> +int sock_inet6_mptcp_maxseg_default = -1;
> +

Maybe these should be conditioned to the use of MPTCP (just saving 8
bytes but...).

>  /* Compares two AF_INET sockaddr addresses. Returns 0 if they match or 
> non-zero
>   * if they do not match.
>   */
> @@ -494,6 +498,30 @@ static void sock_inet_prepare()
>  #endif
>   close(fd);
>   }
> +
> +#ifdef __linux__

Here I think a short comment is deserved to explain why __linux__, because
it's the same choice that was made in proto_tcp.c.

Or better, you could probably do that in compat.h:

#ifdef __linux__
# define HA_HAVE_MPTCP 1
# ifndef IPPROTO_TCP
#  define IPPROTO_TCP ...
# endif
#endif

and use #ifdef HA_HAVE_MPTCP here and around the variables above. That
would make it quite clear about what we're really checking for here.

Thanks!
Willy




Re: [PATCH 0/2] Add MPTCP to haproxy

2024-08-23 Thread Willy Tarreau
Hi Matthieu,

On Fri, Aug 23, 2024 at 04:13:16PM +0200, Matthieu Baerts wrote:
> Hi Willy,
> 
> Thank you for your quick reply!

You're welcome!

> > I'll comment on each patch separately,
> 
> Thank you, please take your time!

That's what I'm doing but I really want to make sure we won't discover
last-minute show-stoppers that require important changes. It's still
possible to do some important ones for a few weeks, so let's tackle all
roadblocks while we can. I really feel like we're on a good track now!

(... mss ...)

> I just filled a new ticket on MPTCP side, not to forget about that:
> 
>   https://github.com/multipath-tcp/mptcp_net-next/issues/515

OK!

> > With that said, from an implementation perspective, it would seem right
> > to make sure that most TCP tunables also work with MPTCP.
> 
> That's what we tried to do. All "common" ones are supported, but it is
> not always easy to define what is "common" and what is not! There are
> many obscure/specific socket options out there :)

I agree :-) For example we're using other horrors such as TCP_REPAIR,
which was initially defined for CRIU, and that we're using to silently
destroy a connection without responding. It's extremely effective against
HTTP botnets as they keep their connection pending and do not consume
anything locally. I don't know if you have it, but if you're interested
in giving it a try, I can help you set it up in haproxy for a test.

Cheers,
Willy




Re: [PATCH 1/2] FEATURE: add MPTCP per address support

2024-08-23 Thread Willy Tarreau
On Fri, Aug 23, 2024 at 03:34:09PM +0200, Aperence wrote:
(...)
> MPTCP is both supported for the frontend and backend sides.

Great!

> Also added an example of configuration using mptcp along with a backend
> allowing to experiment with it.

Thanks for thinking about testing ;-)

> Note that this is a re-implementation of Björn's work from 3 years ago 
> [4], when haproxy's internals were probably less ready to deal with 
> this, causing his work to be left pending for a while.

Thanks for mentioning his initial work!

> diff --git a/include/haproxy/compat.h b/include/haproxy/compat.h
> index 3829060b7..2347d62cb 100644
> --- a/include/haproxy/compat.h
> +++ b/include/haproxy/compat.h
> @@ -317,6 +317,11 @@ typedef struct { } empty_t;
>  #define queue _queue
>  #endif
>  
> +/* only Linux defines IPPROTO_MPTCP */
> +#ifndef IPPROTO_MPTCP
> +#define IPPROTO_MPTCP 262
> +#endif

Stupid open question: do we really want to define it for other
OSes ? I really don't care, to be honest.

> diff --git a/src/backend.c b/src/backend.c
> index 6956d9bfe..6b865768e 100644
> --- a/src/backend.c
> +++ b/src/backend.c
> @@ -1690,8 +1690,15 @@ int connect_server(struct stream *s)
>  
>   if (!srv_conn->xprt) {
>   /* set the correct protocol on the output stream connector */
> + int mptcp = 0;
> +
> + /* cli_conn can be NULL when the origin of the stream isn't a 
> +  * connection, there's no reason to use MPTCP in this case */
> + if (cli_conn && cli_conn->ctrl) 
> + mptcp = cli_conn->ctrl->sock_prot == IPPROTO_MPTCP;

I don't understand this part, it seems to imply a relation between the
client and the server. I'm not seeing any case where this could be valid
since the config of the client and the server are two totally independent
things.

I'm suspecting that the difficulty here is to know that the server's
address was configured as MPTCP and that since you couldn't find where
to store that piece of info, instead it's inherited from the client. But
that cannot be right as it's really a proxy and there's no relation
between the protocols used on each side. For example it's valid and
normal to have QUIC on the front and HTTP/2 over TCP (or MPTCP soon)
on the back.

If that's the issue, I suspect that we'll have to store the info in
the struct server itself that we want to use mptcp. BTW, based on
the extension you brought to protocol_lookup(), maybe we can have a
more generic field like "alt_proto" or something like this that is
passed directly to protocol_lookup(). This info needs to be
retrieved from the address parser called by _srv_parse_init(). I'm
seeing nowhere to store the "alt" value, and that might be an
indication that str2sa_range() needs to be extended to take yet
another arg to return that one, which you're already setting to
!!mptcp during the lookup.

As such I would suggest the following:
  - a preliminary patch extending str2sa_range() to add an int *alt
before the last "opts" parameter, and that is filled with the
value passed to protocol_lookup() during the function if the
pointer is not NULL. You can put in that patch the change of
definition of protocol_lookup() to use alt instead of dgram BTW.

  - a second (small) patch adding a field alt_proto to the "server"
struct (probably close to net_ns, xprt etc). This field would then
be filled in _srv_parse_init() from the "alt" argument returned by
str2sa_range().

  - then this current patch becomes the third and makes use of srv->alt
instead of !!mptcp below, for the protocol lookup:

>   if (srv) {
> - if (conn_prepare(srv_conn, 
> protocol_lookup(srv_conn->dst->ss_family, PROTO_TYPE_STREAM, 0), srv->xprt)) {
> + if (conn_prepare(srv_conn, 
> protocol_lookup(srv_conn->dst->ss_family, PROTO_TYPE_STREAM, !!mptcp), 
> srv->xprt)) {
(...)

> diff --git a/src/proto_tcp.c b/src/proto_tcp.c
> index d6552b2f1..9cabae11b 100644
> --- a/src/proto_tcp.c
> +++ b/src/proto_tcp.c
> @@ -149,6 +149,102 @@ struct protocol proto_tcpv6 = {
>  
>  INITCALL1(STG_REGISTER, protocol_register, &proto_tcpv6);
>  
> +#ifdef __linux__
> +/* Most fields are copied from proto_tcpv4 */
> +struct protocol proto_mptcpv4 = {
> + .name   = "mptcpv4",
(...)
> + .receivers  = LIST_HEAD_INIT(proto_mptcpv4.receivers),
> + .nb_receivers   = 0,

Thanks to the very latest changes in 3.1-dev6 that you rebased on, you
can drop these two lines, they're now initialized by protocol_register().
I figured I got trapped 3 times in less than one hour, it was too much!

> diff --git a/src/sock.c b/src/sock.c
> index df82c6ea7..e32573b7e 100644
> --- a/src/sock.c
> +++ b/src/sock.c
> @@ -278,7 +278,7 @@ int sock_create_server_socket(struct connection *conn, 
> struct proxy *be, int *st
>   ns = __objt_server(conn->target)->netns;
>   }
>  #endif
> - sock_fd = my_socketat(ns, conn->dst

Re: [PATCH 0/2] Add MPTCP to haproxy

2024-08-23 Thread Willy Tarreau
Hello!

On Fri, Aug 23, 2024 at 03:34:08PM +0200, Aperence wrote:
> Multipath TCP (MPTCP), standardized in RFC8684 [1], is a TCP extension
> that enables a TCP connection to use different paths.
> 
> Multipath TCP has been used for several use cases. On smartphones, MPTCP
> enables seamless handovers between cellular and Wi-Fi networks while
> preserving established connections. This use-case is what pushed Apple
> to use MPTCP since 2013 in multiple applications [2]. On dual-stack
> hosts, Multipath TCP enables the TCP connection to automatically use the
> best performing path, either IPv4 or IPv6. If one path fails, MPTCP
> automatically uses the other path.
> 
> To benefit from MPTCP, both the client and the server have to support
> it. Multipath TCP is a backward-compatible TCP extension that is enabled
> by default on recent Linux distributions (Debian, Ubuntu, Redhat, ...).
> Multipath TCP is included in the Linux kernel since version 5.6 [3]. To
> use it on Linux, an application must explicitly enable it when creating
> the socket. No need to change anything else in the application.
> 
> These patches are a continuation of the work [4] realized by Dorian 
> Craps and Matthieu Baerts.

It's great to see Dorian's work pursued, thank you for this series!

I'll comment on each patch separately, though I'll respond to the
question below:

>   - Patch 2:
> - Fix a warning about TCP_MAXSEG not being supported while MPTCP is 
> used by declaring new constants sock_inet(6)_mptcp_maxseg_default.
> Those constants represent the MSS supported by MPTCP (== -1 for the
> moment, as long as TCP_MAXSEG is not supported). For the moment, 
> MPTCP doesn't support TCP_MAXSEG socket option, mainly because it 
> has apparently never been requested before, apparently. It should 
> not be difficult to implement it, but is it an important option
> for HAProxy?

It's not that common anymore but 15 years ago when ADSL over PPPoE was
everywhere, it was super frequent to see users not able to upload large
POST data over TCP because their PC connected with a 1500 MTU to their
PPPoE modem/router was seeing its packets rejected due to the extra 8
bytes of PPPoE encapsulation. Many people had by then implemented the
set-mss rule in iptables on the server infrastructure to deal with this,
but high-bandwidth hosting providers couldn't even just place a firewall
in front of haproxy just to deal with that, so in this case it was very
convenient to permit haproxy to simply configure the socket to negotiate
a lower MSS compatible with everyone's setup.

Nowadays with everyone having a triple-play box provided by the ISP,
all this stuff is transparent to the user (when needed at all). However
I've been aware of users doing that as well to fixup trouble caused by
inter-company VPN links. You know, the type of places where everyone
says "it's right on my side, it must be the other side"...

So I guess that nowadays the mss setting mostly serves to fixup VPN
environments, and it's not much likely that we'll see that with MPTCP
soon I think (but I could of course be wrong).

With that said, from an implementation perspective, it would seem right
to make sure that most TCP tunables also work with MPTCP.

Thanks!
Willy




Re: minor patch to add environment variables for http and tcp clf log formats

2024-08-22 Thread Willy Tarreau
Hi Nathan,

On Tue, Aug 20, 2024 at 04:19:35PM +, Nathan Wehrman wrote:
> Since we just implemented the TCP CLF format I thought it would be a quick
> and easy time to add a couple of new environment variables.  It's a
> very small and simple patch but if a person needed this and it wasn't
> available then it's very frustrating so I thought I'd just get it done.

Thank you!

> If I missed
> anything then please let me know but I think it's just these 4 new lines.

There were three minor issues that I corrected:

> From 7cc48e41c54c6a59da62362f576397b46d09ec14 Mon Sep 17 00:00:00 2001
> From: Nathan Wehrman 
> Date: Tue, 20 Aug 2024 09:11:34 -0700
> Subject: MINOR: Created env variables for http and tcp clf formats

Usually we place a subsystem name here so that it's easier to figure
what this is about when reviewing long series of commits (typically
during backports or bisect sessions), it gives a bit of context and
usually helps the reader spot something without having to entirely
read hundreds of lines. There's nothing strict at all, you can see
stuff like "h2", "mworker" or "quic" for example. Usually what's purely
related to the config is tagged with "config" or "cfgparse" when it's
more specific to the parser. Here I'll add "config" since it's mostly
something that improves the config abilities.

> --- a/doc/configuration.txt
> +++ b/doc/configuration.txt
> @@ -927,6 +927,10 @@ file, or could be inherited by a program (See 3.7. 
> Programs):
>defined in section 8.2.3 "HTTP log format". It can be used to override the
>default log format without having to copy the whole original definition.
>  
> +* HAPROXY_HTTP_CLF_LOG_FMT: contains the value of the default HTTP CLF log 
> format as
 
^^
> +  defined in section 8.2.3 "HTTP log format". It can be used to override the
> +  default log format without having to copy the whole original definition.

For the doc we wap the lines at 80 chars so that they render correctly in
default window sizes. I've re-justified the 3 lines.

> --- a/src/haproxy.c
> +++ b/src/haproxy.c
> @@ -1188,8 +1188,10 @@ static int read_cfg(char *progname)
>* unset them after the list_for_each_entry loop.
>*/
>   setenv("HAPROXY_HTTP_LOG_FMT", default_http_log_format, 1);
> - setenv("HAPROXY_HTTPS_LOG_FMT", default_https_log_format, 1);
> + setenv("HAPROXY_HTTP_CLF_LOG_FMT", clf_http_log_format, 1);
> +setenv("HAPROXY_HTTPS_LOG_FMT", default_https_log_format, 1);
>   setenv("HAPROXY_TCP_LOG_FMT", default_tcp_log_format, 1);
> + setenv("HAPROXY_TCP_CLF_LOG_FMT", clf_tcp_log_format, 1);

You seem to be having an issue with your editor's tab handling, because
there was a similar one in your previous patch. The line about
"HAPROXY_HTTPS_LOG_FMT" had its leading tab replaced with 4 spaces. I
suspect you're viewing with a tab size of 4 and maybe you manually fixed
a mistake on the wrong line and the editor turned your tabs to spaces.

It's worth checking the output of "git diff" to avoid such things. Also
if you add:

   [color]
ui = true

To your global git config (e.g. "git config color.ui true" or
"git config --global color.ui true"), git diff will then also highlight
trailing spaces/tabs and such annoying changes on invisible chars.

Now applied, thank you!
willy




[ANNOUNCE] haproxy-3.1-dev6

2024-08-21 Thread Willy Tarreau
   MINOR: cfgparse-global: move 'expose-*' in global keywords list
  MINOR: cfgparse-global: move tune options in global keywords list
  MINOR: cfgparse-global: move unsupported keywords in global list
  BUG/MINOR: cfgparse-global: remove tune.fast-forward from common_kw_list

William Lallemand (8):
  MINOR: channel: implement ci_insert() function
  BUG/MEDIUM: mworker/cli: fix pipelined modes on master CLI
  REGTESTS: mcli: test the pipelined commands on master CLI
  CLEANUP: mworker/cli: clean up the mode handling
  BUG/MINOR: release-estimator: fix relative scheme in CHANGELOG URL
  MINOR: release-estimator: add requirements.txt
  MINOR: release-estimator: add installation steps in README.md
  MINOR: release-estimator: fix the shebang of the python script

Willy Tarreau (10):
  BUG/MINOR: tools: make fgets_from_mem() stop at the end of the input
  MINOR: quic: store the lost packets counter in the quic_cc_event element
  MINOR: quic: support a tolerance for spurious losses
  MINOR: protocol: properly assign the sock_domain and sock_family
  MINOR: protocol: add a family lookup
  MEDIUM: socket: always properly use the sock_domain for requested families
  MINOR: protocol: add the real address family to the protocol
  MINOR: socket: don't ban all custom families from reuseport
  MINOR: protocol: always initialize the receivers list on registration
  CLEANUP: protocol: no longer initialize .receivers nor .nb_receivers

---




Re: [PATCH] BUG/MINOR: stats-html: improve markup and fix css in dark mode

2024-08-20 Thread Willy Tarreau
On Tue, Aug 20, 2024 at 03:29:04PM +0200, Nicolas CARPi wrote:
> Hi Willy,
> 
> On 20 Aug, Willy Tarreau wrote:
> > Normally it's preferable to make one commit per functional change. 
> Noted. Here are three patches then. I've taken care of explaining the 
> reasoning behind each of these changes in the commit messages, which 
> should answer your interrogations! :)

Perfect now, all applied, thank you!

Willy




Re: [PATCH] BUG/MINOR: stats-html: improve markup and fix css in dark mode

2024-08-20 Thread Willy Tarreau
Hi Nicolas,

On Tue, Aug 20, 2024 at 11:08:33AM +0200, Nicolas CARPi wrote:
> From 96f5e7951995be8216ecee81968b0f2c7fe0141a Mon Sep 17 00:00:00 2001
> From: Nicolas CARPi 
> Date: Tue, 20 Aug 2024 10:39:17 +0200
> Subject: [PATCH] BUG/MINOR: stats-html: improve markup and fix css in dark
>  mode
> 
> This patch concerns the HTML stats page:
> * Update the Doctype to a modern one
> * Add the "lang" attribute to the "html" tag. This is required by WCAG
>   Success Criterion 3.1.1 (accessibility standard)
> * Fix the text color in dark mode for the Scope input field

Normally it's preferable to make one commit per functional change. Even
if these ones touch the same thing, they seem to change totally unrelated
stuff (doctype, lang, color preference).

Also what's the motivation/impacts of the first one? Does it render
better or avoid a warning on some clients, is it just because it's cleaner,
etc ? In this case maybe that one should just be tagged cleanup and the
other ones minor. That's low importance, but I prefer that we don't start
mixing unrelated changes in a single commit (and some might argue for
backporting some of them, e.g. the color or lang).

Thanks!
Willy




Re: New option tcplog clf format

2024-08-19 Thread Willy Tarreau
Hi Nathan,

On Tue, Aug 13, 2024 at 06:34:00PM +, Nathan Wehrman wrote:
> 
> Hello,
> 
> For your consideration I wrote and tested a new a logging format that will
> allow tcp mode proxies to send logs that will adhere to clf (common log
> format).
> This will allow the sending of that data to servers that expect that format
> so it can be parsed but the connections will stand out as TCP.
> To accomplish that I used this log-format string:
> 
> log-format "%{+Q}o %{-Q}ci - - [%T] \"TCP \" 000 %B \"\" \"\" %cp %ms %ft %b
> %s %Th %Tw %Tc %Tt %U %ts-- %ac %fc %bc %sc %rc %sq %bq \"\" \"\" "
> 
> The HTTP request has been replaced with "TCP " and the response code has
> been hard coded to "000".
> 
> I also added a couple of very minor spacing changes.
> 
> The patch file is attached.

Looks good overall, and even if I don't use CLF myself, I obviously have
no objection to this minor change. I fixed two tiny ident issues (one on
a variable declaration and one in a condition) and changed the tag from
MEDIUM to MINOR since it really has no theoretical impact on existing
features, and I've applied it.

I've been wondering why we don't have the CLF equivalent variables for
tcp/http logs (e.g. HAPROXY_HTTP_LOG_FMT), and I suspect that it's simply
because they're less often used. I don't know if there could be any
interest in having the equivalent ones declared for CLF (http/tcp) or
not. Since this is only pure convenience, it's always difficult to guess
whether some users would really benefit from this.

Thanks!
Willy




Re: [PATCH] DOC: fix incorrect english in lua.txt

2024-08-19 Thread Willy Tarreau
Hi Nicolas,

On Wed, Aug 14, 2024 at 11:16:31AM +0200, Nicolas CARPi wrote:
> From f9cff910630851658a9f126caf1009e08dec Mon Sep 17 00:00:00 2001
> From: Nicolas CARPi 
> Date: Tue, 13 Aug 2024 22:57:56 +0200
> Subject: [PATCH] DOC: fix incorrect english in lua.txt
> 
> This commit fixes some typos, grammatical errors and unusual english
> such as "can not" instead of preferred "cannot".

Interesting, we had a fix a few years ago doing exactly the opposite,
changing "cannot" to "can not" which was supposed to be better English.
With that said we have more "cannot" than "can not" (and I personally
prefer that one for being more common).

I'm applying your patch, thank you!
Willy




Re: [PATCH 1/3] CI: QUIC Interop LibreSSL: document chacha20 test status

2024-08-19 Thread Willy Tarreau
On Tue, Aug 13, 2024 at 09:11:28PM +0200, Ilia Shipitsin wrote:
> due to https://github.com/haproxy/haproxy/issues/2569 chacha20 is
> disabled completely on LibreSSL. let's add a comment to not forget
> enabling it
(...)

series applied, thanks Ilya!

Willy




Re: Tuning HTTP/2 window size

2024-08-14 Thread Willy Tarreau
Hi Max,

On Wed, Aug 14, 2024 at 06:21:39AM +, Moehl, Maximilian wrote:
> Hi Willy,
> 
> > > Is there a similar mechanism in HAProxy? So far I can only see the
> > > static option for the initial window size which comes with the mentioned
> > > drawbacks.
> >
> > There is nothing similar. One of the problems H2 is facing is that there
> > can be application congestion anywhere in the chain. For example, let's
> > say a POST request is sent to a server subject to a maxconn and remains
> > in the queue for a few seconds. We definitely don't want to block the
> > whole connection during this time because we've oversized the window.
> >
> > And there's also the problem of not allocating too many buffers to each
> > stream, or it becomes a trivial DoS.
> 
> That makes sense, since we had to accommodate the use-case which triggered the
> investigation we increased the window size to 512k as a middle-ground. We 
> don't
> usually see congestion on our HAProxies so we are hoping that this does not
> cause any other issues, we'll see once it hits prod.

There used to be a case where it was causing significant problems: initially
this initial window size was both for frontend and backends. If you were using
H2 on the backend as well, and two requests from different clients were sent
over the same H2 backend connection, a slow reader would be sufficient to
cause the other one to experience read timeouts. Now we've addressed this
using two points:
  - it's possible to configure the backend-side window size separately
from the frontend-side
  - "http-reuse safe" (the default mode) makes H2 backend connections
private, that is, they're not shared between multiple client
connections. This results in more connections on the backend side
but the progress on the backend matches the progress on the frontend.

The remaining issues that you may encounter with large frontend windows
is if a server sometimes pauses when consuming POST bodies, it may
occasionally prevent the client from performing requests in parallel
over the same connection. That's quite rare to meet such trouble of
course, but we want to stay on the safe side by default, which is why
we don't use a larger window by default. If you know that your servers
are "normal" and consume the bodies sent to them, there's no real
problem with increasing the window size.

A case I met a long time ago that could match a pathologic one was a
few URLs served by a partner's server behind a slow leased line. In
this case, a large POST sent over this link could occasionally face
congestion and could temporarily freeze the client connection for
the time it took to upload a large body. As you see it's not everyone
that hosts such applications.

Hoping this helps,
Willy




Re: Tuning HTTP/2 window size

2024-08-13 Thread Willy Tarreau
Hi Maximilian,

sorry for the delay, I'm sure I noticed the subject but likely archived
the message before reading it. Thanks to Lukas for pinging again about
it in GH issue #352 which was already about this!

On Tue, Jul 23, 2024 at 08:38:44AM +, Moehl, Maximilian wrote:
> We've received reports from users that are struggling to upload large
> files via our HAProxies. With some testing, we discovered that h1 is
> relatively stable as latency increases while h2 constantly degrades.
> We've narrowed the issue down to `tune.h2.fe.initial-window-size` which
> states:
> 
> > The default value of 65536 allows up to 5 Mbps of bandwidth per client
> > over a 100 ms ping time, and 500 Mbps for 1 ms ping time.
> 
> Which is in line with our experiments.
> 
> While researching the topic, I've come across a blog post from
> Cloudflare [1] where they mention improvements to their edge which
> dynamically adjusts the window size to accommodate the latency and
> bandwidth constraints. They've upstreamed the patch to nginx [2]
> (although I must admit that I'm not sure whether the change is fixing a
> nginx specific issue or the general issue we are experiencing here).
> 
> Is there a similar mechanism in HAProxy? So far I can only see the
> static option for the initial window size which comes with the mentioned
> drawbacks.

There is nothing similar. One of the problems H2 is facing is that there
can be application congestion anywhere in the chain. For example, let's
say a POST request is sent to a server subject to a maxconn and remains
in the queue for a few seconds. We definitely don't want to block the
whole connection during this time because we've oversized the window.

And there's also the problem of not allocating too many buffers to each
stream, or it becomes a trivial DoS.

However, as I mentioned somewhere (maybe the issue above, I don't remember),
I think that the vast majority of users are not downloading and uploading at
the same time over a same connection. Either they're uploading a large file
or downloading contents to be rendered. Maybe in this case we could consider
that the allocatable buffers for the mux could in fact be repurposed for any
stream. We already have one per-stream buffer. If the stream getting the
POST was able to allocate more buffers, it could then release the connection
from pending frames and allow the client to use a larger send window. One of
the difficulties is to figure how to allocate them but we can afford a few
round trips during a large upload, so actually the stream itself could
increase the window when allocating new buffers. In our case, buffers were
really not meant to be extensible on the rx side, but I suspect it might
not be too hard to do.

I'll also need to have a look at what Cloudflare did for NGINX, they might
have updated that based on their observations and corner cases.

That's something to think about. Thanks for re-heating this old topic!

Willy




Re: minor correction to the configuration manual

2024-08-13 Thread Willy Tarreau
Hi Nathan,

On Tue, Aug 13, 2024 at 05:45:37PM +, Nathan Wehrman wrote:
> The configuration manual currently lists "option tcplog" as valid for use in
> a backend.
> This is not correct. This patch simply fixes that one line.

Thank you, you're right, now merged.

Please check your git setup, I think you have not set the "user.name"
variable and your name didn't appear in the From field, but only the
e-mail:

   From: nwehrman 
   Date: Tue, 13 Aug 2024 10:36:28 -0700
   Subject: MINOR correct the table for option tcplog

Normally this is done with:

   git config --global user.name "Foo Bar"

I've fixed it in the patch and merged it.

Thanks,
Willy




Re: [PATCH] FEATURE/MAJOR: Add upstream-proxy-tunnel feature

2024-08-12 Thread Willy Tarreau
Hi Alex,

On Mon, Aug 12, 2024 at 11:46:37AM +0200, Aleksandar Lazic wrote:
> > On Thu, Jun 13, 2024 at 03:00:59AM +0200, Aleksandar Lazic wrote:
> > > > The final idea is something like this.
> > > > 
> > > > ```
> > > > tcp-request content upstream-proxy-header Host %[req.ssl_sni,lower]
> > > > tcp-request content upstream-proxy-header "$AUTH" "$TOKEN"
> > > > tcp-request content upstream-proxy-header Proxy-Connection Keep-Alive
> > > > tcp-request content upstream-proxy-target %[req.ssl_sni,lower]
> > > > 
> > > > server stream 0.0.0.0:0 upstream-proxy-tunnel
> > > > %[req.ssl_sni,lower,map_str(targets.map)]
> > > > ```
> > > > 
> > > > The targets.map should have something like this.
> > > > #dest proxy
> > > > sni01 proxy01
> > > > sni02 proxy02
> > > > 
> > > > I hope the background of upstream-proxy-target is now more clear.
> > > > 
> > > > > - In `server https_Via_Proxy1 0.0.0.0:0 upstream-proxy-tunnel 
> > > > > 127.0.0.1:3128`
> > > > >     from your config, what is 0.0.0.0:0 used for here? This binds to 
> > > > > all IPv4
> > > > >     but on a random free port?
> > > > 
> > > > This is required when the destination should be dynamic, it's 
> > > > documented here.
> > > > http://docs.haproxy.org/3.0/configuration.html#4.4-do-resolve
> > > > 
> > > > > A+
> > > > > Dave
> > > > 
> > > > Regards
> > > > Alex
> > 
> > Overall I find that there are plenty of good points in this patch and
> > the discussion around it. But it also raises questions:
> > 
> >- IMHO the server's address should definitely be the proxy's address.
> >  This will permit to use all the standard mechanisms we have such as
> >  DNS resolution and health checks and continues to work as a regular
> >  handshake. I'm not sure this is the case in the current state of the
> >  patch since I'm seeing an extra "upstream-proxy-tunnel" parameter
> >  (actually I'm confused by what you call a "target" here, as used in
> >  upstream-proxy-target).
> 
> Well I thought the same but that's then difficult to separate between the
> proxy server and the destination server. Maybe we need some kind of server
> type?
> 
> As you can see in the sequence diagram are both servers from HAProxy point of 
> view.
> https://github.com/git001/tls-proxy-tunnel/blob/main/README.md#sequence-diagram
> 
> The "target" is the "destination server" which is from my point of view the
> "server" in haproxy.
> 
> The "upstream-proxy-target" is the upstream proxy server. I'm also not happy
> with that name and open for suggestions.

Then I'm confused by what the server's configured address becomes.

I mean, there are three possibilities:

  1) the ip:port of the server designates the target server, in which
 case only ip:port are passed to the proxy and we never send an FQDN
 there. Then the proxy has to be hard-coded in an option (apparently
 the upstream-proxy-tunnel here). Haproxy then focuses only on the
 target server for load balancing, stickiness, health checks etc and
 always reaches that server through the hard-coded upstream proxy.
 This could match what an application-oriente LB would look for, i.e.
 being able to reach its "local" servers through a proxy. It's just
 that I would be surprised that such a use case really exists, because
 load-balancing remote servers across a single local hard-coded proxy
 doesn't feel natural at all. E.g. the LB is not even on the LAN that
 knows all the remote hosts so such a config is particularly difficult
 to maintain in a working state.

  2) the fqdn:port of the server designates the target server, in which
 case only fqdn:port are passed to the proxy and we let the proxy
 resolve the target server. The rest is the same as above, it's just
 a small variation, but I'm still having a hard time figuring what
 use case that could match.

  3) the ip:port of the server designates the upstream proxy, and the
 connection's destination is passed to the CONNECT method. In this
 case it means that LB, checks, stickiness etc applies to the proxies
 and not to the servers. In this case we need a distinct field to set
 the target host name. That's more or less what you did with the
 tcp-request upstream-proxy-target rule, which could possible have a
 default on the server line. We could also imagine that a set-dst
 action would allow to force the target destination address, but that
 would be a detail. This more is more about infrastructure than remote
 app servers: you tell the proxy how to reach external services. I've
 seen this requested a few times by those saying "I would like to use
 the proxy protocol from my servers, that would connect to haproxy
 which would then connect to the forward proxy to reach the
 destination". The choice might be debatable of course but I think it
 does cover a number of situations where haproxy and the forward proxy
 farm are here to provide

Re: [PATCH] FEATURE/MAJOR: Add upstream-proxy-tunnel feature

2024-08-12 Thread Willy Tarreau
Hi Alex,

I finally found time to have a look into this!

On Thu, Jun 13, 2024 at 03:00:59AM +0200, Aleksandar Lazic wrote:
> > The final idea is something like this.
> > 
> > ```
> > tcp-request content upstream-proxy-header Host %[req.ssl_sni,lower]
> > tcp-request content upstream-proxy-header "$AUTH" "$TOKEN"
> > tcp-request content upstream-proxy-header Proxy-Connection Keep-Alive
> > tcp-request content upstream-proxy-target %[req.ssl_sni,lower]
> > 
> > server stream 0.0.0.0:0 upstream-proxy-tunnel
> > %[req.ssl_sni,lower,map_str(targets.map)]
> > ```
> > 
> > The targets.map should have something like this.
> > #dest proxy
> > sni01 proxy01
> > sni02 proxy02
> > 
> > I hope the background of upstream-proxy-target is now more clear.
> > 
> > > - In `server https_Via_Proxy1 0.0.0.0:0 upstream-proxy-tunnel 
> > > 127.0.0.1:3128`
> > >    from your config, what is 0.0.0.0:0 used for here? This binds to all 
> > > IPv4
> > >    but on a random free port?
> > 
> > This is required when the destination should be dynamic, it's documented 
> > here.
> > http://docs.haproxy.org/3.0/configuration.html#4.4-do-resolve
> > 
> > > A+
> > > Dave
> > 
> > Regards
> > Alex

Overall I find that there are plenty of good points in this patch and
the discussion around it. But it also raises questions:

  - IMHO the server's address should definitely be the proxy's address.
This will permit to use all the standard mechanisms we have such as
DNS resolution and health checks and continues to work as a regular
handshake. I'm not sure this is the case in the current state of the
patch since I'm seeing an extra "upstream-proxy-tunnel" parameter
(actually I'm confused by what you call a "target" here, as used in
upstream-proxy-target).

  - for the remote server's name, typically the one we'd extract from
a Host header usually or from SNI in some cases (maybe that's what
you called the "target" ?), we'd definitely need to support an
expression sent as text (usually ip:port), though we could as well
find it convenient to set the host and the port separately. In this
case it's likely that the sole presence of this expression could be
sufficient to enable the feature.

  - the Host field should default to the one in the URI if there's none
explicitly added.

  - I like the fact that you implemented support for headers upfront,
as we all know that this will be a prerequisite for many users.

  - you should definitely get rid of allthese DPRINTF() debug traces.
The simple fact that I no longer know how to enable them should ring
a bell about their relevance ;-)

  - we need to make sure this handshake is sent before SSL and not after.
This should permit to do what users normally expect, i.e., encrypt a
TCP connection and encapsulate it into a CONNECT request.

  - please don't realign the srv_kw_list at the end of the patch, it makes
it impossible to spot the addition(s) there. We can tolerate a few
unaligned entries (even though often it's a sign that a shorter name
would be welcome), and if you really need to realign, please do that
in a separate patch.

  - I think the headers addition should also be done into a separate patch
since it's a significant part of the patch. It will also enlighten the
default behavior since the code must work without extra header rules.

  - and BTW there's no need for this "upt" entry in the proxy struct, if
a string is needed you can just use an ist or even char* which are
simpler. But I do think that it should be extracted from the request
(something we need to figure how by default).

Thanks!
Willy




Re: [RFC] Allow disabling abstract unix socket paths NUL-padding

2024-08-09 Thread Willy Tarreau
Hi Tristan,

I'm back on this topic (I had not forgotten it).

On Sat, Mar 09, 2024 at 07:02:34PM +, Tristan wrote:
> 
> 
> On 09/03/2024 18:09, Tristan wrote:
> > Hi Willy,
> > 
> > On 09/03/2024 16:51, Willy Tarreau wrote:
> > > Hi Tristan,
> > > 
> > > On Sat, Mar 09, 2024 at 04:20:21PM +, Tristan wrote:
> > > > To be honest, I don't think this is unfixable. It's just a matter of how
> > > > much code change we think is acceptable for it.
> > > 
> > > I don't mind about the amount of changes. "we've always done it like
> > > this"
> > > is never a valid excuse to keep code as it is, especially when it's wrong
> > > and causes difficulties to add new features. Of course I prefer invasive
> > > changes early in the dev cycle than late but we're still OK and I don't
> > > expect that many changes for this anyway.
> > 
> > Agreed on not shying away from amount of changes.
> > 
> > I tried to follow that route, being generous in what I was willing to
> > touch, but ran into the issue of protocols and socket families being a
> > 1:1 binding.
> > 
> > So adding a new socket family *requires* adding a new protocol too,
> > which isn't too hard but spirals a bit out of control:
> > - protocol_by_family is a misnomer, and it is actually protocols *by
> > socket domain* (which is still AF_UNIX, even with family AF_CUST_ABNS2,
> > and ends up overriding standard socket/abns protocol)
> > - but you can't just change that, because then other things stop working
> > as some places rely on socket domain != socket family
> 
> Actually, following up on this, I'm more and more convinced that it actually
> makes things somewhat worse to take this approach...
> 
> At least I don't see a good way to make it work without looking very bad,
> because in a few places we lookup a protocol based on a socket_storage
> reference, so we have truly only the socket domain to work with if no
> connection is established yet.
> 
> While it sounds like shying away from deeper changes, it seems to me that a
> bind/server option is truly the right way to go here.

I addressed these shortcomings of protocol lookups in a branch here:

  https://github.com/haproxy/haproxy/tree/20240809-abnsz-4

There was an API bug dating from 2.3 that was making everything more
complicated than necessary, the code was assigning the custom family
to sock_domain (the socket() argument) instead of sock_family (the
one we're supposed to store into the address itself). I fixed that
and added the minimal needed logic to properly perform the lookups
and also resolve custom families into their real ones, because
previously we had to perform random protocol lookups to figure them
(we didn't notice since other custom families don't map to real ones).
It even simplified a few places where some values were hard-coded in
certain calls, or some logic enforced late (e.g. log.c).

I now created abns as a custom proto, separate from uxst, so that it
has its own family. It was not strictly necessary but it will allow
us to get rid of plenty of \0 tests at many places by just having
functions dedicated to unix or abns. Even the addrcmp() function
should become simpler and I thought I'd specialize them.

I've just added a last patch to create abnsz (the zero-terminated
abns, since William rightfully suggested that the codename abns2
could make one think it's version 2), but for now it's a perfect
copy of the other one, I have not modified the address length checks.
I remember that you had a working PoC so I guess you've already spotted
all locations and what to change, so restarting from your patch set
will be much faster now.

Please let me know if you want to give it a try, or if you think you
won't have time to look into it and you prefer me to try to merge
your work into this, it's as you prefer.

I feel like we're finally seeing the light at the end of this long
tunnel!

I'm done for today and probably for the week-end as well I think.

Have a nice week-end!
Wily




Re: Opinions desired on dropping support for duplicate names

2024-08-09 Thread Willy Tarreau
Hi Tristan,

On Fri, Aug 09, 2024 at 03:51:24PM +, Tristan wrote:
> Hi Willy,
> 
> > On 9 Aug 2024, at 16:26, Willy Tarreau  wrote:
> > [...]
> > I'd be interested in opinions on some of these options:
> >  - deprecate duplicate server names for 3.1, requiring a global option
> >to support them and drop their support for 3.3 ?
> >  - just drop the support of duplicate server names right now in 3.1 ?
> >  - keep these working "forever" and reconsider the option later, due to
> >a very valid case that I'm ignoring ?
> 
> If I'd realized, I'd have assumed I found a bug I think; wonder how that even
> gets handled by things like the prom exporter which key purely by name?
> 
> +1 for drop in 3.1
> 
> >  - for backends + frontends, deprecate same names in 3.1, drop in 3.3,
> >or drop now ? Or keep them ?
> 
> For a pure proxy case without any load balancing (1 fe, 1 be, 1 server) I can
> see it being a bit heavy to have to name both, likely xyz-frontend and
> xyz-backend or something.  Is that justification enough? I'm not convinced.

That was the initial reason for that. But how many configs are using a
frontend and a backend with similar names ? Mystery. Even the stats are
confusing in such a case.

> Neutral.
> 
> wrt the stick tables example, the real crime is that naming them explicitly
> isn't mandatory, at both declaration and usage time... I'm not sure what the
> scope rules are for them, but I promptly decided it was not desirable
> knowledge when I started using them...

I contemplated the possibility of adding an optional "name" argument to
the "stick-table" directive a while ago. And I figured that it would make
it almost mandatory to totally decorellate the tables from the proxies.
These have already been split in the past since it's possible to declare
them in peers section, but that causes a configuration issue: if you merge
config fragments from various customers and they're allowed to name their
table as another customer's proxy, it becomes a bit annoying. It also
means having a same namespace for tables and backends, which reproduces
the issue we currently have with fronts and backs. Nothing unsolvable,
of course, but it definitely needs some thinking before doing a mistake.
BTW, for peers, the table's name contains the peers section name prepended
to it, e.g. "mypeers/t1". That might be something to do with fronts/backs
as well after all, which may also finally permit to support multiple tables
per proxy and avoid those scary configs with hundreds of empty backends
that just carry one stick-table each.

> > [...] We
> > could even consider backporting a warning for such cases in 3.0 so as
> > to limit the surprise if we were to drop that right now.
> 
> +1

Thanks for sharing your views on this!

Willy




Opinions desired on dropping support for duplicate names

2024-08-09 Thread Willy Tarreau
Hi all,

I'm continuing to find disgusting things in the code that are only here
for historical reasons which, in my opinion, should no longer exist.

For example, we still support duplicate server names if they have an
explicit ID. Example:

 # this is rejected
 backend b
server s 127.0.0.1
server s 127.0.0.2

 # this is accepted
 backend b
server s 127.0.0.1 id 1
server s 127.0.0.2 id 2

The check to block the first one runs in O(N^2) at boot, though I
could turn it to O(N*log(N)) but I'm wondering if that's worth it.
There's even special code in "use-server" rules to deal with that, it
carefully scans all servers matching the name in their declaration
order, and checks for their state so that it returns the first working
one. TBH, I strongly doubt anyone relies on this, and I'd be tempted by
deprecating this feature first, then killing it after a few versions.

Another example: frontend/backend/listen name duplicate detection. It's
a bit awkward and actually bogus. The principle was to say that if a
proxy has a frontend capability (hence "frontend" or "listen"), then
we check for duplicate frontend names, and that if a proxy has a backend
capability (hence "backend" or "listen"), then we check for duplicate
backend names. It was incorrectly implemented as:

if the proxy has frontend capability, then
check_for_frontend_duplicate()
otherwise
check_for_backend_duplicate()

This results in this config to be incorrectly accepted:

   backend b

   listen b
bind :

Indeed, the second "b" has frontend capability, thus we search for another
frontend, there is none so that's OK. It definitely is an accident but it
perfectly illustrates the absurdity of the current situation.

At the very least I would like to fix that one but quite frankly I'm a bit
fed up with proxies wearing same names. Did you know that because of this,
there is a special check for stick-table name conflicts so as to reject
this one:

   frontend b
bind :
stick-table type ip size 10

   backend b
stick-table type ip size 10

I strongly doubt that by now there are still configs relying on such
special cases. I very well remember why we allow frontends and backends
with the same names, it was in hope to encourage everyone to split their
"listen" sections in two without even renaming them so that we could get
rid of "listen". In practice this did not happen, "listen" remained quite
popular for simple configs which have no business dealing with content
switching rules (typically pure TCP proxies), and we've been carrying
this thing for about two decades.

Now we're at an era where many configs are generated in ways that cannot
even produce such corner cases, they're harder to follow and debug, and
I'm wondering what's the point of preserving support for such cases which
I'm intimately convinced that nobody relies on.

I'd be interested in opinions on some of these options:
  - deprecate duplicate server names for 3.1, requiring a global option
to support them and drop their support for 3.3 ?
  - just drop the support of duplicate server names right now in 3.1 ?
  - keep these working "forever" and reconsider the option later, due to
a very valid case that I'm ignoring ?
  - for backends + frontends, deprecate same names in 3.1, drop in 3.3,
or drop now ? Or keep them ?

As you know I'm not in favor of removing stuff without prior warnings,
but these ones have become so absurd over time that sometimes I really
doubt anyone will ever run the code that deals with the error and the
code that parses the new option to force supporting the mechanism. We
could even consider backporting a warning for such cases in 3.0 so as
to limit the surprise if we were to drop that right now.

And I don't want to scare anyone, if there are valid cases, that's fine
as well and then I just want that we properly support them. It's just
that I feel that fixing the corner cases and costs above is pointless,
and without a use case I'm not tempted by investing time there.

If you're aware of other types of stupid duplicate identifiers that we
continue to support for historical reasons and that would make sense to
be simplified, please suggest!

Thanks in advance for sharing your opinion on this topic.
Willy

PS: let's also Cc Marko for the dataplane API since it should have impacts
on it, though very likely it will simplify it, not make it more complex.

PPS: no emergency if you're on holiday, we can decide in a month as well.

PPPS: please no delirium about how we should take this opportunity to
  revist all the config language ;-)




Re: [PATCH] FEATURE: Adding MPTCP with option to disable it and fall-back to TCP

2024-08-09 Thread Willy Tarreau
Hi Matthieu,

On Fri, Aug 09, 2024 at 12:52:04PM +0200, Matthieu Baerts wrote:
> On 09/08/2024 11:32, Willy Tarreau wrote:
> > On Mon, May 06, 2024 at 02:10:02PM +0200, Björn Jacke wrote:
> >> Hi,
> >>
> >> I came up a while ago with a patchset for MPTCP support for HAProxy also,
> >> see https://github.com/haproxy/haproxy/issues/1028
> >>
> >> Back then I also discussed some ideas with Willy how to implement it more
> >> elegantly in HAProxy. It's still somewhere on my todo list but 
> >> unfortunately
> >> I didn't catch that up since then. As I had a good real-world usecase
> >> recently for MPTCP I though of looking into this again also.
> > 
> > I had a look at this series, and am seeing that the largest part of the
> > changes is to use ->sock_prot (either IPPROTO_TCP or IPPROTO_MPTCP) in
> > getsockopt() and setsockopt() calls, which is not in Dorian's patchset.
> > 
> > Thus I'm just wondering if this is mandatory or if there's a compatibility
> > mode in the kernel so that IPPROTO_TCP also works for MPTCP sockets. Maybe
> > Matthieu knows better and could enlighten us on this ?
> 
> There is no need to change anything in the getsockopt() and setsockopt()
> calls: it is still possible to use SOL_TCP (== IPPROTO_TCP). The kernel
> will try to adapt the socket option to the MPTCP case, e.g. applying
> something on all the subflows, or just the first one, or the MPTCP
> socket, depending on the option.
> 
> There are some socket options under SOL_MPTCP (284), but there are
> "new", and specific to MPTCP, e.g. to get info about the different paths
> being used, etc.
> 
> If you replace IPPROTO_TCP in the getsockopt() and setsockopt() calls by
> IPPROTO_MPTCP (262), it means you try to look at socket options with the
> SOL_X25 level, that's not what we want here ;)

OK, thank you for the very detailed explanation!

> > Because if it's required, we definitely need to apply similar changes
> > everywhere (the code has probably evolved a little bit since 2021), and
> > if not, it could mean that only the part that performs the protocol
> > registration (that Matthieu+Dorian also handled) and the doc should
> > suffice. I think we have everything in shape now to decide what's left
> > to do in order to get the feature merged. I hope this time we won't
> > miss the deadline!
> 
> In theory, the last patch I sent should be enough.
> 
> BTW, thank you for your other reply, no issues for the delay. I will try
> to look at the modifications you requested when I can find some time :)

That works for me. We'll release by end of november, with a hard freeze
around a month earlier, so there's no rush, though it always helps to
get feedback from users, of course!

If you'd figure that you can't find the time to complete it in a reasonable
delay, do not hesitate to ping me again about it to let me know, we'll find
how to arrange this together.

Also, I think it would be kind to leave a mention about Björn's work in
your commit message, who attempted the very first implementation 3-4
years ago when haproxy's internals were probably less ready to deal with
this, causing his work to be left pending for a while.

Thanks!
Willy




Re: [PATCH] FEATURE: Adding MPTCP with option to disable it and fall-back to TCP

2024-08-09 Thread Willy Tarreau
Hi Björn,

I'm coming back to this:

On Mon, May 06, 2024 at 02:10:02PM +0200, Björn Jacke wrote:
> Hi,
> 
> I came up a while ago with a patchset for MPTCP support for HAProxy also,
> see https://github.com/haproxy/haproxy/issues/1028
> 
> Back then I also discussed some ideas with Willy how to implement it more
> elegantly in HAProxy. It's still somewhere on my todo list but unfortunately
> I didn't catch that up since then. As I had a good real-world usecase
> recently for MPTCP I though of looking into this again also.

I had a look at this series, and am seeing that the largest part of the
changes is to use ->sock_prot (either IPPROTO_TCP or IPPROTO_MPTCP) in
getsockopt() and setsockopt() calls, which is not in Dorian's patchset.

Thus I'm just wondering if this is mandatory or if there's a compatibility
mode in the kernel so that IPPROTO_TCP also works for MPTCP sockets. Maybe
Matthieu knows better and could enlighten us on this ?

Because if it's required, we definitely need to apply similar changes
everywhere (the code has probably evolved a little bit since 2021), and
if not, it could mean that only the part that performs the protocol
registration (that Matthieu+Dorian also handled) and the doc should
suffice. I think we have everything in shape now to decide what's left
to do in order to get the feature merged. I hope this time we won't
miss the deadline!

Thanks,
Willy




Re: [PATCH] FEATURE: Adding MPTCP with option to disable it and fall-back to TCP

2024-08-08 Thread Willy Tarreau
Hi Matthieu,

first, sorry for the long delay, but each time it's the same, the list
of pending urgent things drags me away. I'm back on this.

On Mon, Jun 03, 2024 at 05:33:31PM +0200, Matthieu Baerts wrote:
> >>> and I'd really really prefer that we use the extended syntax for
> >>> addresses that offers a lot of flexibility. We can definitely have
> >>> "mptcp@address" and probably mptcp as a variant of tcp.
> >>
> >> >From what I understood, Dorian is now looking at that. It is not clear
> >> if he should create a new proto (struct protocol) or modifying it after
> >> having called protocol_lookup() in str2sa_range().
> > 
> > I *tend* to think that a new struct protocol is easier to add and
> > maintain. It could just share most (if not all) of the functions
> > with tcp, and probably be declared in the same file for ease of
> > sharing code.
> > 
> > I'm seeing that in the comment you linked above I also proposed a
> > keyword for "bind" and "server" lines, like we have "tfo" to enable
> > TCP fast-open. It tends to be slightly easier to implement but then
> > requires more care everywhere because such options do no apply to
> > all protocols, so if you have "mptcp on" on a "bind quic4@:443" line,
> > that's quite confusing so it deserves extra checks to make sure it
> > is not silently ignored. Also I do have some doubts about how to
> > retrieve the source address, maybe we'll find that in fact it should
> > be seen as a new address family and not a new transport layer. But I
> > think not at this point. And BTW Björn had apparently implemented a
> > solution based on mptcp@ as well so it's likely workable.
> 
> Yes, it looks better with a new 'struct protocol'.
> 
> I worked on a first draft a few weeks ago:
> 
>   https://github.com/matttbe/haproxy/commit/f48f36191
> 
> As I mentioned in a previous email, I can wait for you to be back to
> send a new version on the mailing list.

That looks pretty interesting and clean like this! I find your approach
of duplicating the tcp definition interesting, but I'd still prefer to
have it pre-defined rather than duplicated from tcp though. It's the only
one being done like this and I'm concerned that sooner or later we'll
break it the dirty way by checking that a list element is empty, that a
pointer is NULL or whatever. It's no big deal to have duplicated
definitions like this, and it eases "git grep" to find all affected
protocols when we want to touch a function.

I'm just wondering if the "alt" argument (that you renamed from ctrl_dgram)
is always exclusive with the mptcp one. It looks so at first glance. In
the worst case this would just end up with a bit field instead of just a
boolean, like many other functions.

But yes, overall I think that the approach looks like the correct one,
and it offers quite a lot of flexibility that way. Out of curiosity,
did you test it on the backend side ? I don't know what it implies to
do mptcp on the backend (if anything at all), it's more to make sure
we're not overlooking anything.

> >> Please note that Dorian is doing this as part of a work with his uni
> >> (useful contributions to Open-Source), and I think he would prefer
> >> sending the v2 before June, if that's OK. I can look at rebasing it
> >> later if needed. If there is no need to implement a fallback if the
> >> kernel doesn't support MPTCP, the patch should be smaller, it should be
> >> easy to rebase it.
> > 
> > Yeah, understood. I will have limited availability for the next two
> > weeks but if he needs some guidance to finish something almost done,
> > I'll do my best. It's important to encourage first-time contributors,
> > especially since it's easy to want to give up on a first patch feedback,
> > we've all known that :-)
> 
> Please take your time. As I mentioned in my previous email on the ML,
> Dorian is no longer available to work on that, so I will do my best to
> continue. No reason to rush, more to relax ;-)

Sure, but I'd like to get this merged before 3.1 is out ;-)

Feel free to send your patch(es) with a possibly updated commit messages
here on the list. I don't know if we can make a portable enough regtest
for this, especially if we consider that a silent fallback might lead to
TCP being used.

Thanks!
Willy




Re: [ANNOUNCE] haproxy-3.1-dev5

2024-08-07 Thread Willy Tarreau
Hi Ilya,

On Wed, Aug 07, 2024 at 08:30:46PM +0200,  ??? wrote:
> > HAProxy 3.1-dev5 was released on 2024/08/07. It added 88 new commits
> > after version 3.1-dev4.
> >
> > There were quite a bunch of fixes this time, spread over various areas
> > (h2, analysers, jwt, quic, 0-rtt, queues, traces), though nothing exciting
> > at this point.
> >
> 
> I'm not sure about roadmap of aws-lc support in HAProxy, but from my
> observation it reached parity with QuicTLS
> (*) performance of OpenSSL-1.1.1 (in some cases even 20% faster due to
> assembler implementation)
> (*) being actively developed by AWS (derived from BoringSSL)
> (*) all major QUIC features implemented
> (*) no LTS cycles ((

I totally agree. Its performance is even significantly higher than 1.1.1
(we've collected some measurements and are working on an article about
that). And indeed what's missing is LTS. I would really love it if the
QuicTLS and aws-lc teams could join efforts and try to maintain an LTS
version of this high-quality library! All the QUIC community would benefit
from this!

Willy




[ANNOUNCE] haproxy-3.1-dev5

2024-08-07 Thread Willy Tarreau
s
   Code reports : https://www.haproxy.org/l/code-reports
   Latest builds: https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (23):
  MINOR: quic: delay Retry emission on quic-force-retry
  MEDIUM: quic: implement quic-initial rules
  MINOR: quic: support ACL for quic-initial rules
  MINOR: quic: pass quic_dgram as obj_type for quic-initial rules
  MINOR: quic: implement reject quic-initial action
  MINOR: quic: implement send-retry quic-initial rules
  BUG/MEDIUM: quic: fix invalid conn reject with CONNECTION_REFUSED
  BUG/MEDIUM: quic: prevent conn freeze on 0RTT undeciphered content
  MINOR: flags/mux-quic: decode qcc and qcs flags
  BUG/MINOR: quic: fix fc_rtt/srtt values
  BUG/MIONR: quic: fix fc_lost
  BUG/MINOR: h1: do not forward h2c upgrade header token
  BUG/MINOR: h2: reject extended connect for h2c protocol
  MINOR: quic: convert qc_stream_desc release field to flags
  MINOR: quic: implement function to check if STREAM is fully acked
  BUG/MEDIUM: quic: handle retransmit for standalone FIN STREAM
  MINOR: quic: enforce ACK reception is handled in order
  MINOR: mux-quic: define dump functions for QCC and QCS
  MINOR: mux-quic: implement debug string for logs
  MINOR: quic: dump quic_conn debug string for logs
  MINOR: time: define tot_time structure
  MINOR: mux-quic: measure QCS lifetime and its blocking state
  BUG/MINOR: quic: prevent freeze after early QCS closure

Aurelien DARRAGON (3):
  MEDIUM: sink: assume sft appctx stickiness
  BUG/MEDIUM: server/addr: fix tune.events.max-events-at-once event miss 
and leak
  MINOR: server: ensure max_events_at_once > 0 in server_atomic_sync()

Christopher Faulet (9):
  BUG/MEDIUM: jwt: Clear SSL error queue on error when checking the 
signature
  DOC: config: Add documentation about spop mode for backends
  BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was 
set
  BUG/MEDIUM: mux-pt/mux-h1: Release the pipe on connection error on 
sending path
  BUILD: mux-pt: Use the right name for the sedesc variable
  BUG/MEDIUM: http-ana: Report error on write error waiting for the response
  BUG/MEDIUM: h2: Only report early HTX EOM for tunneled streams
  BUG/MEDIUM: mux-h2: Propagate term flags to SE on error in 
h2s_wake_one_stream
  BUG/MEDIUM: peer: Notify the applet won't consume data when it waits for 
sync

Frederic Lecaille (7):
  BUG/MINOR: quic: Lack of precision when computing K (cubic only cc)
  MINOR: quic: Add information to "show quic" for CUBIC cc.
  MINOR: quic: Dump TX in flight bytes vs window values ratio.
  MINOR: tcp_sample: Move TCP low level sample fetch function to control 
layer
  MINOR: quic: Define ->get_info() control layer callback for QUIC
  BUG/MINOR: quic: Too shord datagram during O-RTT handshakes (aws-lc only)
  BUG/MINOR: quic: Too short datagram during packet building failures 
(aws-lc only)

Ilia Shipitsin (4):
  CI: add weekly QUIC Interop regression against AWS-LC
  CI: harden NetBSD builds by ERR=1
  DEV: coccinelle: add a test to detect unchecked strdup()
  BUG/MINOR: fcgi-app: handle a possible strdup() failure

Valentine Krasnobaeva (9):
  MINOR: cfgparse: add struct cfgfile to represent config in memory
  REORG: tools: move list_append_word to cfgparse
  MINOR: startup: adapt list_append_word to use cfgfile
  MINOR: cfgparse: add load_cfg_in_mem
  MINOR: cfgparse: load_cfg_in_mem: take in account file size
  MINOR: tools: add fgets_from_mem
  MEDIUM: startup: make read_cfg() return immediately on ENOMEM
  MEDIUM: startup: load and parse configs from memory
  MINOR: startup: rename readcfgfile in parse_cfg

William Lallemand (7):
  MEDIUM: ssl/quic: implement quic crypto with EVP_AEAD
  MINOR: quic: rename confusing wording aes to hp
  MEDIUM: quic: add key argument to header protection crypto functions
  MEDIUM: quic: implement CHACHA20_POLY1305 for AWS-LC
  BUG/MEDIUM: ssl: reactivate 0-RTT for AWS-LC
  BUG/MEDIUM: ssl: 0-RTT initialized at the wrong place for AWS-LC
  BUILD: ssl: replace USE_OPENSSL_AWSLC by OPENSSL_IS_AWSLC

Willy Tarreau (26):
  MEDIUM: h1: allow to preserve keep-alive on T-E + C-L
  BUILD: cfgparse-quic: fix build error on Solaris due to missing 
netinet/in.h
  MINOR: queue: add a function to check for TOCTOU after queueing
  BUG/MEDIUM: queue: deal with a rare TOCTOU in assign_server_and_queue()
  BUG/MINOR: stconn: bs.id and fs.id had their dependencies incorrect
  DOC: configuration: fix alphabetical ordering of {bs,fs}.aborted
  MINOR: stconn: add a new pair of sf functions {bs,fs}.debug_str
  MINOR: mux-h2: implement the debug string for logs
  BUG/MINOR: trace/quic: enable conn/session pointer recovery from quic_conn
 

Re: [PR] Create SECURITY.md

2024-08-06 Thread Willy Tarreau
On Tue, Aug 06, 2024 at 09:59:41PM +0200, Nicolas CARPi wrote:
> Hello all,
> 
> I wrote a blog post regarding the availability of this file for the 
> biggest 5000 websites. Have a look if you're interested:
> 
> https://www.deltablot.com/posts/state-of-security-txt/

Thank you Nicolas!

Willy




Re: [PATCH] src/fcgi-app.c: handle strdup failure

2024-08-05 Thread Willy Tarreau
On Tue, Aug 06, 2024 at 08:35:21AM +0200,  ??? wrote:
> I'll provide better script. It is not actual patching, but detection (which
> looks like a patching)

Yes I know, I'm used to that as well. Our other scripts do that, they
provide a patch and that's used to detect in the end. But that's
convenient, particularly when combined with source control management
like git, as it suggests a change that's trivial to edit. The most
important is to locate places that need to be checked by a human, the
proposed fix is only a support for this and occasionally a time saver.

Willy




Re: [PATCH] src/fcgi-app.c: handle strdup failure

2024-08-05 Thread Willy Tarreau
On Tue, Aug 06, 2024 at 05:16:11AM +0200, Willy Tarreau wrote:
> > diff --git a/src/fcgi-app.c b/src/fcgi-app.c
> > index b3a9b7c59..98077b959 100644
> > --- a/src/fcgi-app.c
> > +++ b/src/fcgi-app.c
> > @@ -606,6 +606,8 @@ static int proxy_parse_use_fcgi_app(char **args, int 
> > section, struct proxy *curp
> > if (!fcgi_conf)
> > goto err;
> > fcgi_conf->name = strdup(args[1]);
> > +   if (!fcgi_conf->name)
> > +   goto err;
> > LIST_INIT(&fcgi_conf->param_rules);
> > LIST_INIT(&fcgi_conf->hdr_rules);
> 
> Looks good to me, thanks! I think we can also commit your cocci script
> as dev/coccinelle/strdup.cocci. We'll also mark the fix as BUG/MINOR.

Now merged as two distinct commits, one for the script and one for the fix.
Thank you Ilya!
willy




Re: [PATCH] src/fcgi-app.c: handle strdup failure

2024-08-05 Thread Willy Tarreau
On Mon, Aug 05, 2024 at 09:02:46PM +0200,  ??? wrote:
> updated patch attached (I preferred that instead of sending with "git
> send-email")
> 
> ??, 5 ???. 2024 ?. ? 20:10,  ??? :
> 
> >
> >
> > ??, 5 ???. 2024 ?. ? 20:09, William Lallemand :
> >
> >> On Mon, Aug 05, 2024 at 08:01:39PM +0200,  ??? wrote:
> >> > Subject: Re: [PATCH] src/fcgi-app.c: handle strdup failure
> >> > ??, 5 ???. 2024 ?. ? 19:56, William Lallemand :
> >> >
> >> > > On Mon, Aug 05, 2024 at 07:17:48PM +0200, Ilia Shipitsin wrote:
> >> > > > Subject: [PATCH] src/fcgi-app.c: handle strdup failure
> >> > > > found by coccinelle
> >> > >
> >> > > Please add clearer commit messages in your patches, you tend to
> >> minimize
> >> > > them, thanks ! :-)
> >> > >
> >> >
> >> > truth to be told, I tend to write longer messages than usually :)
> >> >
> >> >
> >> > >
> >> > > > ---
> >> > > >  src/fcgi-app.c | 5 -
> >> > > >  1 file changed, 4 insertions(+), 1 deletion(-)
> >> > > >
> >> > > > diff --git a/src/fcgi-app.c b/src/fcgi-app.c
> >> > > > index b3a9b7c59..d96bb222c 100644
> >> > > > --- a/src/fcgi-app.c
> >> > > > +++ b/src/fcgi-app.c
> >> > > > @@ -606,6 +606,8 @@ static int proxy_parse_use_fcgi_app(char
> >> **args, int
> >> > > section, struct proxy *curp
> >> > > >   if (!fcgi_conf)
> >> > > >   goto err;
> >> > > >   fcgi_conf->name = strdup(args[1]);
> >> > > > + if (!fcgi_conf->name)
> >> > > > + goto err;
> >> > > >   LIST_INIT(&fcgi_conf->param_rules);
> >> > > >   LIST_INIT(&fcgi_conf->hdr_rules);
> >> > > >
> >> > > > @@ -622,7 +624,8 @@ static int proxy_parse_use_fcgi_app(char
> >> **args, int
> >> > > section, struct proxy *curp
> >> > > >   return retval;
> >> > > >err:
> >> > > >   if (fcgi_conf) {
> >> > > > - free(fcgi_conf->name);
> >> > > > + if (fcgi_conf->name)
> >> > > > + free(fcgi_conf->name);
> >> > >
> >> > > You don't need to add a check there, free(NULL) does nothing.
> >> > >
> >> >
> >> > I tried to figure out whether that behaviour is by chance or by purpose
> >> > (taking into account variety of implementations on different OSes)
> >> >
> >> > I'll try again
> >> >
> >>
> >> This is a standard behavior that is specified in POSIX, honestly HAProxy
> >> won't probably run if it wasn't behaving this
> >> way on some weird OS.
> >>
> >> https://pubs.opengroup.org/onlinepubs/009695399/functions/free.html "If
> >> ptr is a null pointer, no action shall occur."
> >>
> >
> > "man 3 free" didn't give me any link to POSIX.
> > I've should have a look at POSIX directly, but I didn't
> >
> >
> >>
> >> --
> >> William Lallemand
> >>
> >

> From baa55f1434b2ed5288536f62b8e322ed20904b8d Mon Sep 17 00:00:00 2001
> From: Ilia Shipitsin 
> Date: Mon, 5 Aug 2024 20:59:09 +0200
> Subject: [PATCH] src/fcgi-app.c: handle strdup failure
> 
> this defect is found by the following coccinelle script:
> 
> // find calls to strdup
> @call@
> expression ptr;
> position p;
> @@
> 
> ptr@p = strdup(...);
> 
> // find ok calls to strdup
> @ok@
> expression ptr;
> position call.p;
> @@
> 
> ptr@p = strdup(...);
> ... when != ptr
> (
>  (ptr == NULL || ...)
> |
>  (ptr == 0 || ...)
> |
>  (ptr != NULL || ...)
> |
>  (ptr != 0 || ...)
> )
> 
> // fix bad calls to strdup
> @depends on !ok@
> expression ptr;
> position call.p;
> @@
> 
> ptr@p = strdup(...);
> + if (ptr == NULL) return;
> ---
>  src/fcgi-app.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/fcgi-app.c b/src/fcgi-app.c
> index b3a9b7c59..98077b959 100644
> --- a/src/fcgi-app.c
> +++ b/src/fcgi-app.c
> @@ -606,6 +606,8 @@ static int proxy_parse_use_fcgi_app(char **args, int 
> section, struct proxy *curp
>   if (!fcgi_conf)
>   goto err;
>   fcgi_conf->name = strdup(args[1]);
> + if (!fcgi_conf->name)
> + goto err;
>   LIST_INIT(&fcgi_conf->param_rules);
>   LIST_INIT(&fcgi_conf->hdr_rules);

Looks good to me, thanks! I think we can also commit your cocci script
as dev/coccinelle/strdup.cocci. We'll also mark the fix as BUG/MINOR.

Thanks!
Willy




Re: [PR] Create SECURITY.md

2024-08-05 Thread Willy Tarreau
On Sat, Aug 03, 2024 at 12:23:03PM +, PR Bot wrote:
> Dear list!
> 
> Author: Valen1393 
> Number of patches: 1
> 
> This is an automated relay of the Github pull request:
>Create SECURITY.md
> 
> Patch title(s): 
>Create SECURITY.md
> 
> Link:
>https://github.com/haproxy/haproxy/pull/2661
> 
> Edit locally:
>wget https://github.com/haproxy/haproxy/pull/2661.patch && vi 2661.patch
> 
> Apply locally:
>curl https://github.com/haproxy/haproxy/pull/2661.patch | git am -
> 
> Description:
>Sabar

Not very descriptive commit message... Anyway, I don't see any value in
this. First, the address to contact for security reports is already
present in the doc (and some find it since we occasionally receive
messages there). Second, the list of supported versions is already on
the web site, and the maintenance status of each version as well as the
planned EOL is even reported in the executables themselves, so at best
this new file would add nothing, at worst it would add extra maintenance
and possible confusion when out of sync.

Thanks,
Willy




Re: [PR] Create SECURITY.md

2024-08-05 Thread Willy Tarreau
Hi Nicolas,

[ dropped security@, it's not too much spammed yet, I prefer to limit
  exposure ]

On Sun, Aug 04, 2024 at 08:20:33PM +0200, Nicolas CARPi wrote:
> Hello list,
> 
> This PR made me think about the new security.txt standard - or at least 
> proposed standard: https://securitytxt.org/
> 
> Basically, you serve a text file at .well-known/security.txt, and this 
> should be the first place to look for a contact to send security reports 
> to the dev team by security researchers.
> 
> It is also where entities such as ANSSI will try and contact you, as 
> explained on this page: https://www.cert.ssi.gouv.fr/signalements/. I've 
> seen their bot in my logs trying to fetch this URL, which prompted me to 
> add it to my websites.
> 
> This message is for the maintainers of haproxy.org, but for anyone 
> reading and interested in adding it, this is what I use:
> 
> acl securitytxt-acl path_beg /.well-known/security.txt
> http-request return status 200 content-type text/plain string "Contact: 
> mailto:y...@company.com\n[RestOfTheFile]\n"; if securitytxt-acl

That's indeed a simple way of implementing it, thanks for sharing. I'm
not much convinced myself by this approach however. Nobody knows about
it, and only a handful of sites are using it. I personally think that
there's nothing special in the "security" aspect that warrants such a
name, and that having instead a contact list for various purposes,
including security would be more beneficial. Sometimes you'd just like
to contact the webmaster to signal a bug, or you might want to report
a security issue in a product, which is independent from the site, etc.

Thanks,
Willy




Re: [ANNOUNCE] haproxy-3.0.3

2024-08-04 Thread Willy Tarreau
On Sun, Aug 04, 2024 at 09:39:25PM +0200, Vincent Bernat wrote:
> 
> On 2024-07-22 21:59, Willy Tarreau wrote:
> 
> > > > HAProxy 3.0.3 was released on 2024/07/11. It added 42 new commits
> > > > after version 3.0.2.
> > > 
> > > I am late releasing this version on haproxy.debian.net. I have issues with
> > > compiling for ARM64. There is a known bug in recent version of QEMU
> > > segfaulting on ARM64 userland emulation. Previously, I was able to just
> > > downgrade to an older version of QEMU, but this does not work anymore. I
> > > suppose there was some changes in recent kernel versions. I'll need to try
> > > again in a VM.
> > 
> > OK, thanks for letting us know!
> > 
> > I could probably propose you an access to one physical machine, but I
> > guess this would not help much because it would likely completely
> > change your build process. Do not hesitate to let us know if there's
> > anything we can do to help you.
> 
> It seems the segfault is random while using user emulation. So, I have just
> to try a few times... Hoping it will be fixed at some point. Until then,
> 3.0.3 packages for Debian/Ubuntu are online.

OK, thank you Vincent!
Willy




Re: Bug Report

2024-08-01 Thread Willy Tarreau
Hello,

On Thu, Aug 01, 2024 at 09:09:18PM +0500, Jenny Rose wrote:
> Hi Team,
> I hope you are well.
> 
> I would like to share another vulnerability of your website
> 
> Vulnerability 1: Non - secure requests are not automatically upgraded to
> HTTPS | HSTS missing
> 
> Description
> 
> The application fails to prevent users from connecting to it over
> unencrypted connections.
(...)

Supporting clear connections is an absolute necessity for a huge
amount of users, and by the way you would probably not be able to
validate some certificates, download some SSL libraries nor some
browsers if all sites were enforcing encryption everywhere.

This has nothing to do with a vulnerability, it works as designed and
as desired. Anyone is free to bind or not to clear ports, there's no
automatic binding, and configurations to apply various redirect options
are widely available.

Last, there are users who are extremely cautious about HSTS due to the
way they create "super cookies" that track a user forever and that, with
some browsers, even allows to track the user across multiple devices.

Thanks,
Willy




Re: Some HTTP connections not closing properly on Haproxy 2.8.10

2024-07-26 Thread Willy Tarreau
Hi Jens,

On Fri, Jul 26, 2024 at 07:51:47PM +0200, Jens Wahnes wrote:
> Hi everyone,
> 
> I'm trying to move from Haproxy 2.4 to 2.8 and encountered some trouble with
> 2.8 that did not occur on 2.4.
> Specifically, this seems to concern plain HTTP connections only, i.e.
> non-HTTPS traffic. I have not seen an example of this happening with HTTPS
> connections. However, it could also be that it is specific to HTTP/1 as
> opposed to HTTP/2 (we do not have any unencrypted HTTP/2 connections).
> 
> Some of the backend connections are not properly closed with Haproxy 2.8 and
> keep lingering for hours or days. That is, the client connection is gone,
> but the backend connection is still there in state CLOSE_WAIT. This in turn
> causes trouble with seamless reloads: the old Haproxy processes never
> terminate and as several newer ones pile up, they take up memory.
> 
> I have not experienced this with Haproxy 2.4.27. On 2.4.27, these backend
> TCP connections were properly closed. It only began to be a problem with
> never versions like 2.8.10.
> 
> To give an example, a connection stuck in CLOSE_WAIT may look this in `ss`
> output:
> 
> ```
> CLOSE-WAIT 0  0  172.16.240.53:48528
> 172.16.240.86:2000 users:(("haproxy",pid=1099145,fd=239))
> ```
> 
> In Haproxy's own `show sess`, the above example would look like this:
> 
> ```
> 0x7efc6b7b5400: [26/Jul/2024:18:24:50.308214] id=237988 proto=tcpv4
> source=10.22.113.130:50205
>   flags=0x53c0a, conn_retries=0, conn_exp= conn_et=0x000
> srv_conn=(nil), pend_pos=(nil) waiting=0 epoch=0
>   frontend=Loadbalancer (id=48 mode=http), listener=http (id=5)
> addr=10.210.18.56:80
>   backend=_std_ (id=3 mode=http) addr=172.16.240.53:48528
>   server=hermes (id=1) addr=172.16.240.86:2000
>   task=0x7efc72205ec0 (state=0x00 nice=0 calls=6 rate=0 exp=
> tid=1(1/1) age=17m41s)
>   txn=0x7efc6bf72aa0 flags=0x0 meth=1 status=403 req.st=MSG_CLOSED
> rsp.st=MSG_CLOSING req.f=0x44 rsp.f=0x0d
>   scf=0x7efc7233e8a0 flags=0x00030406 state=CLO
> endp=CONN,0x7efc6d080ec0,0x04014a01 sub=0 rex= wex=
>   h1s=0x7efc6d080ec0 h1s.flg=0x104010 .sd.flg=0x4014a01
> .req.state=MSG_DONE .res.state=MSG_DATA
>   .meth=GET status=403 .sd.flg=0x04014a01 .sc.flg=0x00030406
> .sc.app=0x7efc6b7b5400 .subs=(nil)
>   h1c=0x7efc6d5f5d40 h1c.flg=0x20d00 .sub=0 .ibuf=0@(nil)+0/0
> .obuf=0@(nil)+0/0 .task=0x7efc6d004360 .exp=
>   co0=0x7efc7230f140 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM
> target=LISTENER:0x7efc712d5e00
>   flags=0x801c0300 fd=194 fd.state=1c22 updt=0 fd.tmask=0x2
>   scb=0x7efc6b7c85e0 flags=0x0003501b state=CLO
> endp=CONN,0x7efc6d095bc0,0x05004a01 sub=0 rex= wex=
>   h1s=0x7efc6d095bc0 h1s.flg=0x4010 .sd.flg=0x5004a01
> .req.state=MSG_DONE .res.state=MSG_DONE
>   .meth=GET status=403 .sd.flg=0x05004a01 .sc.flg=0x0003501b
> .sc.app=0x7efc6b7b5400 .subs=(nil)
>   h1c=0x7efc6d087280 h1c.flg=0x8100 .sub=0 .ibuf=0@(nil)+0/0
> .obuf=0@(nil)+0/0 .task=0x7efc6d052500 .exp=
>   co1=0x7efc723ede00 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM
> target=SERVER:0x7efc71e48000
>   flags=0x00040300 fd=239 fd.state=1122 updt=0 fd.tmask=0x2
>   req=0x7efc6b7b5420 (f=0x848000 an=0x4 pipe=0 tofwd=0 total=113)
>   an_exp= buf=0x7efc6b7b5428 data=(nil) o=0 p=0 i=0 size=0
>   htx=0x8c65c0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
>   res=0x7efc6b7b5470 (f=0xa024 an=0x2400 pipe=2310 tofwd=0
> total=25601)
>   an_exp= buf=0x7efc6b7b5478 data=(nil) o=0 p=0 i=0 size=0
>   htx=0x8c65c0 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0
> ```
> 
> Issuing a `shutdown session` command does not resolve the issue (e.g.
> `shutdown session 0x7efc6b7b5400` in this example). So far, killing the
> whole process has been the only way for me to get rid of these leftover
> connections.

First, thanks a lot for all the details. I'm seeing a non-zero "pipe"
value above, indicating you're using splicing between the two sides.
Could you please comment whatever "option splice*" you have in your
config so that we can confirm whether it's related or not ?

> So my question is: does this ring a bell for anyone? Maybe something that
> has been fixed in newer versions (like 3.0)?

No, it really doesn't ring any bell right now. In any case if such a bug
had been identified, the fix would have been backported to affected
versions. I'm not seeing anything related post-2.8 either.

However I'm seeing a few post-2.9 changes related to splicing, so it may
be possible that it has accidently been fixed. As such it would be very
useful if you could test 2.9 or 3.0, as this could help us figure whether
it is only 2.8-specific or affects any version starting from 2.8.

Also, if you want to experiment something regarding the possibilities to
close such FDs, you can spot the FD number after the second "fd==" in
"show sess" (backend side), and forcefully close it by issuing:

  expert-mode on; debug dev clode XXX

where XXX is the FD number. Then issue the "sho

Re: [ANNOUNCE] haproxy-3.1-dev2

2024-07-26 Thread Willy Tarreau
On Fri, Jul 26, 2024 at 10:40:46AM +0200,  ??? wrote:
> > > next point: 0-RTT
> >
> > Sorry for the stupid questions but as I don't maintain any QUIC based apps
> > I'm
> > curious if this really makes any differences compared to "normal" requests.
> >
> > How often is this really used in the real world setup?
> > Have anybody seen any real benefit for users out there with 0-RTT?
> >
> 
> very beneficial
> Introducing Zero Round Trip Time Resumption (0-RTT) (cloudflare.com)
> 

When we were wondering whether or not to adopt the compat layer for
openssl that was suggested by NGINX and that does not support 0-RTT,
we asked the IETF QUIC WG what their opinion on the subject was, and
it was unanimous: "don't do that!". They considered that exposing QUIC
without 0-RTT would hurt the protocol's image because it's expected to
be used a lot. I think it can vary depending on whether users visit a
site often or not (e.g multiple times a day or once a day).

We know that on the OpenSSL front it will remain limited. But for libs
that don't need hacks, it's definitely important that we try to support
the protocol fully.

Willy




Re: [ANNOUNCE] haproxy-3.1-dev4

2024-07-24 Thread Willy Tarreau
Hi Alex,

On Wed, Jul 24, 2024 at 10:32:16PM +0200, Aleksandar Lazic wrote:
> >- SPOE: the old applet-based architecture was replaced with the new
> >  mux-based one which allows idle connections sharing between threads,
> >  as well as queuing, load balancing, stickiness etc per request instead
> >  of per-connection and adds a lot of flexibility to the engine. We'd
> >  appreciate it a lot if SPOE users would take some time to verify that
> >  it works at least as well for them as before (and hopefully even
> >  better). Some good ideas may spark. Please check Christopher's
> >  response to the SPOE thread for more info.
> 
> Cool. Thank you that you handle this topic, even I don't use it for now :-)

Hehe, who knows, maybe one day you'll have a good use for it :-)

> >- ocsp: some processing was refined to better handle a corner case where
> >  the issuer chain is not in the same PEM file, though it also slightly
> >  changes how this is handled on the CLI.
> 
> [snipp]
> 
> Does this announcement have any impact to HAProxy?
> 
> "Intent to End OCSP Service"
> https://letsencrypt.org/2024/07/23/replacing-ocsp-with-crls.html
> https://news.ycombinator.com/item?id=41046956

I noticed it on LWN today but I really have no idea. I'll let the SSL
experts chime in.

Cheers,
Willy




Test, please ignore

2024-07-24 Thread Willy TARREAU
Just testing the new incoming mail gateway. Please ignore.

Willy




[ANNOUNCE] haproxy-3.1-dev4

2024-07-24 Thread Willy Tarreau
ence removed CPU arg

Valentine Krasnobaeva (20):
  MINOR: limits: prepare to keep limits in one place
  REORG: fd: move raise_rlim_nofile to limits
  CLEANUP: fd: rm struct rlimit definition
  REORG: global: move rlim_fd_*_at_boot in limits
  MINOR: haproxy: prepare to move limits-related code
  REORG: haproxy: move limits handlers to limits
  MINOR: limits: add is_any_limit_configured
  BUG/MINOR: limits: fix license type in limits.h
  MINOR: debug: prepare feed_post_mortem_late
  CLEANUP: debug: fix indents in debug_parse_cli_show_dev
  MINOR: debug: store runtime uid/gid in postmortem
  MINOR: debug: keep runtime capabilities in post_mortem
  MINOR: debug: use LIM2A to show limits
  MINOR: debug: prepare to show runtime limits
  MINOR: debug: keep runtime limits in postmortem
  BUG/MEDIUM: ssl_sock: fix deadlock in ssl_sock_load_ocsp() on error path
  MEDIUM: ocsp: fix ocsp when the chain is loaded from 'issuers-chain-path'
  BUG/MEDIUM: startup: fix zero-warning mode
  MINOR: cfgparse-global: move mode's keywords in cfg_kw_list
  MINOR: cfgparse-global: move no in cfg_kw_list

William Lallemand (5):
  MEDIUM: ssl: add extra_chain to ckch_data
  MINOR: ssl: change issuers-chain for show_cert_detail()
  REGTESTS: ssl: test the issuers-chain-path keyword
      DOC: configuration: issuers-chain-path not compatible with OCSP
  DOC: configuration: issuers-chain-path is compatible with OCSP

Willy Tarreau (5):
  BUILD: mux-spop: fix build failure on gcc 4-10 and clang
  MINOR: fd: don't scan the full fdtab on all threads
  BUG/MEDIUM: debug/cli: fix "show threads" crashing with low thread counts
  BUG/MAJOR: mux-h2: force a hard error upon short read with pending error
  DOC: config: improve the http-keep-alive section

---



Re: About the SPOE

2024-07-24 Thread Willy Tarreau
On Wed, Jul 24, 2024 at 03:48:15PM +0200, Christopher Faulet wrote:
> As announced, the SPOE was finally refactored. This new SPOE will be shipped
> with the 3.1-dev4. It is a full rewrite of the engine, based on a dedicated
> SPOP multiplexer. It means a "spop" proxy mode, used for SPOE backends, was
> added. It is now the default mode for backends used by SPOE engines. "mode
> spop" can be explicitly specified, but otherwise, the mode is detected
> during post-parsing. So, these backends cannot be mixed for a raw TCP usage.
> 
> Thanks to this refactoring, SPOP connections are now properly reused,
> similarly to HTTP connections. So, it is now possible to properly balance
> connections when several servers are configured. The management of IDLE
> connections used on HTTP backends is used for SPOP connections. It is a huge
> improvement compared to what was performed by hand with the pool of SPOE
> applets.

In addition to these, what's important to understand is that previously,
connections were load-balanced between agents while now it's messages.
Thus the control is much finer. For example "balance leastconn" can make
some sense for always sending a request to the least loaded agent;
"balance hash var(ptxn.XXX)" can make sense as well to perform some
affinity and improve context reuse on the backend, and so on.

It's never easy to enumerate all possibilities offered by switching from
per-connection processing to per-request processing, but it's basically
the same as we gained long ago by switching from forwarding SSL as raw
TCP connections and later decoding SSL to process each individual request!

That's why I second Christopher regarding this quest for testers. It's
possible some will feel that something tiny is missing to achieve a great
step forward in their use cases, it'd better be expressed early ;-)

Willy



Re: [ANNOUNCE] haproxy-3.0.3

2024-07-22 Thread Willy Tarreau
Hi Vincent,

On Mon, Jul 22, 2024 at 09:25:45PM +0200, Vincent Bernat wrote:
> On 2024-07-11 16:51, Willy Tarreau wrote:
> > Hi,
> > 
> > HAProxy 3.0.3 was released on 2024/07/11. It added 42 new commits
> > after version 3.0.2.
> 
> I am late releasing this version on haproxy.debian.net. I have issues with
> compiling for ARM64. There is a known bug in recent version of QEMU
> segfaulting on ARM64 userland emulation. Previously, I was able to just
> downgrade to an older version of QEMU, but this does not work anymore. I
> suppose there was some changes in recent kernel versions. I'll need to try
> again in a VM.

OK, thanks for letting us know!

I could probably propose you an access to one physical machine, but I
guess this would not help much because it would likely completely
change your build process. Do not hesitate to let us know if there's
anything we can do to help you.

Cheers,
Willy



Re: [ANNOUNCE] haproxy-3.1-dev3 (more infos on the story with fd-hard-limit and systemd)

2024-07-17 Thread Willy Tarreau
On Wed, Jul 17, 2024 at 02:06:03PM +0200, Lukas Tribus wrote:
> On Wed, 17 Jul 2024 at 11:25, Willy Tarreau  wrote:
> >
> > At this point, do you (or anyone else) still have any objection against
> > backporting the DEFAULT_MAXFD patch so as to preserve the current
> > defaults for users, and/or do you have any alternate proposal, or just
> > want to discuss other possibilities ?
> 
> No, I don't have objections.

OK thank you!

Willy



Re: [ANNOUNCE] haproxy-3.1-dev3 (more infos on the story with fd-hard-limit and systemd)

2024-07-17 Thread Willy Tarreau
Hi Lukas,

On Tue, Jul 16, 2024 at 11:28:12PM +0200, Lukas Tribus wrote:
> Hi Valentine, hi Willy,
> 
> after spending some time testing I agree tuning maxconn/fd-limits is hard ...

In fact we know that it's hard for experts, and it's even harder for
new users.

> With 8GB RAM we can still OOM with 1M FDs / 500k maxconn (no TLS), but
> it appears to be around the sweetspot.

It happens to be the previous common limit, with plenty of distros
offering 1M FDs by default. For me it's already large, I would be
fine with a much lower value, but lowering a limit could possibly
break some use cases, and given that basically nobody complained
about OOM in regular usages recently, it was sort of a confirmation
that the default 1M that was applied for quite some time was OK for
most users.

> It thought it would require more memory considering that we suggest
> 1GB of memory for 20k non-TLS connections or 8k TLS connections, but
> my test was indeed synthetic with zero features used, and it's not
> only about haproxy userspace but the system as well.

Yes, it's really a whole. Its even more difficult nowadays because
it depends if data stay stuck in system buffers or not. Haproxy
aggressively recycles its buffers, and since 3.0 it even avoids
reading data if some are stuck downstream so it tries to be very
memory efficient, but we know that memory usage can increase a lot
with SSL and congested links.

> lukas@dev:~/haproxy$ git grep -B3 -A1 "GB of RAM"
(...)

I think we might be doing better than before on this but I don't
want to encourage users to put more connections on low memory. RAM
is so cheap that the smallest VM you can find has 2GB of RAM for a
single CPU core, so better encourage them to have enough RAM to
evacuate the data rather than risking OOM.

I would *love* to find a reasonable way to autosize all this based
on available RAM, but there's no portable solution to this, which
means that whatever hack we'll do would still require a fallback
anyway.

> > What I would really like is to no longer see any maxconn in a regular
> > configuration because there's no good value and we've seen them copied
> > over and over.
> 
> By setting a global maxconn you force yourself to at least think about it.

I totally agree with that. However our experience (particularly from
discussing appliance sizing with potential customers) is that nobody
has any idea of their maxconn, nor even within an order of magnitude.
Most users usually know more or less what their bandwidth is, sometimes
number of visitors per day, and that's about all. We really need to
consider that those who post here and on github are the exception and
not the norm. Another factor that comes into play is how connections
are accounted. Those coming from other LBs acting at a lower layer
(e.g. LVS) count the total number of connections tracked in the LB's
table, which includes TIME_WAIT, which can represent 90-99% depending
on the workloads! That's how you hear that port 80 in charge of the
redirect of a web site working exclusively in SSL, that gets 1000
conn/s requires 6 concurrent connections while the reality is
rather less than 100!

> I'm assuming by "regular configuration" you mean small scale/size? In
> this case I agree.

Yes that was it. The vast majority of users have way less than even
100 req/s on average and technical parameters like maxconn are the
least of their concerns.

> > I'm confused now, I don't see how, given that the change only *lowers*
> > an existing limit, it never raises it. It's precisely because of the
> > risk of OOM with OSes switching the default from one million FDs to one
> > billion that we're proposing to keep the previous limit of 1 million as
> > a sane upper bound. The only risk I'm seeing would be users discovering
> > that they cannot accept more than ~500k concurrent connections on a large
> > system. But I claim that those dealing with such loads *do* careful size
> > and configure their systems and services (RAM, fd, conntrack, monitoring
> > tools etc). Thus I'm not sure which scenario you have in mind that this
> > change could result in such a report as above.
> 
> True, I confused memory required for initialization with memory
> allocated when actually used.

OK no pb!

> On Thu, 11 Jul 2024 at 14:44, Willy Tarreau  wrote:
> >
> > My take on this limit is that most users should not care. Those dealing
> > with high loads have to do their homework and are used to doing this,
> > and those deploying in extremely small environments are also used to
> > adjusting limits (even sometimes rebuilding with specific options), and
> > I'm fine with leaving a bit of work for both extremities.
> 
> Considering ho

Re: [PATCH] DOC: install: don't reference removed CPU arg

2024-07-16 Thread Willy Tarreau
On Tue, Jul 16, 2024 at 05:47:50PM +, Lukas Tribus wrote:
> Remove reference to the removed CPU= build argument in commit 018443b8a1
> ("BUILD: makefile: get rid of the CPU variable").

Oops, good catch, thank you Lukas! I've marked it for backporting to 3.0
as well. Now merged, thanks,
Willy



Re: [PATCH 1/1]: BUILD/MINOR: haproxy: fix SO_LINGER usage on macOs.

2024-07-15 Thread Willy TARREAU
On Tue, Jul 16, 2024 at 05:27:10AM +0100, David CARLIER wrote:
> Hi you are right I did not check properly, had a brain fog :) we can forget
> it. Thanks.

OK perfect, thanks :-)

Willy



Re: [PATCH 1/1]: BUILD/MINOR: haproxy: fix SO_LINGER usage on macOs.

2024-07-15 Thread Willy TARREAU
Hi David!

On Mon, Jul 15, 2024 at 10:29:48PM +0100, David CARLIER wrote:
> Hi here a little patch proposal targeted for macOs.
> 
> Cheers.

> From df5741a0d391a7107157d0051ba81ef48d87b8f5 Mon Sep 17 00:00:00 2001
> From: David Carlier 
> Date: Mon, 15 Jul 2024 22:20:33 +0100
> Subject: [PATCH] BUILD/MEDIUM: haproxy : fix SO_LINGER usage on macOS.
> 
> SO_LINGER on macOS works in term of clock ticks rather than seconds
> leading to behavior differences. SO_LINGER_SEC is available to
> be more in line so we create HA_SO_LINGER internal.

How would that make any difference given that we're using it *only*
with zero ? From what I'm seeing both use the same "linger" struct
so 0 ticks or 0 seconds shouldn't make any difference. What different
behavior did you observe there ?

thanks!
Willy



[ANNOUNCE] haproxy-3.0.3

2024-07-11 Thread Willy Tarreau
 BUG/MINOR: server: fix first server template name lookup UAF
  BUG/MEDIUM: server/dns: prevent DOWN/UP flap upon resolution timeout or 
error

Christopher Faulet (9):
  BUG/MEDIUM: stick-table: Decrement the ref count inside lock to kill a 
session
  BUG/MINOR: promex: Remove Help prefix repeated twice for each metric
  BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers
  BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without 
appctx
  BUG/MINOR: h1: Fail to parse empty transfer coding names
  BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value
  BUG/MEDIUM: h1: Reject empty Transfer-encoding header
  BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current 
thread
  BUG/MEDIUM: bwlim: Be sure to never set the analyze expiration date in 
past

Valentine Krasnobaeva (2):
  MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD
  DOC: configuration: update maxconn description

William Lallemand (6):
  REGTESTS: ssl: fix some regtests 'feature cmd' start condition
  DOC: configuration: fix alphabetical order of bind options
  DOC: configuration: add details about crt-store in bind "crt" keyword
  DOC: configuration: more details about the master-worker mode
  BUG/MINOR: jwt: don't try to load files with HMAC algorithm
  BUG/MINOR: jwt: fix variable initialisation

Willy Tarreau (4):
  DEV: flags/show-fd-to-flags: adapt to recent versions
  MINOR: activity: make the memory profiling hash size configurable at 
build time
  BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking
  Revert "MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD"

---



Re: [ANNOUNCE] haproxy-3.1-dev3

2024-07-11 Thread Willy Tarreau
Hi Lukas,

On Thu, Jul 11, 2024 at 12:17:53PM +0200, Lukas Tribus wrote:
> Hi,
> 
> I will get back to this for further research and discussion in about a week.

OK! In the mean time I'll revert the pending patches from 3.0 so that
we can issue 3.0.3 without them.

> In the meantime, do we agree that the environment we are developing the fix
> for is the following:
> 
> the hard limit is always set to the maximum available in the kernel which
> on amd64 is one billion with a B, whether the systems has 128M or 2T of
> memory is irrelevant.
> 
> You agree that this is the environment systemd sets us up with, right?

I'm having a doubt by not being certain I'm parsing the question correctly :-)

Just to rephrase the goal here, it's to make sure that when the service
is started with extreme (and absurd) limits, we don't use all of what is
offered but only a smaller subset that matches what was commonly encountered
till now.

So if we start with 1B fd regardless of the amount of RAM, we want to
limit to a lower value so as not to OOM for no reason. One FD takes
64 bytes. Starting haproxy with 1M FDs here takes ~80 MB, which I
consider not that extreme by todays sizes, and that in practice few
people have reported problems about. Of course establishing connections
on all of these will use way more memory but that already depends on
the traffic pattern and configuration (SSL front/back, etc).

My take on this limit is that most users should not care. Those dealing
with high loads have to do their homework and are used to doing this,
and those deploying in extremely small environments are also used to
adjusting limits (even sometimes rebuilding with specific options), and
I'm fine with leaving a bit of work for both extremities.

Willy



Re: [ANNOUNCE] haproxy-3.1-dev3

2024-07-10 Thread Willy Tarreau
Hi Lukas,

and first, many thanks for sharing your thoughts and opinions on this.

[ responding to both of your messages at once ]

On Wed, Jul 10, 2024 at 09:30:55PM +0200, Lukas Tribus wrote:
> On Wed, 10 Jul 2024 at 16:39, Willy Tarreau  wrote:
> >
> > Another change that will need to be backported after some time concerns
> > the handling of default FD limits.
(...)
> I wholeheartedly hate default implicit limits and I also pretty much
> disagree with fd-hard-limit in general, but allow me to quote your own
> post here from github issue #2043 comment
> https://github.com/haproxy/haproxy/issues/2043#issuecomment-1433593837

I don't like having had to deal with such limits for ~23 years now but
the facts is that it's one of the strict, non-bypassable, system-imposed
limits. The problem is that while the vast majority of users don't care
about the number of FDs, this value cannot be changed at runtime and does
have serious implications on RAM usage and even ou ability to accept
connections that we're engaging in processing cleanly. So in any case we
need to respect a limit, and for this we have to compose with what operating
systems are doing.

For decades they would present 1024 soft and more hard, but not that much
(e.g. 4k) and it was needed to start as root to go beyond. Then some OSes
started to expose much higher hard values by default (256k to 1M) so that
it was no longer requird to be root to start a service. During this time,
such limits were more or less tailored around RAM sizing. Now it seems
we're reaching a limit, with extreme values being advertised without any
relation with allocated RAM. I think that containers are part of the cause
of this.

> > we used to have a 2k maxconn limit for a very long time and it was causing
> > much more harm than such an error: the process used to start well and was
> > working perfectly fine until the day there was a big rush on the site and it
> > wouldn't accept more connections than the default limit. I'm not that much
> > tempted by setting new high default limits. We do have some users running
> > with 2+ million concurrent connections, or roughly 5M FDs. That's already 
> > way
> > above what most users would consider an acceptable default limit, and 
> > anything
> > below this could mean that such users wouldn't know about the setting and 
> > could
> > get trapped.
> 
> I disagree that we need to heuristically guess those values like I
> believe I said in the past.

My problem is that such limit *does* exist (and if you look at ulimit -a,
it's one of the rare ones that's never unlimited), we have to apply a
value, with too low one we reject traffic at the worst possible moments
(when there are the most possible witnesses of your site falling down)
and with too high one we cannot start anymore. Limits are imposed to the
process and it needs to work within.

> "But containers ..." should not be an argument to forgo the principle
> of least surprise.

I agree even though they're part of the problem (but no longer the only
one).

> There are ways to push defaults like this out if really needed: with
> default configuration files, like we have in examples/ and like
> distributions provide in their repositories. This default the users
> will then find in the configuration file and can look it up in the
> documentation if they want.

I'm not against encouraging users to find sane limits in the config
files they copy-paste all over the place. Just like I think that if
systemd starts to advertise very large values, we should probably
encourage to ship unit files setting the hard limit to 1M or so (i.e.
the previously implicit hard value presented to the daemon).

> At the very least we need a *stern* configuration warning that we now
> default to 1M fd, although I would personally consider this (lack of
> all fd-hard-limit, ulimit and global maxconn) leading to heuristic
> fd-hard-limit a critical error.

When Valentine, who worked on the patch, talked to me about the problem,
these were among the possibilities we thought about. I initially
disagreed with the error because I considered that having to set yet
another limit to keep your config running is quite a pain (a long time
ago users were really bothered by the relation between ulimit-n and
maxconn). But I was wrong on one point, I forgot that fd-hard-limit only
applies when maxconn/ulimit-n etc are not set, so that wouldn't affect
users who already set their values correctly.

> I also consider backporting this change - even with a configuration
> warning - dangerous.

I know, but we don't decide what distro users start their stable version
on :-/

> So here a few proposals:
> 
> Proposal 1:
> 
> - remove fd-hard-limit as it was a confusing mistake 

Re: [ANNOUNCE] haproxy-3.1-dev3

2024-07-10 Thread Willy Tarreau
On Wed, Jul 10, 2024 at 06:49:52PM +0200, Aleksandar Lazic wrote:
> 
> 
> On 2024-07-10 (Mi.) 16:39, Willy Tarreau wrote:
> > Hi,
> > 
> > HAProxy 3.1-dev3 was released on 2024/07/10. It added 35 new commits
> > after version 3.1-dev2.
> 
> [snipp]
> 
> > And I'm still trying to free some time for the pending reviews (I have not
> > forgotten you but stuff that depends on multiple persons cannot always
> > wait).
> 
> There is no hurry about the connect patch. I have created in the meantime
> another solution in rust. :-)
> 
> It's no the best code in the world but it solves my issue.
> https://github.com/git001/tls-proxy-tunnel/
> 
> So take your time for review.

Thanks for your kind understanding, Alex!
Willy



[ANNOUNCE] haproxy-3.1-dev3

2024-07-10 Thread Willy Tarreau
ic_cid module
  REORG: quic: remove quic_cid_trees reference from proto_quic
  MINOR: quic: add 2 BUG_ON() on datagram dispatch
  MINOR: quic: ensure quic_conn is never removed on thread affinity rebind
  MINOR: proto: extend connection thread rebind API
  BUG/MEDIUM: quic: prevent crash on accept queue full
  DEV: flags/quic: decode quic_conn flags
  MINOR: quic: rename "ssl error" trace

Christopher Faulet (7):
  BUG/MINOR: promex: Remove Help prefix repeated twice for each metric
  BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work with applet's buffers
  BUG/MEDIUM: peers: Fix crash when syncing learn state of a peer without 
appctx
  BUG/MINOR: h1: Fail to parse empty transfer coding names
  BUG/MINOR: h1: Reject empty coding name as last transfer-encoding value
  BUG/MEDIUM: h1: Reject empty Transfer-encoding header
  BUG/MEDIUM: spoe: Be sure to create a SPOE applet if none on the current 
thread

Frederic Lecaille (1):
  BUG/MINOR: quic: Wrong datagram building when probing.

Ilia Shipitsin (2):
  CI: add weekly QUIC Interop regression against LibreSSL
  CI: weekly QUIC Interop: try to fix private image

Valentine Krasnobaeva (3):
  MEDIUM: init: set default for fd_hard_limit via DEFAULT_MAXFD
  DOC: configuration: update maxconn description
  BUG/MEDIUM: init: fix fd_hard_limit default in compute_ideal_maxconn

William Lallemand (5):
  DOC: configuration: add details about crt-store in bind "crt" keyword
  DOC: configuration: more details about the master-worker mode
  BUG/MINOR: jwt: don't try to load files with HMAC algorithm
  BUG/MINOR: jwt: fix variable initialisation
  MINOR: ssl/sample: ssl_c_san returns a comma separated list of SAN

Willy Tarreau (6):
  BUG/MEDIUM: quic: fix possible exit from qc_check_dcid() without unlocking
  OPTIM: pool: improve needed_avg cache line access pattern
  MAJOR: import: update mt_list to support exponential back-off (try #2)
  BUILD: listener: silence a build warning about unused value without 
threads
  DOC: architecture: remove the totally outdated architecture manual
  SCRIPTS: create-release: no more need to skip architecture.txt

---



Re: [PATCH] CI: weekly QUIC Interop: try to fix private image

2024-07-10 Thread Willy Tarreau
On Tue, Jul 09, 2024 at 03:03:49PM +0200, Ilia Shipitsin wrote:
> for some reason image built in HAProxy workflow is "private", it
> is succesfully built, but fails to pull. Let's try explicit docker login
> for run job as well

Merged, thanks Ilya!
willy



Re: [PATCH] BUG/MINOR: tools: Ensure the generated UUIDv7 is monotonic

2024-07-09 Thread Willy Tarreau
Hi Tim!

On Tue, Jul 09, 2024 at 07:34:57PM +0200, Tim Düsterhus wrote:
> Hi
> 
> On 7/8/24 06:38, Willy Tarreau wrote:
> > Well, I must confess that while I understand what you tried to do, it's
> > less clear what is the problem you wanted to address. I think that your
> > concern is the possible lack of monotonicity, is that it ?
> 
> An example is probably best to explain the requirements. When I do the
> following:
> 
>   http-after-response set-header test1 %[uuid(7)]
>   http-after-response set-header test2 %[uuid(7)]
>   http-after-response set-header test3 %[uuid(7)]
>   http-after-response set-header test4 %[uuid(7)]
>   http-after-response set-header test5 %[uuid(7)]
>   http-after-response set-header test6 %[uuid(7)]
>   http-after-response set-header test7 %[uuid(7)]
>   http-after-response set-header test8 %[uuid(7)]
> 
> then the returned UUIDs need to be "ascending" as per RFC 9562#6.2. The
> current implementation does not guarantee that, because the embedded
> timestamp only has millisecond resolution and the rest is completely random.
> That means that the test6 UUID might be lower than the test2 UUID.

I initially read that the monotonicity was only required for the clock
part (because that's what initial implementations were seeking: being
able to keep caches hot and batch-purge elements without searching), but
it indeed seems they went a bit too far by requiring full monotonicity.
The problem is that it automatically excludes multi-node. There's even a
section (6.4) about multi-node that carefully avoids discussing anything
related to monotonicity since that's a well-known physically unsolvable
problem, it only discusses collision avoidance.

> RFC 9562 provides several options in section 6.2. Either the bits following
> the millisecond part are a simple counter that resets every millisecond,
> they are used to store addition clock precision (e.g. micro or nanoseconds),
> or the random part is not actually random and instead starts at a random
> value and is incremented by a random amount every time.

These ones are indeed solutions for a single-node generator. Normally,
by just starting a new random each new time and incrementing till next
time, you'll play in pseudo-random ranges and will not get any local
collision. But the problem in a distributed system is that the most
synchronized your clocks will be, the most likely a collision will
happen, since two nodes picking a relatively close number can easily
reach the other node's starting point, versus pure randoms that will
not do that (but are not monotonic).

> My patch chooses option (2): Storing additional clock precision.

The problem is that on many systems, often VMs but also various devices
which are typically used to host components such as haproxy, this extra
precision does not exist, or relies on non-monotonic sources, resulting
in the clock_gettime() call effectively returning discrete values just
differing by a counter. For example here on this system:

  $ ./a.out  | tail
  0002507749.074553230
  0002507749.074557730
  0002507749.074562230
  0002507749.074566730
  0002507749.074571230
  0002507749.074575730
  0002507749.074580230
  0002507749.074584730
  0002507749.074589230
  0002507749.074593730

See ? You're in fact having a fixed 500 ns counter. It's not that bad
because at least it changes, but on other setups you'd see something
stable.

The second problem with CLOCK_MONOTONIC is that its base is unspecified
and often is the system's uptime, like above. In short-lived VMs, you'll
be generating cycling UUIDs all starting with many zeroes and concentrated
on a tiny range. It's even worse in containers which can live even shorter
(a few seconds). In all such environments, CLOCK_MONOTONIC is a no-go.

> > In any case there are much less risks of collisions by filling bits
> > with random than with clock bits. Here if I read correctly, you're
> > only keeping 64 bits of random, which means that on average you'll
> > experience a collision every 2^32 generations. At 50k/s, it happens
> > once a day. With the 96 bits used previously, it will only happen
> > every 2^48, which is once every 178 years (which obviously reduces
> > in distributed systems).
> 
> My patch does the following:
> 
> 48 bits: Unix timestamp with milliseconds.
>  4 bits: UUID version.
> 12 bits: How many 256ns periods happened since the last millisecond?
>  2 bits: UUID variant.
> 62 bits: Random.
> 
> Indeed a collision with 64 bits of randomness is expected every 2^32 values,
> but you need to keep in mind that a collision will only be a collision if it
> happens in the same millisecond + nanosecond offset. So it actually is a
> comparison between:
> 
> 

Re: [PATCH] BUG/MINOR: tools: Ensure the generated UUIDv7 is monotonic

2024-07-07 Thread Willy Tarreau
Hi Tim,

On Sat, Jul 06, 2024 at 09:13:12PM +0200, Tim Duesterhus wrote:
> Willy,
> 
> I'm not particularly happy with out the patch turned out, but this is the
> best I could come up with. If you are also unhappy with it, then please
> consider this email to be a bug report and adjust my patch as necessary /
> fix it yourself. Unfortunately I don't currently have the time to spend
> further time on this. Of course if you are happy, then feel free to apply :-)
> 
> Best regards
> Tim Düsterhus
> 
> Apply with `git am --scissors` to automatically cut the commit message.
> 
> -- >8 --
> RFC 9562 specifies specifies that the top-most 48 bits of a UUIDv7 represent a
> unix timestamp with millisecond resolution, allowing a user to extract the
> timestamp. The remaining bits (except for version and variant) are left
> unspecified. However it still requires that generated UUIDv7 are monotonic.
> 
> Section 6.2 describes several methods to generate monotonic UUIDs. This patch
> employs "Method 3", filling the left-most bits "rnd_a" with increased clock
> precision, guaranteeing monotonicity up to a 244ns resolution.
> 
> UUIDv7 was added with HAProxy 3.0, this patch should be backported there.

Well, I must confess that while I understand what you tried to do, it's
less clear what is the problem you wanted to address. I think that your
concern is the possible lack of monotonicity, is that it ?

If so we already have a monotonic date (global_now_ns), it's just that the
offset applied at boot is lost. By keeping it we now have a monotonic date
across all threads (see attached patch).

Also there's no point increasing time precision, it's the opposite that
must be done when looking for monotonicity, even if that sounds counter
intuitive. The reason is that system clocks are not precise and that by
feeding more bits from time into the output, you end up preventing that
output from changing between multiple calls, while instead it could
contain a counter or a random. For example in VMs and some embedded
systems it's not rare at all to see the clock source set to "jiffies"
since the other ones are either totally unreliable or horribly slow
(e.g. emulating hardware), resulting in 1, 4 or 10ms time resolution.
And on systems not doing this you can see apparently good precision
that doesn't change between multiple calls, for various reasons, or
even time jumping backwards when using TSC on multi-processor systems
or VMs.

In any case there are much less risks of collisions by filling bits
with random than with clock bits. Here if I read correctly, you're
only keeping 64 bits of random, which means that on average you'll
experience a collision every 2^32 generations. At 50k/s, it happens
once a day. With the 96 bits used previously, it will only happen
every 2^48, which is once every 178 years (which obviously reduces
in distributed systems).

Finally, the purpose of having the timestamp in UUIDv7 (like ULID)
is to group simultaneous operations. They keep caches hot, they allow
events to be found grouped in logs, and offer easier purge mechanisms
for recycling entries (no longer need to look up, just apply a time
offset and eliminate all those below). In that sense there's no quest
for extreme precision (which cannot work on distributed systems anyway),
rather for a reasonably good and reasonably monotonic ordering that
still prevents collisions across multiple systems.

So if that's also what you're looking for, I'm proposing that we
instead change the use of the unreliable date variable with a call
to now_mono_date_ns() that we can then use as the date source, and
keep the rest of the randoms.

Just let me know what you think.

Thanks!
Willy
>From 8952b22c9209ea9c2c82bd645141beec968873a9 Mon Sep 17 00:00:00 2001
From: Willy Tarreau 
Date: Mon, 8 Jul 2024 06:01:59 +0200
Subject: MINOR: clock: provide a new monotonic date function

The global_now_ns variable is derived from the system date, corrected
for monotonicity and with an offset applied so that global_now_ms wraps
soon after boot.

A monotonic date has plenty of use cases (such as generating UUIDv7),
but for the sake of avoiding wrapping between process restarts, we must
drop the boot offset.

This commit adds a variable now_offset_boot that contains a copy of the
initial offset applied to global_now_ns, and a function now_mono_date_ns()
that subtracts it from global_now_ns so that the resulting value remains
as close as possible to system time more or less drift corrections.
---
 include/haproxy/clock.h |  2 ++
 src/clock.c | 12 
 2 files changed, 14 insertions(+)

diff --git a/include/haproxy/clock.h b/include/haproxy/clock.h
index 264363e27b..1064d430b0 100644
--- a/include/haproxy/clock.h
+++ b/include/haproxy/clock.h
@@ -28,6 +28,7 @@
 extern struct timeva

Re: [PATCH 1/1] CI: add weekly QUIC Interop regression against LibreSSL

2024-07-05 Thread Willy Tarreau
On Thu, Jul 04, 2024 at 04:26:34PM +0200, Ilia Shipitsin wrote:
> currently only quic-go and picoquic clients are enabled with testsuites
> supposed to be "green". Tests will be run weekly.

Now applied, let's give it a try. Thank you Ilya!
Willy



[ANNOUNCE] haproxy-3.1-dev2

2024-06-29 Thread Willy Tarreau
ment the ref count inside lock to kill a 
session
  MINOR: stick-table: Always decrement ref count before killing a session

Frederic Lecaille (1):
  BUILD: Missing inclusion header for ssize_t type

Valentine Krasnobaeva (9):
  MINOR: capabilities: export capget and __user_cap_header_struct
  MINOR: capabilities: prepare support for version 3
  MINOR: capabilities: use _LINUX_CAPABILITY_VERSION_3
  MINOR: cli/debug: show dev: add cmdline and version
  MINOR: cli/debug: show dev: show capabilities
  REORG: init: do MODE_CHECK_CONDITION logic first
  REORG: init: encapsulate CHECK_CONDITION logic in a func
  REORG: init: encapsulate 'reload' sockpair and master CLI listeners 
creation
  REORG: init: encapsulate code that reads cfg files

William Lallemand (6):
  REGTESTS: ssl: fix some regtests 'feature cmd' start condition
  BUG/MEDIUM: ssl: AWS-LC + TLSv1.3 won't do ECDSA in RSA+ECDSA 
configuration
  MINOR: ssl: activate sigalgs feature for AWS-LC
  REGTESTS: ssl: activate new SSL reg-tests with AWS-LC
      DOC: configuration: fix alphabetical order of bind options
  MINOR: sample: date converter takes HTTP date and output an UNIX timestamp

Willy Tarreau (4):
  DEV: flags/show-fd-to-flags: adapt to recent versions
  MINOR: debug: print gdb hints when crashing
  BUILD: debug: also declare strlen() in __ABORT_NOW()
  MINOR: activity: make the memory profiling hash size configurable at 
build time

---



Re: [ANNOUNCE] haproxy-3.1-dev1

2024-06-14 Thread Willy Tarreau
On Fri, Jun 14, 2024 at 04:12:03PM +0200, Christopher Faulet wrote:
> Hi,
> 
> HAProxy 3.1-dev1 was released on 2024/06/14. It added 95 new commits
> after version 3.1-dev0.
> 
> Because Willy announced publicly I should managed a -dev1 before his return
> from vacations, I have no choice. So, faced with so much pressure, here is
> the first 3.1 dev release.

Hehe, thank you!

Willy



Re: [PATCH 1/1] CI: VTest: accelerate package install a bit

2024-05-30 Thread Willy Tarreau
On Thu, May 30, 2024 at 04:12:02PM +0200, William Lallemand wrote:
> On Thu, May 30, 2024 at 03:40:31PM +0200, Ilia Shipitsin wrote:
> > Subject: [PATCH 1/1] CI: VTest: accelerate package install a bit
> > let's check and install only package is required
> > ---
> >  .github/workflows/vtest.yml | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/.github/workflows/vtest.yml b/.github/workflows/vtest.yml
> > index f862dc5a7..5208df757 100644
> > --- a/.github/workflows/vtest.yml
> > +++ b/.github/workflows/vtest.yml
> > @@ -82,10 +82,10 @@ jobs:
> >run: |
> >  sudo apt-get update
> >  sudo apt-get --no-install-recommends -y install \
> > -  liblua5.4-dev \
> > -  libpcre2-dev \
> > -  libsystemd-dev \
> > -  ninja-build \
> > +  ${{ contains(matrix.FLAGS, 'USE_LUA=1') && 'liblua5.4-dev'  
> > || '' }} \
> > +  ${{ contains(matrix.FLAGS, 'USE_PCRE2=1')   && 'libpcre2-dev'   
> > || '' }} \
> > +  ${{ contains(matrix.FLAGS, 'USE_SYSTEMD')   && 'libsystemd-dev' 
> > || '' }} \
> 
> In fact libsystemd-dev is not required anymore, we have our own sd_notify in 
> haproxy.

I thought we still needed it for the .h but apparently not, indeed.

Willy



Re: [PATCH 1/1] CI: VTest: accelerate package install a bit

2024-05-30 Thread Willy Tarreau
Hi Ilya,

On Thu, May 30, 2024 at 03:40:31PM +0200, Ilia Shipitsin wrote:
> +  ${{ contains(matrix.FLAGS, 'USE_LUA=1') && 'liblua5.4-dev'  || 
> '' }} \
> +  ${{ contains(matrix.FLAGS, 'USE_PCRE2=1')   && 'libpcre2-dev'   || 
> '' }} \
> +  ${{ contains(matrix.FLAGS, 'USE_SYSTEMD')   && 'libsystemd-dev' || 
> '' }} \
   ^^^ =1 ?
while I guess it should work, is this an overlook or is it intentional
that "=1" is not specified here ?

Thanks,
Willy



Re: [PATCH] FEATURE: Adding MPTCP with option to disable it and fall-back to TCP

2024-05-30 Thread Willy Tarreau
Hi Matthieu,

finally a bit more available again...

On Fri, Apr 26, 2024 at 06:34:02PM +0200, Matthieu Baerts wrote:
> > I *am* interested in the feature, which has been
> > floating around for a few years already. However I tend to agree with
> > Nicolas that, at least for the principle of least surprise, I'd rather
> > not have this turned on by default. There might even be security
> > implications that some are not willing to take yet, and forcing them
> > to guess that they need to opt out of something is not great in general
> > (it might change in the future though, once everyone turns it on by
> > default, like we did for keep-alive, threads, and h2 for example).
> 
> It makes sense as well! I initially suggested to Dorian to first send a
> patch where MPTCP is enabled by default because of the low impact (and
> also after having seen an old comment on that topic [1]). But indeed,
> doing that in steps sounds safer.
> 
> [1] https://github.com/haproxy/haproxy/issues/1028#issuecomment-754560146

Yeah I understand, it would essentially depend on how all of this evolves
over time and how adoption grows for mptcp. Maybe one day most users will
expect it to work by default.

> > I'm also concerned by the change at the socket layer that will make all
> > new sockets cause two syscalls on systems where this is not enabled,
> 
> I thought the listening sockets were created only once at startup and
> having two syscalls would have been fine. I guess it should be easy to
> remember the failure the first time, to avoid the extra syscalls for the
> next listening sockets.

What you say is true for the listeners, but the socket() call is also
used when connecting to a backend server (though maybe I missed something
during the review).

> > and I'd really really prefer that we use the extended syntax for
> > addresses that offers a lot of flexibility. We can definitely have
> > "mptcp@address" and probably mptcp as a variant of tcp.
> 
> >From what I understood, Dorian is now looking at that. It is not clear
> if he should create a new proto (struct protocol) or modifying it after
> having called protocol_lookup() in str2sa_range().

I *tend* to think that a new struct protocol is easier to add and
maintain. It could just share most (if not all) of the functions
with tcp, and probably be declared in the same file for ease of
sharing code.

I'm seeing that in the comment you linked above I also proposed a
keyword for "bind" and "server" lines, like we have "tfo" to enable
TCP fast-open. It tends to be slightly easier to implement but then
requires more care everywhere because such options do no apply to
all protocols, so if you have "mptcp on" on a "bind quic4@:443" line,
that's quite confusing so it deserves extra checks to make sure it
is not silently ignored. Also I do have some doubts about how to
retrieve the source address, maybe we'll find that in fact it should
be seen as a new address family and not a new transport layer. But I
think not at this point. And BTW Björn had apparently implemented a
solution based on mptcp@ as well so it's likely workable.

> Can MPTCP be used as a variant of "rhttp" as well?

I don't see how it should be related. The rhttp part is always tricky,
but the concept is that we accept a regular connection on a regular
frontend, then based on an explicit rule, return it and assign it as
an idle conn to a server. And for the server setting up pre-connected
idle conns to the gateway, it's just a regular address as well and the
protocol should not be involved there.

> > Regarding how
> > we'd deal with the fallback, we'll see, but overall, thinking that
> > would explicitly configure mptcp and not even get an error if
> > it cannot bind is problematic to me, because among the most common
> > causes of trouble are things that everyone ignores (module not loaded,
> > missing secondary IP, firewall rules etc) that usually cause problems.
> > Those we can discover at boot time should definitely be reported.
> 
> With the v1 Dorian suggested where MPTCP is enabled by default, a
> warning is reported if it is not possible to create an MPTCP socket, and
> a "plain" TCP one has been created instead.
> 
> If MPTCP is explicitly asked, I guess it would make sense to fail if it
> is not possible to create an MPTCP socket, no?

Yes I agree. I anticipate that if the solution eventually becomes
successful, some users will then want a 3-state setting: always-on,
always-off, best-effort. But let's not needlessly complexify the design
for now.

> > But
> > let's discuss this in early June instead (I mean, feel free to discuss
> > the points together, but I'm not going to invest more time on this at
> > this moment).
> 
> Thank you!
> 
> Please note that Dorian is doing this as part of a work with his uni
> (useful contributions to Open-Source), and I think he would prefer
> sending the v2 before June, if that's OK. I can look at rebasing it
> later if needed. If there is no need to implement a fallb

Re: [PATCH v2] FEATURE: add opt-in MPTCP support

2024-05-30 Thread Willy Tarreau
Hi Dorian,

I'm now done with the release and having more time to read your
work. First, thanks for this update. I understand that you're almost
running out of time on this topic which must be completed before
June so I'm not going to make you waste your time. Some comments
below.

On Thu, May 16, 2024 at 03:53:40PM +0200, Dorian Craps wrote:
> From: Dorian Craps 
> 
> Multipath TCP (MPTCP), standardized in RFC8684 [1], is a TCP extension
> that enables a TCP connection to use different paths.
> 
> Multipath TCP has been used for several use cases. On smartphones, MPTCP
> enables seamless handovers between cellular and Wi-Fi networks while
> preserving established connections. This use-case is what pushed Apple
> to use MPTCP since 2013 in multiple applications [2]. On dual-stack
> hosts, Multipath TCP enables the TCP connection to automatically use the
> best performing path, either IPv4 or IPv6. If one path fails, MPTCP
> automatically uses the other path.
> 
> To benefit from MPTCP, both the client and the server have to support
> it. Multipath TCP is a backward-compatible TCP extension that is enabled
> by default on recent Linux distributions (Debian, Ubuntu, Redhat, ...).
> Multipath TCP is included in the Linux kernel since version 5.6 [3]. To
> use it on Linux, an application must explicitly enable it when creating
> the socket. No need to change anything else in the application.
> 
> This attached patch adds the "mptcp" global option in the config, which
> allows the creation of an MPTCP socket instead of TCP on Linux. If
> Multipath TCP is not supported on the system, an error will be reported,
> and the application will stop.
> 
> A test has been added, it is a copy of "default_rules.vtc" in tcp-rules
> with the addition of "mptcp" in the config. I'm not sure what else needs
> to be tested for the moment, with this global MPTCP option.
> 
> Note: another patch is coming to support enabling MPTCP per address, but
> I prefer to already send this patch, just in case, as I will soon have
> less time to dedicate to this.

Thanks for this. As I previously mentioned, I'm not going to merge a
patch with a global option that does all or nothing, however having a
working patch like this can definitely serve as a proof-of-concept and
a reference when testing a more granular approach, and in this regard
your patch is much welcome!

> Due to the limited impact within a data center environment,
> we have decided not to implement MPTCP between the proxy and the servers.
> The high-speed, low-latency nature of data center networks reduces
> the benefits of MPTCP, making the complexity of its implementation
> unnecessary in this context.

I definitely understand. However let's keep in mind that whenever we
envision a use case, someone will come with a different one and suggest
that we should support it. Here I can imagine that some CDN operators
might be interested in using MPTCP to the origin server as well by
connecting via multiple paths. This opens a big can of worms with
questions about source address binding, interface binding etc, which
cannot be addressed easily as they'll all have impacts on the way
servers are configured. So I'm totally fine with having only a frontend
support in a first time. I just mentioned this to give you more context
and so that you know it's not only the DC that's on the backend, even
if it's by far the most common, and what some of the complex aspects
can be.

I'll be in vacation for two weeks at the end of this week, will you need
some review of a possible next patch, or did you give up due to some
difficulties you faced while exploring the per-listener configuration ?

I'll give you a few comments on the patch below:

> diff --git a/src/cfgparse-global.c b/src/cfgparse-global.c
> index b173511c9..0feccd4b2 100644
> --- a/src/cfgparse-global.c
> +++ b/src/cfgparse-global.c
> @@ -23,6 +23,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  
>  int cluster_secret_isset;
> @@ -52,7 +54,7 @@ static const char *common_kw_list[] = {
>   "presetenv", "unsetenv", "resetenv", "strict-limits", "localpeer",
>   "numa-cpu-mapping", "defaults", "listen", "frontend", "backend",
>   "peers", "resolvers", "cluster-secret", "no-quic", "limited-quic",
> - NULL /* must be last */
> + "mptcp", NULL /* must be last */
>  };

Style only: better leave the NULL alone on its line so that it remains
distinct from the rest.

>  /*
> @@ -1334,6 +1336,23 @@ int cfg_parse_global(const char *file, int linenum, 
> char **args, int kwm)
>   HA_ATOMIC_STORE(&global.anon_key, tmp);
>   }
>   }
> + else if (strcmp(args[0], "mptcp") == 0) {
> + if (alertif_too_many_args(0, file, linenum, args, &err_code))
> + goto out;
> +#ifdef __linux__
> +#ifndef IPPROTO_MPTCP
> +#define IPPROTO_MPTCP 262
> +#endif

This would be better placed in include/haproxy/compat.h (there are
already other ones th

Re: [PATCH 0/3] CI: preparation for Ubuntu 24.04

2024-05-29 Thread Willy Tarreau
On Wed, May 29, 2024 at 09:59:13PM +0200, Ilia Shipitsin wrote:
> GitHub has launched Ubuntu 24.04 runners in beta. 
> While runners are not yet stable, switching to them
> has shown some inconsistance in pipeline which is better
> to be resolved before actual upgrade to Ubuntu 24.04
> 
> Ilia Shipitsin (3):
>   CI: use "--no-install-recommends" for apt-get
>   CI: switch to lua 5.4
>   CI: use USE_PCRE2 instead of USE_PCRE

Whole series applied, thank you Ilya!
Willy



Re: [PATCH 1/2] REGTESTS: Remove REQUIRE_VERSION=2.1 from all tests

2024-05-29 Thread Willy Tarreau
On Wed, May 29, 2024 at 07:55:32PM +0200, Tim Duesterhus wrote:
> HAProxy 2.2 is the lowest supported version, thus this always matches.

(...)

Both patches applied, thank you Tim!
Willy



Re: [ANNOUNCE] haproxy-3.0.0

2024-05-29 Thread Willy Tarreau
Hi Tim,

On Wed, May 29, 2024 at 07:48:10PM +0200, Tim Düsterhus wrote:
> Hi
> 
> On 5/29/24 17:07, Willy Tarreau wrote:
> > HAProxy 3.0.0 was released on 2024/05/29.
> 
> Congratulations on the successful release!

Thanks!

> I've just opened a PR for the "Official Docker Images" to add HAProxy 3.1:
> https://github.com/docker-library/haproxy/pull/234

OK!

> And of course it wouldn't be a real release, without me pointing out some
> things here and there :-)

It's because you stress me :-)

> - I've made a little change for 2.9 on the landing page of the docs:
> https://github.com/haproxy/docs/commit/94ec86972c6676b132954c0b4d629ca4287ae449

Yes, I saw that, thank you!

> - The version table on haproxy.org still has the EOL column for 2.0 in bold.
> Other EOL versions are not bold, so that's inconsistent.

Ah, that makes sense, you're right. Now fixed!

> Enjoy your vacation!

I'll try ;-)

Willy



[ANNOUNCE] haproxy-3.0.0

2024-05-29 Thread Willy Tarreau
 counters are also dumped if requested now (frontend, backend,
listener, server).

I'm fairly certain that I forgot a few things. As usual, I'm told that my
coworkers at HAProxyTech also went through this tedious task of enumerating
the changes, and it will be posted soon here:

  https://www.haproxy.com/blog/announcing-haproxy-3-0

My understanding is that there will be some followups with a focus on
selected points. I'm not surprised by the difficulty of the exercise
this time ;-)

For this version, we've got an increased help from various testers who
accepted to run one (or a few) servers with the development version, and
who were able to report a few problems with accurate version ranges, as
well as traces and info that permitted to fix the issues quickly. It
worked amazingly well and allowed us to address some nasty bugs that are
fairly hard to reproduce and that were present for several versions
already. At the risk of repeating myself, thanks for that! I know that
operating a -dev version requires a bit more involvement than a stable one
but it's also a win-win: when something doesn't please you, it's not too
late to suggest a change, and you can benefit from the latest debugging
features and performance improvements. I sincerely hope that this success
will encourage other users into that direction. The nice benefit for the
user of facing a bug in -dev vs -stable is that we have no problem
developing new debugging extensions just for that issue, so a git pull is
enough to suddenly make the problem much more observable and require less
amount of work to filter data than with a stable version. And something
that's human is that developers tend to be much more attracted by issues
affecting areas that are still fresh in their heads and will tend to treat
them with higher priority.

I also noticed more exchanges from various participants on the issues
and here on the list, so big thanks as well to those who take time to
review other users' problem reports and requests for help. Especially
for first-time reporters, it gives them a great experience of the
project and its community.

As usual with a new major release comes the death of an old one. This time
it's 2.0 that passed away after 5 years serving as a transition between
the old legacy versions and the newer HTX-enabled ones. I'm fairly sure
there are still some here and there, so please consider this as a reminder
that it's about time to upgrade. And 2.4 turned to critical fixes only
status.

On a side note (not very funny but surprising), apparently there was a big
GitHub outage last night, and this morning we're getting a "Ooops 500" page
on the haproxy repository there: https://github.com/haproxy/haproxy
The issues seem to be working, the wiki and docs projects as well. So I
suspect that an error page got cached during the outage and continues to
be delivered for whetever reason. I opened a ticket to their support and
we'll see when we get a response. Fortunately we're not completely blocked,
but it feels strange to release on a day of outage. After all, that's a
form of resilience that also makes one use a load balancer, so there's
some logic there.

Speaking of resilience, I'm going to take a bit of vacation next week and
the week after (maybe I should have postponed given the heavy rain here),
but you're in good hands with the rest of the team, and Christopher is
back on Monday, fresh an in full force. Maybe you'll even manage to
convince him to emit -dev1 himself, who knows :-)

Please find the usual URLs below :
   Site index   : https://www.haproxy.org/
   Documentation: https://docs.haproxy.org/
   Wiki : https://github.com/haproxy/wiki/wiki
   Discourse: https://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Sources  : https://www.haproxy.org/download/3.0/src/
   Git repository   : https://git.haproxy.org/git/haproxy-3.0.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy-3.0.git
   Changelog: https://www.haproxy.org/download/3.0/src/CHANGELOG
   Dataplane API: 
https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs: https://www.haproxy.org/l/reviewed-bugs
   Code reports : https://www.haproxy.org/l/code-reports
   Latest builds: https://www.haproxy.org/l/dev-packages

I verified what I had in mind for 3.0 and 3.1-dev0 (that just opened),
and I think all is good (Tim already fixed an incorrect color on the
docs index). As usual, if (should I say when?) you detect a broken link,
just let me know so I can fix it.

Have fun!
Willy
---
Complete changelog from 3.0-dev13:
Amaury Denoyelle (2):
  DOC: streamline http-reuse and connection naming definition
  REGTESTS: complete http-reuse

[ANNOUNCE] haproxy-3.0-dev13

2024-05-24 Thread Willy Tarreau
code is only changed if a significant regression is found (i.e.
we avoid last-minute breakage). Also, I know well enough that it's
sufficient to say that a version is the last -dev for about everyone to
skip it and wait for the final one! That's one more reason for not waiting
too long after it and not modifying it too much.

Please find the usual URLs below :
   Site index   : https://www.haproxy.org/
   Documentation: https://docs.haproxy.org/
   Wiki : https://github.com/haproxy/wiki/wiki
   Discourse: https://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Sources  : https://www.haproxy.org/download/3.0/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog: https://www.haproxy.org/download/3.0/src/CHANGELOG
   Dataplane API: 
https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs: https://www.haproxy.org/l/reviewed-bugs
   Code reports : https://www.haproxy.org/l/code-reports
   Latest builds: https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (20):
  BUG/MINOR: connection: parse PROXY TLV for LOCAL mode
  BUG/MINOR: server: free PROXY v2 TLVs on srv drop
  MINOR: rhttp: add log on connection allocation failure
  BUG/MEDIUM: rhttp: fix preconnect on single-thread
  BUG/MINOR: rhttp: prevent listener suspend
  BUG/MINOR: rhttp: fix task_wakeup state
  MINOR: session: define flag to explicitely release listener on free
  MEDIUM: rhttp: create session for active preconnect
  MINOR: rhttp: support PROXY emission on preconnect
  MINOR: connection: support PROXY v2 TLV emission without stream
  BUILD: trace: fix warning on null dereference
  MEDIUM: config: prevent communication with privileged ports
  MAJOR: config: prevent QUIC with clients privileged port by default
  BUG/MINOR: quic: adjust restriction for stateless reset emission
  MINOR: quic: clarify doc for quic_recv()
  MINOR: server: generalize sni expr parsing
  MINOR: server: define pool-conn-name keyword
  MEDIUM: connection: use pool-conn-name instead of sni on reuse
  BUG/MINOR: rhttp: initialize session origin after preconnect reversal
  DOC: quic: specify that connection migration is not supported

Aurelien DARRAGON (11):
  BUG/MINOR: ring: free ring's allocated area not ring's usable area when 
using maps
  DEBUG: tools: add vma_set_name() helper
  DEBUG: shctx: name shared memory using vma_set_name()
  DEBUG: sink: add name hint for memory area used by memory-backed sinks
  DEBUG: pollers: add name hint for large memory areas used by pollers
  DEBUG: errors: add name hint for startup-logs memory area
  DEBUG: fd: add name hint for large memory areas
  CLEANUP: tools: fix vma_set_name() function comment
  DEBUG: tools: add vma_set_name_id() helper
  DEBUG: pollers/fd: add thread id suffix to per-thread memory areas name 
hints
  BUG/MEDIUM: server/dns: preserve server's port upon resolution timeout or 
error

Christopher Faulet (8):
  BUG/MINOR: http-ana: Don't crush stream termination condition on internal 
error
  MAJOR: spoe: Let the SPOE back into the game
  BUG/MEDIUM: mux-quic: Create sedesc in same time of the QUIC stream
  MINOR: mux-quic: Set abort info for SC-less QCS on STOP_SENDING frame
  BUG/MEDIUM: stick-tables: Fix race with peers when trashing oldest entries
  BUG/MEDIUM: stick-tables: Fix race with peers when killing a sticky 
session
  BUG/MINOR: http-htx: Support default path during scheme based 
normalization
  BUG/MINOR: server: Don't reset resolver options on a new default-server 
line

Frederic Lecaille (1):
  BUG/MAJOR: quic: Crash with TLS_AES_128_CCM_SHA256 (libressl only)

Ilia Shipitsin (1):
  CI: scripts/build-ssl.sh: loudly fail on unsupported platforms

Valentine Krasnobaeva (4):
  BUG/MEDIUM: proto: fix fd leak in _connect_server
  MINOR: sock: set conn->err_code in case of EPERM
  BUG/MINOR: sock: fix sock_create_server_socket
  MINOR: proto: fix coding style

William Lallemand (9):
  CLEANUP: ssl/cli: remove unused code in dump_crtlist_conf
  MINOR: ssl: check parameter in ckch_conf_cmp()
  DOC: configuration: rework the crt-store load documentation
  MEDIUM: ssl: don't load file by discovering them in crt-store
  DOC: configuration: update the crt-list documentation
  DOC: configuration: add the supported crt-store options in crt-list
  REGTESTS: scripts: allow to change the vtest timeout
  CI: scripts/build-ssl: add a DESTDIR and TMPDIR variable
  CI: scripts/buil-ssl: cleanup the boringssl and quictls bui

Re: [PATCH v2] MINOR: config: rhttp: Don't require SSL when attach-srv name parsing

2024-05-24 Thread Willy Tarreau
On Thu, May 23, 2024 at 03:58:45PM +0100, William Manley wrote:
> I can also report that I no longer need to avoid `nbthread 1` in the config
> on the node.  Presumably thanks to ceebb09744df367ad84586a341d9336f84f72bce
> "rhttp: fix preconnect on single-thread".

BTW keep in mind that connections may only be shared between threads
when they're idle; a connection with a request in flight on it will be
pinned to the thread that sent this request, and only other requests
coming from the same thread will be able to share this connection.

This means that you need to have enough pre-established connections to
be sure that in no case you have a thread that ends up with no available
connection. As such, if you want to keep the number of connections low,
it can make sense to limit the number of threads.

Willy



[ANNOUNCE] haproxy-3.0-dev12

2024-05-18 Thread Willy Tarreau
r ckch_conf
  MEDIUM: ssl: add ocsp-update.disable global option
  MEDIUM: ssl/cli: handle crt-store keywords in crt-list over the CLI
  MINOR: ssl: ckch_conf_cmp() compare multiple ckch_conf structures
  MEDIUM: ssl: temporarily load files by detecting their presence in 
crt-store
  REGTESTS: ocsp-update: change the reg-test to support the new crt-store 
mode

William Manley (1):
  MINOR: rhttp: Don't require SSL when attach-srv name parsing

Willy Tarreau (9):
  BUG/MEDIUM: htx: mark htx_sl as packed since it may be realigned
  BUG/MEDIUM: stick-tables: properly mark stktable_data as packed
  SCRIPTS: run-regtests: fix a few occurrences of extended regexes
  BUG/MINOR: ssl_sock: fix xprt_set_used() to properly clear the 
TASK_F_USR1 bit
  MINOR: dynbuf: provide a b_dequeue() variant for multi-thread
  BUG/MEDIUM: muxes: enforce buf_wait check in takeover()
  BUILD: stick-tables: better mark the stktable_data as 32-bit aligned
  CLEANUP: compat: make the MIN/MAX macros more reliable
  Revert: MEDIUM: evports: permit to report multiple events at once"

---



Re: [PATCH] DOC: Update UUID references to RFC 9562

2024-05-15 Thread Willy Tarreau
On Sun, May 12, 2024 at 05:08:34PM +0200, Tim Duesterhus wrote:
> When support for UUIDv7 was added in commit
> aab6477b67415c4cc260bba5df359fa2e6f49733
> the specification still was a draft.
> 
> It has since been published as RFC 9562.

Excellent timing ;-)

Now merged, thank you Tim!
Willy



[ANNOUNCE] haproxy-3.0-dev11

2024-05-10 Thread Willy Tarreau
Hi,

HAProxy 3.0-dev11 was released on 2024/05/10. It added 43 new commits
after version 3.0-dev10.

Another very calm release, thanks to two days off here in France,
combined with the return of the sun today :-)

Some 3.0 regressions were fixed, such as the broken logs over TCP
after recent ring changes, and a QUIC crash when sending STOP_SENDING
after the stream above was detached.

The RST_STREAM reason code can finally be forwarded to the server for
client aborts. This addresses the problem a few users were facing with
gRPC where the abort reason was not respected. For now this is purposely
limited to only a few reason codes that are relevant to gRPC so that we
don't ruin the possibility to later extend that to H3 and maybe H1.

We also managed to address the design issue around the buffer_wait code
(the stuff that queues tasks waiting for a buffer on low memory). Now
tasks are queued by criticality depending on what buffer is missing, so
as to guarantee forward progress. It works pretty well for H1 and applets
already so that convinced me that we should finish it instead of dropping
all that. H2 and a few other places still need to be slightly revisited,
and I'd like to also merge a simplification that I experimented with that
allows the scheduler to deal itself with waiters instead of having every
user think about calling offer_buffers() when leaving. For now that part
was put to pause because doing it revealed another recent hard-to-trigger
regression that we'll work on early next week.

Finally, the build on Illumos was fixed, the log format options are now
processed at boot time to improve runtime processing performance, and as
usual, a few cleanups, doc and CI updates were included.

Overall this is getting good and I think we're on the right track for a
release around the end of the month.

Please find the usual URLs below :
   Site index   : https://www.haproxy.org/
   Documentation: https://docs.haproxy.org/
   Wiki : https://github.com/haproxy/wiki/wiki
   Discourse: https://discourse.haproxy.org/
   Slack channel: https://slack.haproxy.org/
   Issue tracker: https://github.com/haproxy/haproxy/issues
   Sources  : https://www.haproxy.org/download/3.0/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog: https://www.haproxy.org/download/3.0/src/CHANGELOG
   Dataplane API: 
https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs: https://www.haproxy.org/l/reviewed-bugs
   Code reports : https://www.haproxy.org/l/code-reports
   Latest builds: https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (1):
  BUG/MEDIUM: mux-quic: fix crash on STOP_SENDING received without SD

Aurelien DARRAGON (3):
  OPTIM: log: resolve logformat options during postparsing
  BUG/MEDIUM: log/ring: broken syslog octet counting
  DOC: lua: fix filters.txt file location

Christopher Faulet (8):
  MEDIUM: stconn/muxes: Add an abort reason for SE shutdowns on muxes
  MINOR: mux-h2: Set the SE abort reason when a RST_STREAM frame is received
  MEDIUM: mux-h2: Forward h2 client cancellations to h2 servers
  MINOR: mux-quic: Set tha SE abort reason when a STOP_SENDING frame is 
received
  MINOR: stconn: Add samples to retrieve about stream aborts
  MINOR: mux-quic: Add .ctl callback function to get info about a mux 
connection
  MINOR: muxes: Add ctl commands to get info on streams for a connection
  MINOR: connection: Add samples to retrieve info on streams for a 
connection

Ilia Shipitsin (3):
  BUILD: clock: improve check for pthread_getcpuclockid()
  CI: add Illumos scheduled workflow
  CI: netbsd: limit scheduled workflow to parent repo only

Patrick Hemmer (3):
  REGTEST: add tests for acl() sample fetch
  BUG/MINOR: acl: support built-in ACLs with acl() sample
  BUG/MINOR: cfgparse: use curproxy global var from config post validation

Valentine Krasnobaeva (1):
  BUG/MINOR: haproxy: only tid 0 must not sleep if got signal

Willy Tarreau (24):
  MINOR: dynbuf: pass a criticality argument to b_alloc()
  MINOR: dynbuf: add functions to help queue/requeue buffer_wait fields
  MINOR: dynbuf: use the b_queue()/b_requeue() functions everywhere
  MEDIUM: dynbuf: make the buffer_wq an array of list heads
  CLEANUP: tinfo: better align fields in thread_ctx
  MINOR: dynbuf: provide a b_dequeue() function to detach a bw from the 
queue
  MEDIUM: dynbuf: generalize the use of b_dequeue() to detach buffer_wait
  MEDIUM: dynbuf/stream: re-enable queueing upon failed buffer allocation
  MEDIUM: dynbuf/stream: do not allocate the buffers in the callback
  MEDIUM: applet: make appctx_buf_available() only wake the applet up, not 

Re: error HAproxy with Galera Cluster v4

2024-05-10 Thread Willy Tarreau
Hello,

On Fri, May 10, 2024 at 12:00:17PM +, Iglesias Paz, Jaime wrote:
> Hey guys, I have a problem with HAProxy and Galera Cluster v4 MySQL (3 
> nodes). I boot the HAProxy server and it returns the following error:
> 
> may 10 13:48:20 phaproxysql1 haproxy[661]: Proxy stats started.
> may 10 13:48:20 phaproxysql1 haproxy[661]: Proxy stats started.
> may 10 13:48:20 phaproxysql1 haproxy[661]: [NOTICE] 130/134820 (661) : New 
> worker #1 (663) forked
> may 10 13:48:20 phaproxysql1 systemd[1]: Started HAProxy Load Balancer.
> may 10 13:48:20 phaproxysql1 haproxy[663]: [WARNING] 130/134820 (663) : 
> Server galeramanagerprd/nodo1prd is DOWN, reason: Layer7 wrong status, code: 
> 1129, info: "Host 'X' is blocked because of many connection errors; 
> unblock>
> may 10 13:48:21 phaproxysql1 haproxy[663]: [WARNING] 130/134821 (663) : 
> Server galeramanagerprd/nodo2prd is DOWN, reason: Layer7 wrong status, code: 
> 1129, info: "Host 'X' is blocked because of many connection errors; 
> unblock>
> may 10 13:48:21 phaproxysql1 haproxy[663]: [WARNING] 130/134821 (663) : 
> Server galeramanagerprd/nodo3prd is DOWN, reason: Layer7 wrong status, code: 
> 1129, info: "Host '' is blocked because of many connection errors; 
> unblock>
> may 10 13:48:21 phaproxysql1 haproxy[663]: [NOTICE] 130/134821 (663) : 
> haproxy version is 2.2.9-2+deb11u6
> may 10 13:48:21 phaproxysql1 haproxy[663]: [NOTICE] 130/134821 (663) : path 
> to executable is /usr/sbin/haproxy
> may 10 13:48:21 phaproxysql1 haproxy[663]: [ALERT] 130/134821 (663) : proxy 
> 'galeramanagerprd' has no server available!
> 
> The haproxy.cfg configuration file:
> 
> defaults
> log global
> modehttp
> option  httplog
> option  dontlognull
> timeout connect 5000
> timeout client  5
> timeout server  5
> errorfile 400 /etc/haproxy/errors/400.http
> errorfile 403 /etc/haproxy/errors/403.http
> errorfile 408 /etc/haproxy/errors/408.http
> errorfile 500 /etc/haproxy/errors/500.http
> errorfile 502 /etc/haproxy/errors/502.http
> errorfile 503 /etc/haproxy/errors/503.http
> errorfile 504 /etc/haproxy/errors/504.http
> 
> listen galeramanagerprd
> bind *:3306
> balance source
> mode tcp
> #option tcplog
> option tcpka
> option mysql-check user haproxy
> server nodo1prd X:3306 check weight 1
> server nodo2prd X:3306 check weight 1
> server nodo3prd X:3306 check weight 1
> 
> 
> (*) for security I change the IPs to X
> 
> Reviewing the documentation I can't find where the problem may be.

That reminds me of something a long time ago, where there was a limit on
the number of check a mysql server would accept from a same IP address,
and it was necessary to change the setting to unlimited. I don't remember
the details but there was something to do using some insert commands. I
don't know if this is still needed after all these years, but the error
message strongly suggests something like this.

Willy



Re: [PATCH] FEATURE: Adding MPTCP with option to disable it and fall-back to TCP

2024-05-08 Thread Willy Tarreau
On Wed, May 08, 2024 at 01:19:22PM +, Dorian Craps wrote:
> first of all, thank you for your interest.
> 
> I already made a version with an option to enable MPTCP
> -https://github.com/CrapsDorian/haproxy/pull/1
> 
> I'm working on a new version with "mptcp@address" as Willy requested.

OK, thank you Dorian. I'm sorry not to have more time to assign to
review your work at the moment, it's really a matter of bad timing :-/

Willy



Re: [PR] fix show-sess-to-flags.sh cob fd state

2024-05-06 Thread Willy Tarreau
Hi!

On Tue, May 07, 2024 at 02:23:02AM +, PR Bot wrote:
> Author: zhibin.zhu 
> Number of patches: 1
> 
> This is an automated relay of the Github pull request:
>fix show-sess-to-flags.sh cob fd state
(...)

> From 95be08c6f4f382ec1b0e34765d4c1f09ddcdebb6 Mon Sep 17 00:00:00 2001
> From: "zhibin.zhu" 
> Date: Wed, 1 May 2024 18:51:26 +0800
> Subject: [PATCH] fix show-sess-to-flags.sh cob fd state
> 
> ---
>  dev/flags/show-sess-to-flags.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/dev/flags/show-sess-to-flags.sh b/dev/flags/show-sess-to-flags.sh
> index 79003a4f25d8..b352709f3a39 100755
> --- a/dev/flags/show-sess-to-flags.sh
> +++ b/dev/flags/show-sess-to-flags.sh
> @@ -195,7 +195,7 @@ while read -r; do
>  ! [[ "$REPLY" =~ [[:blank:]]h2c.*\.flg=([0-9a-fx]*) ]] || 
> append_flag b.h2c.flg   h2c  "${BASH_REMATCH[1]}"
>  elif [ $ctx = cob ]; then
>  ! [[ "$REPLY" =~ [[:blank:]]flags=([0-9a-fx]*) ]]  || 
> append_flag b.co.flgconn "${BASH_REMATCH[1]}"
> -! [[ "$REPLY" =~ [[:blank:]]fd.state=([0-9a-fx]*) ]]   || 
> append_flag b.co.fd.st  fd   "${BASH_REMATCH[1]}"
> +! [[ "$REPLY" =~ [[:blank:]]fd.state=([0-9a-fx]*) ]]   || 
> append_flag b.co.fd.st  fd   0x"${BASH_REMATCH[1]#0x}"
>  elif [ $ctx = res ]; then
>  ! [[ "$REPLY" =~ [[:blank:]]\(f=([0-9a-fx]*) ]]|| 
> append_flag res.flg chn  "${BASH_REMATCH[1]}"
>  ! [[ "$REPLY" =~ [[:blank:]]an=([0-9a-fx]*) ]] || 
> append_flag res.ana ana  "${BASH_REMATCH[1]}"

Hmm why, what is the problem it tries to solve ? Maybe we don't always
emit the 0x on the output ? Please try to provide a bit of info that
can be added to the commit message.

Thanks!
Willy



Re: [PATCH 1/3] BUILD: illumos: pthread_getcpuclockid is not available

2024-05-05 Thread Willy Tarreau
On Sun, May 05, 2024 at 01:43:33PM +0200,  ??? wrote:
> updated patches.

Cool, thanks, now applied.

> I'll address reorg to "compat.h" a bit later, once it is settled in my head

No worries, I've seen your other comment about the need to include
pthread.h, and this alone would be a good reason for leaving the test
where it is for now.

Willy



Re: [PATCH 1/3] BUILD: illumos: pthread_getcpuclockid is not available

2024-05-05 Thread Willy Tarreau
On Sun, May 05, 2024 at 11:15:24AM +0200,  ??? wrote:
> ??, 5 ??? 2024 ?. ? 10:42, Willy Tarreau :
> 
> > On Sun, May 05, 2024 at 09:12:41AM +0200, Miroslav Zagorac wrote:
> > > On 05. 05. 2024. 08:32, Willy Tarreau wrote:
> > > > On Sun, May 05, 2024 at 07:49:55AM +0200,  ??? wrote:
> > > >> ??, 5 ??? 2024 ?. ? 02:05, Miroslav Zagorac :
> > > >>> I think that this patch is not satisfactory because, for example,
> > Solaris
> > > >>> 11.4.0.0.1.15.0 (from 2018) has _POSIX_TIMERS and
> > _POSIX_THREAD_CPUTIME
> > > >>> defined, but does not have the pthread_getcpuclockid() function;
> > while
> > > >>> solaris
> > > >>> 11.4.42.0.0.111.0 (from 2022) has that function.
> > > >>>
> > > >>
> > > >> I'm trying to build on this vmactions/solaris-vm: Use Solaris in
> > github
> > > >> actions <https://github.com/vmactions/solaris-vm>
> > > >> it does not have pthread_getcpuclockid()
> > > >
> > > > I'm wondering what the point of defining _POSIX_THREAD_CPUTIME can be
> > > > then :-/
> > > >
> > >
> > > The pthread_getcpuclockid() function is declared in the include file
> > > /usr/include/pthread.h.  The only difference between the two "versions"
> > of
> > > Solaris 11.4 is that the newer version has a declaration and the older
> > one
> > > does not.
> > >
> > > However, _POSIX_THREAD_CPUTIME is defined in the /usr/include/unistd.h
> > file as
> > > -1 in the UNIX 03 block of options that are not supported in Solaris
> > 11.4.
> > >
> > > /* Unsupported UNIX 03 options */
> > > #if defined(_XPG6)
> > > ..
> > > #define _POSIX_THREAD_CPUTIME (-1)
> > > ..
> > > #endif
> > >
> > >
> > > An explanation of that definition can be found here:
> > >
> > > https://docs.oracle.com/cd/E88353_01/html/E37842/unistd-3head.html
> > >
> > > "If a symbolic constant is defined with the value -1, the option is not
> > > supported. Headers, data types, and function interfaces required only
> > for the
> > > option need not be supplied. An application that attempts to use anything
> > > associated only with the option is considered to be requiring an
> > extension.
> > (...)
> >
> > Ah excellent, that's quite useful! We're already doing that with
> > _POSIX_TIMERS. So I guess one just needs to try this instead:
> >
> > -#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0) &&
> > defined(_POSIX_THREAD_CPUTIME
> > +#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0) &&
> > defined(_POSIX_THREAD_CPUTIME && (_POSIX_THREAD_CPUTIME >= 0)
> >
> 
> that worked (I added closing bracket after second "defined")

Ah yes indeed. Thanks for the test. Do you want to update you patch maybe,
since you can test it ?

Thanks,
Willy



  1   2   3   4   5   6   7   8   9   10   >