[PATCH] MINOR: proxy: add option idle-close-on-response

2022-01-05 Thread William Dauchy
Avoid closing idle connections if a soft stop is in progress.

By default, idle connections will be closed during a soft stop. In some
environments, a client talking to the proxy may have prepared some idle
connections in order to send requests later. If there is no proper retry
on write errors, this can result in errors while haproxy is reloading.
Even though a proper implementation should retry on connection/write
errors, this option was introduced to support backwards compatibility
with haproxy < v2.4. Indeed, before v2.4 we were waiting for a last
request to be able to add a "connection: close" header and advise the
client to close the connection.

In a real-life example, this behavior was seen in AWS using an ALB in
front of haproxy. The end result was the ALB sending 502s during haproxy
reloads.
This patch was tested on haproxy v2.4, with a regular reload on the
process and a constant trend of incoming requests. Before the patch, we
saw regular 502s returned to the client; with the option activated, the
502s disappeared.

This patch should help fix GitHub issue #1506.
In order to unblock some v2.3 to v2.4 migrations, this patch should be
backported to the v2.4 branch.

Signed-off-by: William Dauchy 
---
 doc/configuration.txt | 21 +
 include/haproxy/proxy-t.h |  2 +-
 src/mux_h1.c  |  3 ++-
 src/proxy.c   |  1 +
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index cd8ab4b72..66571a566 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -3756,6 +3756,7 @@ option tcp-smart-connect (*)  X  - X X
 option tcpka  X  X X X
 option tcplog X  X X X
 option transparent   (*)  X  - X X
+option idle-close-on-response (*)  X  X X -
 external-check commandX  - X X
 external-check path   X  - X X
 persist rdp-cookieX  - X X
@@ -9836,6 +9837,26 @@ no option transparent
 "transparent" option of the "bind" keyword.
 
 
+option idle-close-on-response
+no option idle-close-on-response
+  Avoid closing idle connections if a soft stop is in progress
+  May be used in sections :   defaults | frontend | listen | backend
+ yes   |yes   |   yes  |   no
+  Arguments : none
+
+  By default, idle connections will be closed during a soft stop. In some
+  environments, a client talking to the proxy may have prepared some idle
+  connections in order to send requests later. If there is no proper retry on
+  write errors, this can result in errors while haproxy is reloading. Even
+  though a proper implementation should retry on connection/write errors, this
+  option was introduced to support backwards compatibility with haproxy <
+  v2.4. Indeed, before v2.4 we were waiting for a last request to be able to
+  add a "connection: close" header and advise the client to close the
+  connection.
+
+  In a real-life example, this behavior was seen in AWS using an ALB in front
+  of haproxy. The end result was the ALB sending 502s during haproxy reloads.
+
+
 external-check command 
   Executable to run when performing an external-check
   May be used in sections :   defaults | frontend | listen | backend
diff --git a/include/haproxy/proxy-t.h b/include/haproxy/proxy-t.h
index 8ca4e818d..421f900e2 100644
--- a/include/haproxy/proxy-t.h
+++ b/include/haproxy/proxy-t.h
@@ -82,7 +82,7 @@ enum PR_SRV_STATE_FILE {
 #define PR_O_REUSE_ALWS 0x000C  /* always reuse a shared connection */
 #define PR_O_REUSE_MASK 0x000C  /* mask to retrieve shared connection preferences */
 
-/* unused: 0x10 */
+#define PR_O_IDLE_CLOSE_RESP 0x0010 /* avoid closing idle connections during a soft stop */
 #define PR_O_PREF_LAST  0x0020  /* prefer last server */
 #define PR_O_DISPATCH   0x0040  /* use dispatch mode */
 #define PR_O_FORCED_ID  0x0080  /* proxy's ID was forced in the configuration */
diff --git a/src/mux_h1.c b/src/mux_h1.c
index 7b6ab946d..1ec6cb77c 100644
--- a/src/mux_h1.c
+++ b/src/mux_h1.c
@@ -2999,7 +2999,8 @@ static int h1_process(struct h1c * h1c)
 */
if (!(h1c->flags & H1C_F_IS_BACK)) {
if (unlikely(h1c->px->flags & (PR_FL_DISABLED|PR_FL_STOPPED))) {
-   if (h1c->flags & H1C_F_WAIT_NEXT_REQ)
+   if (!(h1c->px->options & PR_O_IDLE_CLOSE_RESP) &&
+   h1c->flags & H1C_F_WAIT_NEXT_REQ)
goto release;
}
}
diff --git a/src/proxy.c b/src/proxy.c
index ff5e35e2c..245b6f8a5 100644
--- a/src/proxy.c
+++ b/src/proxy.c
@@ -80,6 +80,7 @@ const struct cfg_opt cfg_opts[] =
 
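For anyone wanting to try the option from the patch above, a minimal
configuration sketch follows. The section and server names are hypothetical;
per the keyword matrix in the doc hunk, the option is valid in defaults,
frontend and listen sections, but not backend:

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # keep client-side idle connections open across a soft stop / reload
    option idle-close-on-response

frontend fe_main
    bind :8080
    default_backend be_app

backend be_app
    server app1 127.0.0.1:8081
```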

Re: EOL 1.7 ? [was: Absolute EoL dates for haproxy]

2022-01-05 Thread Willy Tarreau
On Wed, Jan 05, 2022 at 05:07:16PM +0100, Tim Düsterhus wrote:
> > This makes me think that this should also mark the turn for 2.0 to
> > enter the "critical fixes only" status. We all know it doesn't mean
> > much, beyond giving us an excuse for producing releases less often,
> > but this is also what such users of a 3-years old version are usually
> > counting on. Any objection ?
> 
> I'd suggest to mark 2.0 as "critical fixes only" in May - 2 years before the
> EOL in Q2 2024. We did the same for 1.8 which was marked as critical fixes
> only at the end of 2020 (i.e. 2 years before Q4 2022).

Yeah that's a good point, that works for me (even fewer changes to write) :-)

Cheers,
Willy



Re: EOL 1.7 ? [was: Absolute EoL dates for haproxy]

2022-01-05 Thread Tim Düsterhus

Willy,

On 1/5/22 5:01 PM, Willy Tarreau wrote:
>> Semi-related: 1.7 should probably receive its final release and then be
>> marked as EOL, as Q4 2021 definitely is in the past now :-)
>
> Indeed. I had a look at the relevant patches since the last version,
> and am only seeing minor ones. As such I'd rather not emit any new
> version delivering a bad signal, than risking to break something
> that possibly works well enough for users.

As there are no pending commits, there's also nothing to release :-)

> This makes me think that this should also mark the turn for 2.0 to
> enter the "critical fixes only" status. We all know it doesn't mean
> much, beyond giving us an excuse for producing releases less often,
> but this is also what such users of a 3-years old version are usually
> counting on. Any objection ?

I'd suggest to mark 2.0 as "critical fixes only" in May - 2 years before
the EOL in Q2 2024. We did the same for 1.8 which was marked as critical
fixes only at the end of 2020 (i.e. 2 years before Q4 2022).


Best regards
Tim Düsterhus



EOL 1.7 ? [was: Absolute EoL dates for haproxy]

2022-01-05 Thread Willy Tarreau
Hi Tim,

On Tue, Jan 04, 2022 at 06:41:14PM +0100, Tim Düsterhus wrote:
> On 12/19/21 7:36 PM, Nemo wrote:
> > Is there any way for users to find out exact EoL dates in advance, or is
> > there an accepted answer for what Q1/Q2/Q3/Q4 would usually mean here?
> 
> I believe you missed Nemo's email and you're probably in the best (and
> only?) position to answer it.

Yep, thanks for pointing that one, I indeed let it rot and forgot about
it.

> Semi-related: 1.7 should probably receive its final release and then be
> marked as EOL, as Q4 2021 definitely is in the past now :-)

Indeed. I had a look at the relevant patches since the last version,
and am only seeing minor ones. As such I'd rather not emit any new
version delivering a bad signal, than risking to break something
that possibly works well enough for users.

Question to any possible user of 1.7: is anyone currently on 1.7.14
(the latest one) facing a problem upgrading to a more recent branch,
and waiting for a particular fix to be backported into 1.7 for the
time it takes to solve the upgrade issue ? Or is anyone waiting for
a specific fix there for a deployment that's about to reach EOL in a
few weeks at best ?

Without positive responses by the end of the week, I guess we could
then declare it EOL and change its color in the table.

This makes me think that this should also mark the turn for 2.0 to
enter the "critical fixes only" status. We all know it doesn't mean
much, beyond giving us an excuse for producing releases less often,
but this is also what such users of a 3-years old version are usually
counting on. Any objection ?

Cheers,
Willy



Re: Absolute EoL dates for haproxy

2022-01-05 Thread Willy Tarreau
Hi Nemo,

first, sorry for missing your message, I remember noticing it,
postponing the response, then I forgot about it.

On Mon, Dec 20, 2021 at 12:06:28AM +0530, Nemo wrote:
> Reaching out on behalf of endoflife.date[0],

I wasn't aware of this project, this can indeed be quite useful for
various distros or projects that rely on external dependencies!

> where we had a recent PR
> for adding Haproxy[1]. haproxy does a really good job overall on this
> front. For eg, I really appreciate how well documented the cycles are
> (especially the "Hide/Show unmaintained" button).

Thanks, happy that some find it useful :-)

> The one challenge we're facing is with respect to EoL dates, which we
> prefer to be absolute and complete[2]. Since haproxy EoL dates are
> -Qn, it is hard to know in advance when a specific branch goes EoL.
> 
> We're currently being conservative and using "start of quarter" as the
> dates, but that leads to inaccuracies like 1.7 being marked as EoL[3]
> when it is not.
> 
> Is there any way for users to find out exact EoL dates in advance, or is
> there an accepted answer for what Q1/Q2/Q3/Q4 would usually mean here?

EOL dates are never precisely decided upfront, they're general guidance
and are agreed upon with users. For example, 1.9 was supposed to die early
in 2020-Q2 (possibly as early as April) and the last release was finally
emitted on 31-July. Same goes for 1.6 which received updates till end of
March 2020 despite being considered dead since end of previous year.

And this state is intentional. The first purpose is to progressively
increase the pressure on users who are really late without dropping them
in the void. I prefer that someone asks "please could we have one more
version in this branch for the time it takes us to upgrade" rather than
knowing they're forced to run with a vulnerable one! The second reason
is to not encourage users to deploy one last version that's possibly
wrong on the day it's emitted, and to feel relieved because "surely
it's rock solid now". That's a bad promise. If we emit a version it
usually means we agree to fix it in case we broke it, hence the common
message basically saying "no further release SHOULD be expected in that
branch".

Also software doesn't live only from updates but from support from the
community and developers. What's the point of upgrading to a version if
the day it's emitted it's considered EOL and everyone refuses to respond
to questions about it on the grounds that it's EOL ? That doesn't stand.
And who am I to tell the community "please do not respond to questions
about 1.6 since it's dead" ? Some of them might be perfectly valid, and
relevant to newer branches, but having such indications is useful for
anyone to say "you should really upgrade now, your version is EOL or
about to reach EOL".

I know that some users would like to have an exact date so that they can
feel like they're covered till the last day. And that's exactly what I
want to avoid. Think about a consultant deploying the version the day
before EOL. He might even be covered by a contract for doing this! Or
the customer might insist that he does it. Here it's easier to say "no".

As such, the EOL statement has to be read this way: with the end of life
expected to happen within the announced period, we're free to stop working
on it during that period and you should not expect to see any update nor
receive help during nor after that period if nobody explicitly asks. And
it turns out that it works extremely well, encouraging smooth upgrades,
and users not feeling guilty to ask about unmaintained setups "just in
case".

Thus if you absolutely need to use a complete date, better be conservative
so that users are not surprised, and consider that the last support day is
the last day preceding the EOL period. This will match the minimum we'll do
(useful for distros), and that will not prevent us from continuing to emit
one or two versions past that date if needed to help some users in trouble,
but it will remain a good warning that it's becoming risky to count on that
possible margin.

> For already-EoL releases, providing complete dates should be easier I guess?

We could indeed adjust the entries to reflect the date of when it was
last announced that the version was dead as we did in the past (only
1.6, 1.9 and 2.1 weren't updated like this). I don't think it brings
much benefit though. And as you can see there's already the date of
the last release in that branch anyway.

Thanks,
Willy