Re: Clients occasionally see truncated responses

2021-04-09 Thread Nathan Konopinski
Thanks, Willy. I appreciate you taking the time to send a patch.
Unfortunately, I'm still seeing the errors, and multiple records are still
being sent back to the clients.

When you say,
I've just rechecked in the code and indeed, by default it will not
limit.
Does that mean tune.idletimer allows the unlimited buffer size by
default? The way I read the docs, there must be at least 1ms of idle time
before the larger buffers are available
(http://cbonte.github.io/haproxy-dconv/2.2/configuration.html#tune.idletimer).
I'm curious whether there are some cases where they aren't getting used,
although it doesn't seem likely.

Thanks for the info on streams.

On Fri, Apr 9, 2021 at 1:53 AM Willy Tarreau  wrote:

> Hi Nathan,
>
> On Thu, Apr 08, 2021 at 06:15:10PM -0700, Nathan Konopinski wrote:
> > One other behavior I've observed, nginx has an ssl_buffer_size (
> > http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_buffer_size)
> > option that defaults to 16kb. As I decrease its value, I start seeing the
> > same body size != content length errors from clients.
>
> Interesting, really, as it definitely confirms an issue with the way the
> client handles TLS.
>
> > I tried experimenting with tune.idletimer to force the use of large ssl
> > buffers (tune.ssl.maxrecord's default of no limit seems > than 16kb so I
> > didn't bother adjusting it) but I'm still seeing errors.
>
> I've just rechecked in the code and indeed, by default it will not
> limit.
>
> > From my reading of
> > the docs, a stream must be idle for 1s by default before the larger ssl
> > buffers are used. It also seems a stream must be idle for at least 1ms
> > before the large buffers are available. Is that correct?
>
> I don't remember these details but the maxrecord is only an upper bound
> so if you didn't set it it won't have any effect.
>
> > Is there a way to
> > always use the larger ssl buffers without this detection?
>
> Based on your observation, I suspect that for whatever reason, the client
> is bogus and expects certain data to be located within a single SSL record,
> which is a violation of the standard, but bugs are present :-/
>
> What might very likely happen here is that some records are shorter
> than others just because the buffer wraps, and the openssl API doesn't
> provide a vector-based send function. So you can't say "send 1kB till
> the end of the buffer and loop back to the beginning for 3 extra kB".
> Instead you have to perform one 1kB write and a second 3kB write,
> resulting in two records.
>
> One workaround could be the following patch which always unwraps the
> buffer before sending, always resulting in the largest possible record:
>
> diff --git a/src/ssl_sock.c b/src/ssl_sock.c
> index f1b0668ab..2b47e6b10 100644
> --- a/src/ssl_sock.c
> +++ b/src/ssl_sock.c
> @@ -6071,6 +6071,7 @@ static size_t ssl_sock_to_buf(struct connection *conn, void *xprt_ctx, struct bu
>  static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const struct buffer *buf, size_t count, int flags)
>  {
>          struct ssl_sock_ctx *ctx = xprt_ctx;
> +        struct buffer aligned_buf = *buf;
>          ssize_t ret;
>          size_t try, done;
>
> @@ -6093,6 +6094,13 @@ static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const s
>  #endif
>
>          try = b_contig_data(buf, done);
> +
> +        if (try < (b_data(buf) - done) && try < count) {
> +                b_slow_realign(&aligned_buf, trash.area, done);
> +                try = b_contig_data(&aligned_buf, done);
> +                buf = &aligned_buf;
> +        }
> +
>          if (try > count)
>                  try = count;
>
> It's not pretty but it should do the job by making certain that every
> time the record size is limited by wrapping, the buffer is first
> unwrapped and the record is maximized. I'm also attaching it to help
> you apply it. It was made for 2.4-dev but I verified that it still
> applies (with an offset) on 2.2. I'd be interested in knowing if that
> addresses your problem.
>
> > I'm not sure what a stream represents in this context.
>
> It's a bidirectional communication between a client and a server in
> the context of a single HTTP transaction. You can see this as the
> application level in haproxy, which deals with request/response
> forwarding between the two sides and which executes in turn all the
> analysers and filters corresponding to the rules in your config.
> You can have a look at doc/internals/muxes.png for a quick overview
> (the stream is the entity at the top which carries the request buffer).
>
> Willy
>


Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Robin H. Johnson
On Fri, Apr 09, 2021 at 10:14:26PM +0200, Christopher Faulet wrote:
> It seems you have a blocking call in one of your lua scripts. The threads dump
> shows many threads blocked in hlua_ctx_init. Many others are executing lua.
> Unfortunately, for an unknown reason, there is no stack traceback.
All of our Lua is string handling. Parsing headers, and then building
TXN variables as well as a set of applets that return responses in cases
where we don't want to go to a backend server as the response is simple
enough to generate inside the LB.

> For the 2.3 and prior, the lua scripts are executed under a global lock. Thus
> blocking calls in a lua script are awful, because they do not block only one
> thread but all the others too. I guess the same issue exists on the 1.8, but
> there is no watchdog on this version. Thus, from time to time HAProxy hangs
> and may report huge latencies but, in the end, it recovers and continues to
> process data. It is exactly the purpose of the watchdog, reporting hidden
> bugs related to spinning loops and deadlocks.
Nothing in this Lua code should have blocking calls at all.
The Lua code has zero calls to external services, no sockets, no sleeps,
no print, no Map.new (single call in the Lua startup, not inside any
applet or fetch), no usage of other packages, no file handling, no other
IO.

I'm hoping I can get $work to agree to fully open-source the Lua, so you
can see this fact and review the code to confirm that it SHOULD be
non-blocking.

> However, I may be wrong. It may be just a contention problem because you are
> executing lua with 64 threads and a huge workload. In this case, you may give
> the 2.4 (under development) a try. There is a way to have a separate lua
> context for each thread by loading the scripts with the "lua-load-per-thread"
> directive. Out of curiosity, on the 1.8, are you running HAProxy with several
> threads or are you spawning several processes?
nbthread=64, nbproc=1 on both 1.8/2.x

Yes, we're hoping to try 2.4.x, just working on some parts to get there.

-- 
Robin Hugh Johnson
E-Mail : robb...@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85




Re: [PATCH] MINOR: sample: add json_string

2021-04-09 Thread Aleksandar Lazic

Tim.

On 09.04.21 18:55, Tim Düsterhus wrote:

Aleks,


> I have taken a first look. Find my remarks below. Please note that for the
> actual source code there might be further remarks by Willy (put in CC) or so.
> I might have missed something or might have told you something incorrect. So
> maybe before making changes wait for their opinion.

Thank you for your feedback.

> Generally I must say that I don't like the mjson library, because it uses
> 'int' for sizes. It doesn't really bring the point home that it is a safe
> library. This one looks much better to me: https://github.com/FreeRTOS/coreJSON.
> It does not support JSON path, though. Not sure how much of an issue that
> would be?

Well, I have created an issue in coreJSON about how to handle the "." in the key:
https://github.com/FreeRTOS/coreJSON/issues/92

I chose the mjson library because it is small and offers the JSON path feature.
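
For reference, here is a minimal standalone sketch of how the mjson JSON path
lookup is used, assuming the mjson_get_string()/mjson_get_number() getters as
documented in the cesanta/mjson README (the JSON input and buffer sizes below
are just made-up examples):

#include <stdio.h>
#include <string.h>
#include "mjson.h"   /* single-file library from https://github.com/cesanta/mjson */

int main(void)
{
    /* made-up JWT-like payload, already base64-decoded */
    const char *json = "{\"iss\":\"kubernetes/serviceaccount\",\"exp\":1617960000}";
    char iss[128];
    double exp = 0;

    /* mjson_get_string() resolves a JSON path and copies the unescaped
     * string value; it returns its length, or -1 if the path does not
     * resolve to a string. */
    int n = mjson_get_string(json, (int)strlen(json), "$.iss", iss, (int)sizeof(iss));
    if (n >= 0)
        printf("iss = %.*s\n", n, iss);

    /* numbers and booleans have their own getters, hence the idea of
     * json_get_string / json_get_int / ... converter names */
    if (mjson_get_number(json, (int)strlen(json), "$.exp", &exp))
        printf("exp = %.0f\n", exp);

    return 0;
}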


On 4/8/21 10:21 PM, Aleksandar Lazic wrote:

From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
From: Alekesandar Lazic 
Date: Thu, 8 Apr 2021 21:42:00 +0200
Subject: [PATCH] MINOR: sample: add json_string


I'd add 'converter' to the subject line to make it clear that this is a
converter.



This sample get's the value of a JSON key


Typo: It should be 'gets'.


---
 Makefile |    3 +-
 doc/configuration.txt    |   15 +
 include/import/mjson.h   |  213 +
 reg-tests/sample_fetches/json_string.vtc |   25 +
 src/mjson.c  | 1052 ++
 src/sample.c |   63 ++
 6 files changed, 1370 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/sample_fetches/json_string.vtc
 create mode 100644 src/mjson.c

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 01a01eccc..7f2732668 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -19043,6 +19043,21 @@ http_first_req : boolean


This is the 'fetch' section. Move the documentation to the 'converter' section.

>

   from some requests when a request is not the first one, or to help grouping
   requests in the logs.

+json_string(<json_path>) : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


maybe json_get_string, because there could be some more getters like bool,
int, ...


+  Returns the string value of the given json path.


It should be "JSON" (in uppercase) here and everywhere else.


Okay and agree.


+  The <json_path> is required.
+  This sample uses the mjson library https://github.com/cesanta/mjson
+  The json path syntax is defined in this repo https://github.com/json-path/JsonPath


Overall the description of the converter does not read nicely / feels 
inconsistent compared to other converters / uses colloquial language.


Let me suggest something like:

Extracts the value located at <json_path> from the JSON string in the input
value. <json_path> must be a valid JsonPath string as defined at
https://goessner.net/articles/JsonPath/


I changed the link, because that appears to be the canonical reference.


Okay.


+  Example :


No space in front of the colon.


+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+
+  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A more 
general example would be preferred, because it makes it easier to understand the concept.


The '$.iss' is the generic JWT field.
https://tools.ietf.org/html/rfc7519#section-4.1
"iss" (Issuer) Claim

But maybe I could look for a "normal" JSON string and not only JWT.



 method : integer + string
   Returns an integer value corresponding to the method in the HTTP request. For
   example, "GET" equals 1 (check sources to establish the matching). Value 9
diff --git a/include/import/mjson.h b/include/import/mjson.h
new file mode 100644
index 0..ff46e7950
--- /dev/null
+++ b/include/import/mjson.h
@@ -0,0 +1,213 @@
[...]
+// Aleksandar Lazic
+// git clone from 2021-08-04 because of this fix
+// 
https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee


Please don't edit third party libraries, even if it is just a comment. This 
makes updating hard.


Okay.


diff --git a/reg-tests/sample_fetches/json_string.vtc b/reg-tests/sample_fetches/json_string.vtc
new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. And please make 
sure you understand the difference between fetches and converters.


Yeah the difference between fetchers and converters in 

Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Christopher Faulet

On 09/04/2021 at 19:26, Robin H. Johnson wrote:

Hi,

Maciej had said they were going to create a new thread, but I didn't see
one yet.

I want to start by noting the problem was much worse on 2.2.8 & 2.2.9, and
that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big
thanks for that initial work in fixing the issue.

As I mentioned in my other mail asking for a 1.8.30 release, we're
experiencing this problem in DigitalOcean's HAProxy instances used to
run the Spaces product.

I've been trying to dig out deeper detail as well with a debug threads
version, but I have the baseline error output from 2.3.9 here to share,
after passing redaction review. This dump was generated with vbernat's
PPA version of 2.3.9, not any internal builds.

We have struggled to reproduce the problem in testing environments; it
only turns up in the biggest regions, and plotting occurrences of the
issue over time suggests that it might have some partial correlation
with a weird workload input.

The dumps do suggest Lua is implicated as well, and we've got some
extensive Lua code, so it's impossible to rule it out as contributing to
the problem (We have been discussing plans to move it to SPOA instead).

The Lua code in question hasn't changed significantly in nearly 6
months, and it was problem-free on the 1.8 series (having a test suite
for the Lua code has been invaluable).



Hi,

It seems you have a blocking call in one of your lua scripts. The threads dump 
shows many threads blocked in hlua_ctx_init. Many others are executing lua. 
Unfortunately, for an unknown reason, there is no stack traceback.


For the 2.3 and prior, the lua scripts are executed under a global lock. Thus 
blocking calls in a lua script are awful, because they do not block only one 
thread but all the others too. I guess the same issue exists on the 1.8, but there 
is no watchdog on this version. Thus, from time to time HAProxy hangs and may report 
huge latencies but, in the end, it recovers and continues to process data. It is 
exactly the purpose of the watchdog, reporting hidden bugs related to spinning 
loops and deadlocks.


However, I may be wrong. It may be just a contention problem because you are 
executing lua with 64 threads and a huge workload. In this case, you may give 
the 2.4 (under development) a try. There is a way to have a separate lua 
context for each thread by loading the scripts with the "lua-load-per-thread" 
directive. Out of curiosity, on the 1.8, are you running HAProxy with several 
threads or are you spawning several processes?


--
Christopher Faulet



Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Maciej Zdeb
Hi Robin,

On Fri, Apr 9, 2021 at 19:26, Robin H. Johnson  wrote:

> Maciej had said they were going to create a new thread, but I didn't see
> one yet.
>
> I want to start by noting the problem was much worse on 2.2.8 & 2.2.9, and
> that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big
> thanks for that initial work in fixing the issue.
>

After Christopher's patch, which uses another function (one that does not
allocate memory) to dump the Lua stack trace, the situation is much better.
It hit 100% CPU only once and only on one thread, and HAProxy didn't kill
itself like in your situation. Since then the issue has not occurred again,
so I don't have anything useful to share. :(

I'm using HAProxy 2.2.11 but will upgrade to 2.3 next week; I'll report if
anything interesting pops up.


> We have struggled to reproduce the problem in testing environments; it
> only turns up in the biggest regions, and plotting occurrences of the
> issue over time suggests that it might have some partial correlation
> with a weird workload input.
>

On our side the issue occurred on a machine with an unusual workload:
hundreds of thousands of TCP connections and very few HTTP
requests/responses. We also use Lua.

Kind regards,
Maciej


Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)

2021-04-09 Thread Robin H. Johnson
Hi,

Maciej had said they were going to create a new thread, but I didn't see
one yet.

I want to start by noting the problem was much worse on 2.2.8 & 2.2.9, and
that 2.2.13 & 2.3.9 don't get entirely hung at 100% anymore: a big
thanks for that initial work in fixing the issue.

As I mentioned in my other mail asking for a 1.8.30 release, we're
experiencing this problem in DigitalOcean's HAProxy instances used to
run the Spaces product.

I've been trying to dig out deeper detail as well with a debug threads
version, but I have the baseline error output from 2.3.9 here to share,
after passing redaction review. This dump was generated with vbernat's
PPA version of 2.3.9, not any internal builds.

We have struggled to reproduce the problem in testing environments; it
only turns up in the biggest regions, and plotting occurrences of the
issue over time suggests that it might have some partial correlation
with a weird workload input.

The dumps do suggest Lua is implicated as well, and we've got some
extensive Lua code, so it's impossible to rule it out as contributing to
the problem (We have been discussing plans to move it to SPOA instead).

The Lua code in question hasn't changed significantly in nearly 6
months, and it was problem-free on the 1.8 series (having a test suite
for the Lua code has been invaluable).

-- 
Robin Hugh Johnson
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85


20210409_haproxy-crashlogs.REDACTED.txt.gz
Description: Binary data




Re: Request for new 1.8.x release due to freq counter bug

2021-04-09 Thread Amaury Denoyelle
"Robin H. Johnson"  wrote:

> Hi,
> Wondering if you could make a new 1.8.x release to get the fix for the
> freq counter bug into more systems?
> dde80e111 BUG/MEDIUM: time: make sure to always initialize the global tick
> 491b86ed0 BUG/MEDIUM: freq_ctr/threads: use the global_now_ms variable
> It's a problem for anybody using counters for ratelimiting purposes, who
> can't yet upgrade to 2.x series due to other issues encountered there
> (the 100% cpu bug even on the latest 2.3.9, that I have logs I'm hoping
> to post when they are approved by $work)

Hi,

I will try to emit new releases next week for 2.0 and 1.8, which are
impacted by the bug you mentioned.

-- 
Amaury Denoyelle



Re: [PATCH] MINOR: sample: add json_string

2021-04-09 Thread Tim Düsterhus

Aleks,

I have taken a first look. Find my remarks below. Please note that for 
the actual source code there might be further remarks by Willy (put in 
CC) or so. I might have missed something or might have told you 
something incorrect. So maybe before making changes wait for their opinion.


Generally I must say that I don't like the mjson library, because it 
uses 'int' for sizes. It doesn't really bring the point home that it is 
a safe library. This one looks much better to me: 
https://github.com/FreeRTOS/coreJSON. It does not support JSON path, 
though. Not sure how much of an issue that would be?
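
Just to illustrate why I dislike 'int' for sizes — a minimal, contrived
example (nothing mjson-specific, just the general hazard of narrowing a
size_t length into an int parameter on a 64-bit system):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* a hypothetical payload larger than INT_MAX */
    size_t body_len = (size_t)INT_MAX + 1000;

    /* what an int-based API would silently receive; the result of the
     * narrowing is implementation-defined and typically ends up negative */
    int narrowed = (int)body_len;

    printf("size_t length: %zu\n", body_len);
    printf("after narrowing to int: %d\n", narrowed);
    return 0;
}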


On 4/8/21 10:21 PM, Aleksandar Lazic wrote:

From 7ecb80b1dfe37c013cf79bc5b5b1caa3c0112a6a Mon Sep 17 00:00:00 2001
From: Alekesandar Lazic 
Date: Thu, 8 Apr 2021 21:42:00 +0200
Subject: [PATCH] MINOR: sample: add json_string


I'd add 'converter' to the subject line to make it clear that this is a 
converter.




This sample get's the value of a JSON key


Typo: It should be 'gets'.


---
 Makefile |3 +-
 doc/configuration.txt|   15 +
 include/import/mjson.h   |  213 +
 reg-tests/sample_fetches/json_string.vtc |   25 +
 src/mjson.c  | 1052 ++
 src/sample.c |   63 ++
 6 files changed, 1370 insertions(+), 1 deletion(-)
 create mode 100644 include/import/mjson.h
 create mode 100644 reg-tests/sample_fetches/json_string.vtc
 create mode 100644 src/mjson.c

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 01a01eccc..7f2732668 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -19043,6 +19043,21 @@ http_first_req : boolean


This is the 'fetch' section. Move the documentation to the 'converter' 
section.



   from some requests when a request is not the first one, or to help grouping
   requests in the logs.
 
+json_string(<json_path>) : string

I don't like the name. A few suggestions:

- json_query
- json_get
- json_decode


+  Returns the string value of the given json path.


It should be "JSON" (in uppercase) here and everywhere else.


+  The <json_path> is required.
+  This sample uses the mjson library https://github.com/cesanta/mjson
+  The json path syntax is defined in this repo https://github.com/json-path/JsonPath


Overall the description of the converter does not read nicely / feels 
inconsistent compared to other converters / uses colloquial language.


Let me suggest something like:

Extracts the value located at <json_path> from the JSON string in the
input value. <json_path> must be a valid JsonPath string as defined at
https://goessner.net/articles/JsonPath/


I changed the link, because that appears to be the canonical reference.


+  Example :


No space in front of the colon.


+  # get the value from the key kubernetes.io/serviceaccount/namespace
+  # => openshift-logging
+  http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.kubernetes\\.io/serviceaccount/namespace')
+
+  # get the value from the key iss
+  # => kubernetes/serviceaccount
+  http-request set-var(sess.json) req.hdr(Authorization),b64dec,json_string('$.iss')


I don't like that the example is that specific for Kubernetes usage. A 
more general example would be preferred, because it makes it easier to 
understand the concept.



 method : integer + string
   Returns an integer value corresponding to the method in the HTTP request. For
   example, "GET" equals 1 (check sources to establish the matching). Value 9
diff --git a/include/import/mjson.h b/include/import/mjson.h
new file mode 100644
index 0..ff46e7950
--- /dev/null
+++ b/include/import/mjson.h
@@ -0,0 +1,213 @@
[...]
+// Aleksandar Lazic
+// git clone from 2021-08-04 because of this fix
+// 
https://github.com/cesanta/mjson/commit/7d8daa8586d2bfd599775f049f26d2645c25a8ee


Please don't edit third party libraries, even if it is just a comment. 
This makes updating hard.



diff --git a/reg-tests/sample_fetches/json_string.vtc b/reg-tests/sample_fetches/json_string.vtc
new file mode 100644
index 0..fc387519b
--- /dev/null
+++ b/reg-tests/sample_fetches/json_string.vtc


Again, this is a converter. Move the test into the appropriate folder. 
And please make sure you understand the difference between fetches and 
converters.
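
To make the distinction concrete, here is a toy C model (for illustration
only, not haproxy's actual internal API): a fetch produces a sample from the
transaction, while a converter transforms the sample produced by the previous
element of the expression, e.g. req.hdr(host),upper:

#include <stdio.h>

/* hypothetical, simplified "sample": just a string value */
struct sample { char str[256]; };

/* a FETCH creates a sample out of the transaction (here, a fake header value) */
static int fetch_req_hdr(const char *hdr_value, struct sample *smp)
{
    snprintf(smp->str, sizeof(smp->str), "%s", hdr_value);
    return 1;
}

/* a CONVERTER takes the existing sample and transforms it in place */
static int conv_upper(struct sample *smp)
{
    for (char *p = smp->str; *p; p++)
        if (*p >= 'a' && *p <= 'z')
            *p -= 'a' - 'A';
    return 1;
}

int main(void)
{
    /* rough equivalent of the expression req.hdr(host),upper */
    struct sample smp;
    fetch_req_hdr("example.com", &smp);
    conv_upper(&smp);
    printf("%s\n", smp.str);   /* EXAMPLE.COM */
    return 0;
}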



@@ -0,0 +1,25 @@
+varnishtest "Test to get value from json sting"
+#REQUIRE_VERSION=2.4
+
+feature ignore_unknown_macro
+
+haproxy h1 -conf {
+defaults
+mode http
+timeout connect 1s
+timeout client  1s
+timeout server  1s
+
+frontend fe1
+bind "fd@${fe1}"
+http-request set-var(sess.iss) req.hdr(Authorization),b64dec,json_string('$.iss')
+http-request return status 200 hdr x-var "json-value=%[var(sess.iss)]"
+} -start
+
+client c1 -connect ${h1_fe1_sock} {
+txreq -req GET -url "/" \
+  -hdr "Authorization: 

[ANNOUNCE] haproxy-2.4-dev16

2021-04-09 Thread Willy Tarreau
Hi,

HAProxy 2.4-dev16 was released on 2021/04/09. It added 37 new commits
after version 2.4-dev15.

This one is particularly calm, I even hesitated between making it or not.
But there are a few updates that may affect configuration so I figured it
was better to emit a new one.

A large part of the patch is essentially caused by the renaming of a few
atomic ops that were causing some confusion. Now HA_ATOMIC_FOO() will be
a void statement so that if you want to read from the location you use
either HA_ATOMIC_FETCH_FOO() or HA_ATOMIC_FOO_FETCH() (pre- or post-
fetch).  The output code shouldn't change however, and given that it was
essentially sed-work, as soon as it started to work I was confident in it.
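
For illustration, using the GCC builtins directly rather than our macros (and
assuming the obvious mapping): the "fetch-op" form returns the value as it was
before the operation, the "op-fetch" form returns the value after it:

#include <stdio.h>

int main(void)
{
    unsigned int counter = 5;

    /* fetch-add: the returned value is the one BEFORE the addition */
    unsigned int before = __atomic_fetch_add(&counter, 1, __ATOMIC_SEQ_CST);

    /* add-fetch: the returned value is the one AFTER the addition */
    unsigned int after = __atomic_add_fetch(&counter, 1, __ATOMIC_SEQ_CST);

    printf("before=%u after=%u final=%u\n", before, after, counter); /* 5 7 7 */
    return 0;
}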

A few changes in the FD code to clean up that messy fdtab structure cause
another noticeable part of the diff. I obviously managed to break
something once (hence the BUG/MAJOR fix) but now it's OK. Mistakes at this
level are unforgiving anyway: either it fully works or it fully fails.

The nice part that makes me think we're progressively approaching what
could look like the release is that Emeric finally performed the few
changes in the DNS and log code. For the DNS, the TCP servers in the
"resolvers" section do not need to be referred to as "server" anymore,
they are "nameserver" just like the other ones, except that you can
mention "tcp@" in front of their address to force them to be TCP
nameservers. No more configuration mixing there. And for the log servers,
similarly, now you can specify "tcp@" in front of an address on a "log"
statement, and it will automatically create the ring for you with default
settings. Previously it was still required to manually declare the ring; I
found this too cumbersome, and Emeric figured out how to handle it.

The rest is essentially small bug fixes and code cleanups from a bunch
of contributors.

Now speaking about the pending stuff I'm still aware of:

  - I'd like to rename LIST_ADD/LIST_ADDQ to LIST_INSERT/LIST_APPEND
(or maybe LIST_ADD_HEAD/LIST_ADD_TAIL ?). I've already been trapped
in using LIST_ADD() when I wanted to append and I know the confusion
is easy. That's just a quick "sed" once we know what we want.

  - I identified some undesired cache line sharing around some pointers,
which slows down FD operations and possibly other stuff. I see how to
arrange this, it just needs to be done (and tested on MacOS since we
noticed once that section names differ there).

  - we've had a recent discussion about the opportunity to import SLZ
and to always enable compression by default using it if no other lib
is specified. I think it could be useful for some users (especially
those for whom memory usage is important). I'll have a look at this,
maybe next week, that's only two files to include.

  - regarding the quick discussion two weeks ago about optimization for
modern ARM CPUs, I saw that one solution could be to build using
gcc 10.2+, which outlines the atomics into functions. That's ugly
but the performance impact is small (about 3% in my tests), while
it provides a tremendous improvement for many-core machines. But if
we rely on this, I'll probably add two new CPU entries to force the use
of only an old one (v8.0) or only a modern one (v8.2) so
that those who build themselves can shave the last few percent.

  - no progress made on the default-path, but we'll have to reread the
issue to figure the best choice. I'd like to see it done for the
release to ease config packaging and deployments.

  - I'd like to add a warning when we detect that nbthreads is forced
to a higher value than the number of bound CPUs. It's not the first
time that I see this in a config and the result is catastrophic, so
a warning is deserved. It just needs to be set at the right place.

  - the shared memory pools cleanup must be completed before the release,
as the situation is not good for those with a slow malloc() function
at the moment. I know what to do, I just need to stay focused on it.

  - the date & time cleanups would be nice to have for the long term and
are not particularly hard to do so better finish them before 2.4.

  - Tim sent a patch series to implement URI normalization. That's something
I'd like to see merged if possible, as it may improve security for some
users, and at least improve reliability for others.

  - I also noticed Alek's mjson import with a new converter, but didn't
have a look yet. Maybe it will open opportunities for more converters,
that's definitely something which deserves being considered before the
release.

  - Amaury has almost finished some work to improve auto-binding on NUMA
systems. Right now using a dual-socket machine is common and painful at
the same time. Having threads enabled by default doesn't improve the
situation there at all due to the slow communication between the CPUs.
In his tests he could 

Re: [RFC PATCH 0/8] URI normalization / Issue #714

2021-04-09 Thread Christopher Faulet

On 08/04/2021 at 20:59, Tim Duesterhus wrote:

Willy,
Christopher,

Not sure who of you is better suited to review this series, so I'm adding both
of you :-)

I'm tagging this as RFC, because it's large and quite a bit outside of my
comfort zone. However the patches are as clean as possible. They include full
documentation and each normalizer comes with a bunch of reg-tests ensuring they
behave like I expect them to behave. So if you have nothing to complain, then
this series is in a mergeable state. I plan to add a few more normalizers after
this passes an initial review.

I'll add additional remarks to each patch, explaining my decisions in more
detail.



Thanks Tim, I'll try to review your patches next week.


--
Christopher Faulet



Re: haproxy 2.4 opentracing using x-b3-traceid header

2021-04-09 Thread Andrea Bonini
I've tried to add the header with the context name, but with no luck (I think
it's done after OT reads the request).

Here is an example of the headers that I receive:

X-B3-TraceId: 48485a3953bb6124
X-B3-ParentSpanId: 48485a3953bb6124
X-B3-SpanId: 42e1e27066118383
X-B3-Sampled: 1
X-B3-Flags: 0

Here is the documentation about propagation (there are more formats for
propagation... but this is the most common):
https://github.com/openzipkin/b3-propagation

Thanks


On Fri, Apr 9, 2021 at 1:52 PM Miroslav Zagorac <miros...@zagorac.name> wrote:

> On 04/09/2021 01:40 PM, Andrea Bonini wrote:
> > Hi Zaga,
> > thank you for the explanation. My problem is that the external service
> > sends the span info in headers in the x-b3-* format (the OpenTracing
> > default one). To read it, which 'extract' context must I configure? If I
> > understand correctly, to read the x-b3-* headers I must use a context
> > with an empty name, which is not allowed.
> >
>
> Hello Andrea,
>
> in this case the ability to accept the context with an empty name should
> be added to the OpenTracing filter.  I will think about it.  If this is
> not a problem, send the header content for one such example (only those
> headers related to OpenTracing, with complete name and content).
>
> Thank you for the question and comment.
>
> Best regards,
>
> --
> Zaga
>
> What can change the nature of a man?
>


-- 
Bonini Andrea


Re: haproxy 2.4 opentracing using x-b3-traceid header

2021-04-09 Thread Miroslav Zagorac

On 04/09/2021 01:40 PM, Andrea Bonini wrote:

Hi Zaga,
thank you for the explanation. My problem is that the external service sends
the span info in headers in the x-b3-* format (the OpenTracing default one).
To read it, which 'extract' context must I configure? If I understand
correctly, to read the x-b3-* headers I must use a context with an empty
name, which is not allowed.



Hello Andrea,

in this case the ability to accept the context with an empty name should 
be added to the OpenTracing filter.  I will think about it.  If this is 
not a problem, send the header content for one such example (only those 
headers related to OpenTracing, with complete name and content).


Thank you for the question and comment.

Best regards,

--
Zaga

What can change the nature of a man?



Re: haproxy 2.4 opentracing using x-b3-traceid header

2021-04-09 Thread Andrea Bonini
Hi Zaga,
thank you for the explanation. My problem is that the external service sends
the span info in headers in the x-b3-* format (the OpenTracing default one).
To read it, which 'extract' context must I configure? If I understand
correctly, to read the x-b3-* headers I must use a context with an empty
name, which is not allowed.

Thanks again

On Fri, Apr 9, 2021 at 10:43 AM Miroslav Zagorac <miros...@zagorac.name> wrote:

> On 04/08/2021 01:48 PM, Andrea Bonini wrote:
> > Hi everyone,
> >
> > I have a question about the OpenTracing addon. Is there a way to use
> > header info about a trace created by an outside service?
> >
> > What I want is to use the x-b3-traceid header from an incoming request
> > as the trace id and use the span id as the parent id.
> >
> > Thanks
> >
>
> Hello Andrea,
>
> you are interested in propagating the span context from one service to
> another.  This can be done by setting up additional http headers in the
> service that created the tracing process that contain the span context
> which is then read in another process that wants to add its data.  This
> can be done via the ‘inject’ and ‘extract’ keywords in the opentracing
> filter configuration.  In this way they can automatically set up and
> read the contents of the headers over which the span context is
> transmitted.
>
> In the test directory there is one example that uses such a mode of
> operation, where two haproxy processes are started, the first of which
> passes the span context to the second.  The example is run via the shell
> script test/run-fe-be.sh and thus starts the frontend haproxy process
> (fe) which passes the span context to the backend haproxy process (be).
>   Configurations are located in the test/fe and test/be directories.
>
> For more explanations read the README file from the ot directory.
>
> Best regards,
>
> --
> Zaga
>
> What can change the nature of a man?
>


-- 
Bonini Andrea


Request for new 1.8.x release due to freq counter bug

2021-04-09 Thread Robin H. Johnson
Hi,

Wondering if you could make a new 1.8.x release to get the fix for the
freq counter bug into more systems?

dde80e111 BUG/MEDIUM: time: make sure to always initialize the global tick
491b86ed0 BUG/MEDIUM: freq_ctr/threads: use the global_now_ms variable

It's a problem for anybody using counters for ratelimiting purposes, who
can't yet upgrade to 2.x series due to other issues encountered there
(the 100% cpu bug even on the latest 2.3.9, that I have logs I'm hoping
to post when they are approved by $work)

-- 
Robin Hugh Johnson
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85





Re: Clients occasionally see truncated responses

2021-04-09 Thread Willy Tarreau
Hi Nathan,

On Thu, Apr 08, 2021 at 06:15:10PM -0700, Nathan Konopinski wrote:
> One other behavior I've observed, nginx has an ssl_buffer_size (
> http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_buffer_size)
> option that defaults to 16kb. As I decrease its value, I start seeing the
> same body size != content length errors from clients.

Interesting, really, as it definitely confirms an issue with the way the
client handles TLS.

> I tried experimenting with tune.idletimer to force the use of large ssl
> buffers (tune.ssl.maxrecord's default of no limit seems > than 16kb so I
> didn't bother adjusting it) but I'm still seeing errors.

I've just rechecked in the code and indeed, by default it will not
limit.

> From my reading of
> the docs, a stream must be idle for 1s by default before the larger ssl
> buffers are used. It also seems a stream must be idle for at least 1ms
> before the large buffers are available. Is that correct?

I don't remember these details but the maxrecord is only an upper bound
so if you didn't set it it won't have any effect.

> Is there a way to
> always use the larger ssl buffers without this detection?

Based on your observation, I suspect that for whatever reason, the client
is bogus and expects certain data to be located within a single SSL record,
which is a violation of the standard, but bugs are present :-/

What might very likely happen here is that some records are shorter
than others just because the buffer wraps, and the openssl API doesn't
provide a vector-based send function. So you can't say "send 1kB till
the end of the buffer and loop back to the beginning for 3 extra kB".
Instead you have to perform one 1kB write and a second 3kB write,
resulting in two records.
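
To make the wrapping issue more concrete, here is a small standalone
illustration (a toy ring buffer, not our actual buffer API): with 4kB pending
but only 1kB contiguous before the wrap point, a record-per-write API has to
emit two records; once the data is realigned to the start of the storage, a
single maximal record can be sent:

#include <stdio.h>
#include <string.h>

/* toy ring buffer: pending data may wrap from the tail of the storage
 * back to its head */
struct ring {
    char area[8192];
    size_t head;   /* offset of the first byte of pending data */
    size_t data;   /* number of pending bytes */
};

/* largest chunk that can be sent with a single contiguous write */
static size_t contig_data(const struct ring *b)
{
    size_t to_end = sizeof(b->area) - b->head;
    return b->data < to_end ? b->data : to_end;
}

int main(void)
{
    struct ring b = { .head = sizeof(b.area) - 1024, .data = 4096 };

    /* 4kB pending, but only 1kB contiguous before the wrap point */
    printf("first write: %zu bytes, second write: %zu bytes\n",
           contig_data(&b), b.data - contig_data(&b));

    /* "unwrapping": copy the data so it starts at offset 0, making it
     * all contiguous, which allows one single maximal record */
    char tmp[sizeof(b.area)];
    size_t first = contig_data(&b);
    memcpy(tmp, b.area + b.head, first);
    memcpy(tmp + first, b.area, b.data - first);
    memcpy(b.area, tmp, b.data);
    b.head = 0;

    printf("after realign: one write of %zu bytes\n", contig_data(&b));
    return 0;
}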

One workaround could be the following patch which always unwraps the
buffer before sending, always resulting in the largest possible record:

diff --git a/src/ssl_sock.c b/src/ssl_sock.c
index f1b0668ab..2b47e6b10 100644
--- a/src/ssl_sock.c
+++ b/src/ssl_sock.c
@@ -6071,6 +6071,7 @@ static size_t ssl_sock_to_buf(struct connection *conn, void *xprt_ctx, struct bu
 static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const struct buffer *buf, size_t count, int flags)
 {
         struct ssl_sock_ctx *ctx = xprt_ctx;
+        struct buffer aligned_buf = *buf;
         ssize_t ret;
         size_t try, done;

@@ -6093,6 +6094,13 @@ static size_t ssl_sock_from_buf(struct connection *conn, void *xprt_ctx, const s
 #endif

         try = b_contig_data(buf, done);
+
+        if (try < (b_data(buf) - done) && try < count) {
+                b_slow_realign(&aligned_buf, trash.area, done);
+                try = b_contig_data(&aligned_buf, done);
+                buf = &aligned_buf;
+        }
+
         if (try > count)
                 try = count;
 
It's not pretty but it should do the job by making certain that every
time the record size is limited by wrapping, the buffer is first
unwrapped and the record is maximized. I'm also attaching it to help
you apply it. It was made for 2.4-dev but I verified that it still
applies (with an offset) on 2.2. I'd be interested in knowing if that
addresses your problem.

> I'm not sure what a stream represents in this context.

It's a bidirectional communication between a client and a server in
the context of a single HTTP transaction. You can see this as the
application level in haproxy, which deals with request/response
forwarding between the two sides and which executes in turn all the
analysers and filters corresponding to the rules in your config.
You can have a look at doc/internals/muxes.png for a quick overview
(the stream is the entity at the top which carries the request buffer).

Willy


Re: haproxy 2.4 opentracing using x-b3-traceid header

2021-04-09 Thread Miroslav Zagorac

On 04/08/2021 01:48 PM, Andrea Bonini wrote:

Hi everyone,

i have a question about opentracing addon. Is there a way for using headers
info about a trace created from an outside service?

what i want is using x-b3-traceid header from an incoming request as
traceid and use spanid as parentid

Thanks



Hello Andrea,

you are interested in propagating the span context from one service to 
another.  This can be done by setting up additional http headers in the 
service that created the tracing process that contain the span context 
which is then read in another process that wants to add its data.  This 
can be done via the ‘inject’ and ‘extract’ keywords in the opentracing 
filter configuration.  In this way they can automatically set up and 
read the contents of the headers over which the span context is transmitted.


In the test directory there is one example that uses such a mode of 
operation, where two haproxy processes are started, the first of which 
passes the span context to the second.  The example is run via the shell 
script test/run-fe-be.sh and thus starts the frontend haproxy process 
(fe) which passes the span context to the backend haproxy process (be). 
 Configurations are located in the test/fe and test/be directories.


For more explanations read the README file from the ot directory.

Best regards,

--
Zaga

What can change the nature of a man?