Re: [PATCH] MINOR: sample: add json_string
On Thu, Apr 15, 2021 at 06:47:29PM +0200, Tim Düsterhus wrote:
> You know that I'm obsessed with the correct use of data types, so please
> find three CLEANUP patches attached. Feel free to drop the ones you don't
> agree with :-)

I don't disagree with them, but am just having a few comments on style in
turn :-)

> static int sample_conv_json_query(const struct arg *args, struct sample *smp, void *private)
> {
>     struct buffer *trash = get_trash_chunk();
> -   const char *p; /* holds the temporary string from mjson_find */
> -   int tok, n; /* holds the token enum and the length of the value */
> -   int rc; /* holds the return code from mjson_get_string */
> +   const char *token; /* holds the temporary string from mjson_find */
> +   int token_size; /* holds the length of */
>
> -   tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, &p, &n);
> +   enum mjson_tok token_type = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, &token, &token_size);

As much as possible, please try to avoid assignments in declarations,
especially when using functions. There are several reasons that make this
annoying:
  - sometimes just changing one arg requires to reorder the declarations
  - debugging with them is painful
  - trying to comment some of them out is painful as well
  - it discourages from adding finer control (e.g. if you need to do this
    or that using an if/else block, you also have to significantly modify
    them)
  - quite a few times I've seen code like this being deleted because when
    the compiler says "unused variable foo", it makes one think it's now
    OK to delete it, except that the function possibly had a desired side
    effect

=> thus better declare the enum with the variables

Same goes for one or two other ones in the switch/case statements.

I'm otherwise OK. Thanks!
Willy
Re: [PATCH] MINOR: sample: add json_string
On Thu, Apr 15, 2021 at 05:44:44PM +0200, Aleksandar Lazic wrote:
> Now the Statement what I wanted to say ;-)
>
> HAProxy have now at least 4 possibilities to route traffic and
> catch some data.
>
> HTTP fields
> GRPC fields
> FIX fields
> JSON fields
>
> Have I missed something?

You can extract various data from TCP payloads or even HTTP bodies, so
it's just a matter of how simple you want things to be done. For example
we've long supported RDP cookies, and I even deployed a somewhat limited
but effective Corba analyser a long time ago :-)

Willy
Re: changed IP messages overrunning /var/log ?
More info on the over-quick+newly-noisy resolves that were triggering this ...

We've been running 1.8.19 (https://packages.debian.org/stretch-backports/haproxy)
with 'hold valid 60s' configured, which was acting-ish like 'timeout resolve 60s'
(which was *not* configured). So when we moved to current 2.0, with the fix for
/issues/345, resolutions which had been happening every 60s now happened every 1s
(the default?), with each IP change now making noise it had not made before
=> ergo, logs/disk filled.

Adding 'timeout resolve 60s' reduces the noise by a factor of 60.

The duplication of logging the new(?) 'changed its IP' messages to daemon.log
(when only local* facilities are configured) is still getting root-cause
analysis.

===
https://github.com/haproxy/haproxy/issues/345
https://github.com/haproxy/haproxy/commit/f50e1ac4442be41ed8b9b7372310d1d068b85b33
http://www.haproxy.org/download/1.8/src/CHANGELOG
  * 2019/11/25 : 1.8.23
  * BUG: dns: timeout resolve not applied for valid resolutions

On Thu, Apr 15, 2021 at 1:43 AM Jim Freeman wrote:
>
> Migrating from stock stretch-backports+1.8 to Debian_10/Buster+2.0 (to
> purge 'reqrep' en route to 2.2), /var/log/ is suddenly filling disk
> with messages about changed IPs :
> ===
> 2021-04-14T01:08:40.123303+00:00 ip-10-36-217-169 haproxy[569]:
> my_web_service changed its IP from 52.208.198.117 to 34.251.174.55 by
> DNS cache.
> ===
> daemon.log and syslog (plus the usual haproxy.log) all get hammered.
>
> Many of the backends (200+) are of the flavor :
> server-template my_web_service 8 my_web_service.herokuapp.com:443 ...
> resolvers fs_resolvers resolve-prefer ipv4
>
> The herokuapp.com addresses change constantly, but this has not been a
> problem heretofore.
>
> This is puzzling, since haproxy.cfg directs all logs to local*
> After some investigation, it turns out that the daemon.log and syslog
> entries arrive via facility.level=daemon.info. I've made rsyslog cfg
> changes that now stop the haproxy msgs from overrunning daemon.log and
> syslog (and allow only a representative fraction to hit haproxy.log).
>
> Two questions :
> 1) What is different about 2.0 that "changed its IP" entries are so
> voluminous ?
> 2) Why is daemon.info involved in the logging, when the haproxy.cfg
> settings only designate local* facilities ?
>
> Thanks for any insights (and for stupendous software) !
>
> Running HAProxy 2.0 from :
> https://haproxy.debian.net/#?distribution=Debian&release=buster&version=2.0
>
> on stock Buster AWS AMI :
> https://wiki.debian.org/Cloud/AmazonEC2Image
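For reference, the knobs discussed in this thread live in the resolvers section. A minimal sketch, assuming a local nameserver address (the section name is taken from the quoted config; the nameserver address is illustrative):

```
resolvers fs_resolvers
    nameserver ns1 10.0.0.2:53   # assumed address, adjust to your resolver
    hold valid 60s               # how long a valid answer may be reused
    timeout resolve 60s          # interval between resolutions; without it,
                                 # 2.0 re-resolves every 1s and each change
                                 # of a backend IP gets logged
```

Servers using `server-template ... resolvers fs_resolvers` then re-resolve at most once per 60s instead of every second.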
Re: changed IP messages overrunning /var/log ?
Yes - as a systemd service. But the puzzlement remains that the same
complaints get logged via *both* daemon.info and local*.info, when local*
is all we configure in haproxy.cfg.

/var/log gets doubly over-taxed because the daemon.info entries wind up in
both syslog and daemon.log (per stock /etc/rsyslog.conf). We now avoid that
via an entry under rsyslog.d/ :

daemon.info { if $programname startswith 'haproxy' then { stop } }

, but I'm still curious about why the entries get sent to daemon.info
(systemd's stdout/stderr ?), when local* is explicitly configured to
receive ...

On Thu, Apr 15, 2021 at 2:02 AM Jarno Huuskonen wrote:
>
> Hello,
>
> On Thu, 2021-04-15 at 01:43 -0600, Jim Freeman wrote:
> > This is puzzling, since haproxy.cfg directs all logs to local*
> > After some investigation, it turns out that the daemon.log and syslog
> > entries arrive via facility.level=daemon.info. I've made rsyslog cfg
> > changes that now stop the haproxy msgs from overrunning daemon.log and
> > syslog (and allow only a representative fraction to hit haproxy.log).
> >
> > Two questions :
> > 1) What is different about 2.0 that "changed its IP" entries are so
> > voluminous ?
> > 2) Why is daemon.info involved in the logging, when the haproxy.cfg
> > settings only designate local* facilities ?
>
> Are you running haproxy as a systemd service ? Those logs could be
> coming from systemd (haproxy stdout/stderr).
>
> -Jarno
>
> --
> Jarno Huuskonen
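If Jarno's theory holds — the duplicates come from haproxy's stderr being captured by journald rather than from the `log` directives in haproxy.cfg — the facility can be steered at the unit level instead of filtering in rsyslog. A hedged sketch, assuming a drop-in override (journald tags captured stdout/stderr with the unit's `SyslogFacility=`, which defaults to daemon; the drop-in path is hypothetical):

```
# /etc/systemd/system/haproxy.service.d/logging.conf  (assumed path)
[Service]
SyslogFacility=local0    # tag captured stderr with local0 instead of daemon
# or suppress the duplicate stream entirely:
# StandardError=null
```

This would keep a single copy of the messages flowing through the local* facility that haproxy.cfg already configures.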
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On Thu, Apr 15, 2021 at 07:53:15PM +, Robin H. Johnson wrote:
> But your thought of CPU pinning was good.
> I went to confirm it in the host, and I'm not certain if the cpu-map is
> working right.
Ignore me, long day and I didn't think to check each thread PID:

# ps -e -T | grep haproxy -w | awk '{print $2}' | xargs -n1 taskset -pc
pid 120689's current affinity list: 0-127
pid 120702's current affinity list: 0
pid 120706's current affinity list: 1
..
pid 120776's current affinity list: 63

--
Robin Hugh Johnson
E-Mail    : robb...@orbis-terrarum.net
Home Page : http://www.orbis-terrarum.net/?l=people.robbat2
GnuPG FP  : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On Thu, Apr 15, 2021 at 09:23:07AM +0200, Willy Tarreau wrote:
> On Thu, Apr 15, 2021 at 07:13:53AM +, Robin H. Johnson wrote:
> > Thanks; I will need to catch it faster or automate this, because the
> > watchdog does a MUCH better job restarting it than before, less than 30
> > seconds of 100% CPU before the watchdog reliably kills it.
> I see. Then collecting the watchdog outputs can be instructive to see
> if it always happens at the same place or not. And the core dumps will
> indicate what all threads were doing (and if some were competing on a
> lock for example).

The truncation in the log output for the crash was interesting, I couldn't
see why it was being cut off. I wish we could get a clean reproduction in
our testing environment, because the production core dumps absolutely have
customer data in them.

> > Varnish runs on the same host and is used to cache some of the backends.
> > Plenty of free memory at the moment.
> I'm now thinking about something. Do you have at least as many CPUs as the
> total number of threads used by haproxy and varnish ? Otherwise there will
> be some competition and migrations will happen. If neither is bounded, you
> can even end up with two haproxy threads forced to run on the same CPU,
> which is the worst situation as one could be scheduled out with a lock
> held and the other one spinning waiting for this lock.

Single socket, AMD EPYC 7702P 64-Core Processor, 128 threads! Shows as a
single NUMA node in our present configuration. Hopefully the kernel is
mostly doing the right thing, but read on.

HAProxy is already pinned to the first 64 threads with:
cpu-map 1/1 0
...
cpu-map 1/64 63

Varnish isn't explicitly pinned right now, but uses less than 200% CPU
overall (we know most requests aren't cachable so they don't get routed to
Varnish at all).

But your thought of CPU pinning was good. I went to confirm it in the
host, and I'm not certain if the cpu-map is working right.

# pid_haproxy_leader=68839 ; pid_haproxy_follower=68848
# taskset -pc $pid_haproxy_leader
pid 68839's current affinity list: 0-127
# taskset -pc $pid_haproxy_follower
pid 68848's current affinity list: 0

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
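As a side note on the 64 individual cpu-map lines quoted above: recent 2.x releases accept a range form with the `auto:` prefix, which should be equivalent under the stated thread count (a sketch, not the poster's actual config):

```
global
    nbthread 64
    # threads 1..64 pinned one-to-one onto CPUs 0..63,
    # same effect as the 64 "cpu-map 1/N N-1" lines
    cpu-map auto:1/1-64 0-63
```

The per-thread affinities can then be checked with the `ps -e -T ... taskset -pc` pipeline shown earlier in the thread.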
[PATCH v2 6/8] MINOR: uri_normalizer: Add support for suppressing leading `../` for dotdot normalizer
Christopher, the general logic of the normalizer is unchanged, but the whole framework was refactored. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This adds an option to supress `../` at the start of the resulting path. --- doc/configuration.txt | 10 +- include/haproxy/action-t.h | 1 + include/haproxy/uri_normalizer.h | 2 +- reg-tests/http-rules/normalize_uri.vtc | 16 src/http_act.c | 17 ++--- src/uri_normalizer.c | 25 - 6 files changed, 57 insertions(+), 14 deletions(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index a6110eb19..a073da5fe 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -6012,7 +6012,7 @@ http-request early-hint [ { if | unless } ] See RFC 8297 for more information. http-request normalize-uri [ { if | unless } ] -http-request normalize-uri dotdot [ { if | unless } ] +http-request normalize-uri dotdot [ full ] [ { if | unless } ] http-request normalize-uri merge-slashes [ { if | unless } ] Performs normalization of the request's URI. The following normalizers are @@ -6028,8 +6028,16 @@ http-request normalize-uri merge-slashes [ { if | unless } ] - /foo/../bar/ -> /bar/ - /foo/bar/../ -> /foo/ - /../bar/ -> /../bar/ + - /bar/../../ -> /../ - /foo//../-> /foo/ + If the "full" option is specified then "../" at the beginning will be + removed as well: + + Example: + - /../bar/ -> /bar/ + - /bar/../../ -> / + - merge-slashes: Merges adjacent slashes within the "path" component into a single slash. 
diff --git a/include/haproxy/action-t.h b/include/haproxy/action-t.h index ac9399a6b..5a8155929 100644 --- a/include/haproxy/action-t.h +++ b/include/haproxy/action-t.h @@ -104,6 +104,7 @@ enum act_timeout_name { enum act_normalize_uri { ACT_NORMALIZE_URI_MERGE_SLASHES, ACT_NORMALIZE_URI_DOTDOT, + ACT_NORMALIZE_URI_DOTDOT_FULL, }; /* NOTE: if <.action_ptr> is defined, the referenced function will always be diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h index 9dbbe5826..811a7ebb6 100644 --- a/include/haproxy/uri_normalizer.h +++ b/include/haproxy/uri_normalizer.h @@ -18,7 +18,7 @@ #include -enum uri_normalizer_err uri_normalizer_path_dotdot(const struct ist path, struct ist *dst); +enum uri_normalizer_err uri_normalizer_path_dotdot(const struct ist path, int full, struct ist *dst); enum uri_normalizer_err uri_normalizer_path_merge_slashes(const struct ist path, struct ist *dst); #endif /* _HAPROXY_URI_NORMALIZER_H */ diff --git a/reg-tests/http-rules/normalize_uri.vtc b/reg-tests/http-rules/normalize_uri.vtc index e66bdc47b..5ee73a308 100644 --- a/reg-tests/http-rules/normalize_uri.vtc +++ b/reg-tests/http-rules/normalize_uri.vtc @@ -36,8 +36,13 @@ haproxy h1 -conf { http-request normalize-uri dotdot http-request set-var(txn.after) url +http-request set-uri %[var(txn.before)] +http-request normalize-uri dotdot full +http-request set-var(txn.after_full) url + http-response add-header before %[var(txn.before)] http-response add-header after %[var(txn.after)] +http-response add-header after-full %[var(txn.after_full)] default_backend be @@ -103,54 +108,65 @@ client c2 -connect ${h1_fe_dotdot_sock} { rxresp expect resp.http.before == "/foo/bar" expect resp.http.after == "/foo/bar" +expect resp.http.after-full == "/foo/bar" txreq -url "/foo/.." rxresp expect resp.http.before == "/foo/.." 
expect resp.http.after == "/" +expect resp.http.after-full == "/" txreq -url "/foo/../" rxresp expect resp.http.before == "/foo/../" expect resp.http.after == "/" +expect resp.http.after-full == "/" txreq -url "/foo/bar/../" rxresp expect resp.http.before == "/foo/bar/../" expect resp.http.after == "/foo/" +expect resp.http.after-full == "/foo/" txreq -url "/foo/../bar" rxresp expect resp.http.before == "/foo/../bar" expect resp.http.after == "/bar" +expect resp.http.after-full == "/bar" txreq -url "/foo/../bar/" rxresp expect resp.http.before == "/foo/../bar/" expect resp.http.after == "/bar/" +expect resp.http.after-full == "/bar/" txreq -url "/foo/../../bar/" rxresp expect resp.http.before == "/foo/../../bar/" expect resp.http.after == "/../bar/" +expect resp.http.after-full == "/bar/" txreq -url "/foo//../../bar/" rxresp expect resp.http.before == "/foo//../../bar/" expect resp.http.after == "/bar/" +expect resp.http.after-full == "/bar/" txreq -url "/foo/?bar=/foo/../" rxresp expect resp.http.bef
[PATCH v2 0/8] URI normalization / Issue #714
Christopher, again: Thank you for the very helpful review. In this series
you can find v2 of my URI normalization patches. I hope I did not forget
to incorporate any of your feedback.

Some general notes:

- I completely cleaned up the commit history to group similar patches
  (e.g. the two patches for dotdot) and to avoid later commits completely
  refactoring earlier commits (e.g. the error handling).
- As suggested I am now returning the error code and taking a
  `struct ist *dst`.
- The values of `enum uri_normalizer_err` are cleaned up.
- I cleaned up the error handling in `http_action_normalize_uri`.
- I am now using `istdiff()`.
- Dynamic allocation is no more.
- I fixed some parts of the code style (`struct ist* foo` ->
  `struct ist *foo`).
- I const'ified as much as possible.

Tim Düsterhus (8):
  MINOR: uri_normalizer: Add uri_normalizer module
  MINOR: uri_normalizer: Add `enum uri_normalizer_err`
  MINOR: uri_normalizer: Add `http-request normalize-uri`
  MINOR: uri_normalizer: Add a `merge-slashes` normalizer to http-request normalize-uri
  MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request normalize-uri
  MINOR: uri_normalizer: Add support for suppressing leading `../` for dotdot normalizer
  MINOR: uri_normalizer: Add a `sort-query` normalizer
  MINOR: uri_normalizer: Add a `percent-upper` normalizer

 Makefile                               |   2 +-
 doc/configuration.txt                  |  58 +
 include/haproxy/action-t.h             |   9 +
 include/haproxy/uri_normalizer-t.h     |  31 +++
 include/haproxy/uri_normalizer.h       |  33 +++
 reg-tests/http-rules/normalize_uri.vtc | 314 +
 src/http_act.c                         | 201 
 src/uri_normalizer.c                   | 296 +++
 8 files changed, 943 insertions(+), 1 deletion(-)
 create mode 100644 include/haproxy/uri_normalizer-t.h
 create mode 100644 include/haproxy/uri_normalizer.h
 create mode 100644 reg-tests/http-rules/normalize_uri.vtc
 create mode 100644 src/uri_normalizer.c
--
2.31.1
[PATCH v2 7/8] MINOR: uri_normalizer: Add a `sort-query` normalizer
Christopher, here I am using `istdiff` and a trash buffer instead of dynamic allocation. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This normalizer sorts the `&` delimited query parameters by parameter name. See GitHub Issue #714. --- doc/configuration.txt | 10 +++ include/haproxy/action-t.h | 1 + include/haproxy/uri_normalizer.h | 1 + reg-tests/http-rules/normalize_uri.vtc | 81 ++- src/http_act.c | 22 +++ src/uri_normalizer.c | 89 ++ 6 files changed, 203 insertions(+), 1 deletion(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index a073da5fe..862427e51 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -6014,6 +6014,7 @@ http-request early-hint [ { if | unless } ] http-request normalize-uri [ { if | unless } ] http-request normalize-uri dotdot [ full ] [ { if | unless } ] http-request normalize-uri merge-slashes [ { if | unless } ] +http-request normalize-uri sort-query [ { if | unless } ] Performs normalization of the request's URI. The following normalizers are available: @@ -6045,6 +6046,15 @@ http-request normalize-uri merge-slashes [ { if | unless } ] - //-> / - /foo//bar -> /foo/bar + - sort-query: Sorts the query string parameters by parameter name. + Parameters are assumed to be delimited by '&'. Shorter names sort before + longer names and identical parameter names maintain their relative order. + + Example: + - /?c=3&a=1&b=2 -> /?a=1&b=2&c=3 + - /?aaa=3&a=1&aa=2 -> /?a=1&aa=2&aaa=3 + - /?a=3&b=4&a=1&b=5&a=2 -> /?a=3&a=1&a=2&b=4&b=5 + http-request redirect [ { if | unless } ] This performs an HTTP redirection based on a redirect rule. 
This is exactly diff --git a/include/haproxy/action-t.h b/include/haproxy/action-t.h index 5a8155929..ae43a936d 100644 --- a/include/haproxy/action-t.h +++ b/include/haproxy/action-t.h @@ -105,6 +105,7 @@ enum act_normalize_uri { ACT_NORMALIZE_URI_MERGE_SLASHES, ACT_NORMALIZE_URI_DOTDOT, ACT_NORMALIZE_URI_DOTDOT_FULL, + ACT_NORMALIZE_URI_SORT_QUERY, }; /* NOTE: if <.action_ptr> is defined, the referenced function will always be diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h index 811a7ebb6..c16dd3ffa 100644 --- a/include/haproxy/uri_normalizer.h +++ b/include/haproxy/uri_normalizer.h @@ -20,6 +20,7 @@ enum uri_normalizer_err uri_normalizer_path_dotdot(const struct ist path, int full, struct ist *dst); enum uri_normalizer_err uri_normalizer_path_merge_slashes(const struct ist path, struct ist *dst); +enum uri_normalizer_err uri_normalizer_query_sort(const struct ist query, const char delim, struct ist *dst); #endif /* _HAPROXY_URI_NORMALIZER_H */ diff --git a/reg-tests/http-rules/normalize_uri.vtc b/reg-tests/http-rules/normalize_uri.vtc index 5ee73a308..cb3fa2f63 100644 --- a/reg-tests/http-rules/normalize_uri.vtc +++ b/reg-tests/http-rules/normalize_uri.vtc @@ -8,7 +8,7 @@ feature ignore_unknown_macro server s1 { rxreq txresp -} -repeat 21 -start +} -repeat 34 -start haproxy h1 -conf { defaults @@ -46,6 +46,18 @@ haproxy h1 -conf { default_backend be +frontend fe_sort_query +bind "fd@${fe_sort_query}" + +http-request set-var(txn.before) url +http-request normalize-uri sort-query +http-request set-var(txn.after) url + +http-response add-header before %[var(txn.before)] +http-response add-header after %[var(txn.after)] + +default_backend be + backend be server s1 ${s1_addr}:${s1_port} @@ -170,3 +182,70 @@ client c2 -connect ${h1_fe_dotdot_sock} { expect resp.http.after == "*" expect resp.http.after-full == "*" } -run + +client c3 -connect ${h1_fe_sort_query_sock} { +txreq -url "/?a=a" +rxresp +expect resp.http.before == "/?a=a" 
+expect resp.http.after == "/?a=a" + +txreq -url "/?a=a&z=z" +rxresp +expect resp.http.before == "/?a=a&z=z" +expect resp.http.after == "/?a=a&z=z" + +txreq -url "/?z=z&a=a" +rxresp +expect resp.http.before == "/?z=z&a=a" +expect resp.http.after == "/?a=a&z=z" + +txreq -url "/?a=z&z=a" +rxresp +expect resp.http.before == "/?a=z&z=a" +expect resp.http.after == "/?a=z&z=a" + +txreq -url "/?z=a&a=z" +rxresp +expect resp.http.before == "/?z=a&a=z" +expect resp.http.after == "/?a=z&z=a" + +txreq -url "/?c&b&a&z&x&y" +rxresp +expect resp.http.before == "/?c&b&a&z&x&y" +expect resp.http.after == "/?a&b&c&x&y&z" + +txreq -url "/?a=&aa=&aaa=&=" +rxresp +expect resp.http.before == "/?a=&aa=&aaa=&=" +expect resp.http.after == "/?a=&aa=&aaa=&=" + +txreq -url "/?=&a=&aa=&aaa=" +rxresp +expect r
[PATCH v2 8/8] MINOR: uri_normalizer: Add a `percent-upper` normalizer
Christopher, the general logic of the normalizer is unchanged, but the whole framework was refactored. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This normalizer uppercases the hexadecimal characters used in percent-encoding. See GitHub Issue #714. --- doc/configuration.txt | 14 ++ include/haproxy/action-t.h | 2 + include/haproxy/uri_normalizer.h | 1 + reg-tests/http-rules/normalize_uri.vtc | 65 +- src/http_act.c | 33 + src/uri_normalizer.c | 58 +++ 6 files changed, 172 insertions(+), 1 deletion(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index 862427e51..1e2f72b61 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -6014,6 +6014,7 @@ http-request early-hint [ { if | unless } ] http-request normalize-uri [ { if | unless } ] http-request normalize-uri dotdot [ full ] [ { if | unless } ] http-request normalize-uri merge-slashes [ { if | unless } ] +http-request normalize-uri percent-upper [ strict ] [ { if | unless } ] http-request normalize-uri sort-query [ { if | unless } ] Performs normalization of the request's URI. The following normalizers are @@ -6046,6 +6047,19 @@ http-request normalize-uri sort-query [ { if | unless } ] - //-> / - /foo//bar -> /foo/bar + - percent-upper: Uppercases letters within percent-encoded sequences + (RFC 3986#6.2.21). + + Example: + - /%6f -> /%6F + - /%zz -> /%zz + + If the "strict" option is specified then invalid sequences will result + in a HTTP 400 Bad Request being returned. + + Example: + - /%zz -> HTTP 400 + - sort-query: Sorts the query string parameters by parameter name. Parameters are assumed to be delimited by '&'. Shorter names sort before longer names and identical parameter names maintain their relative order. 
diff --git a/include/haproxy/action-t.h b/include/haproxy/action-t.h index ae43a936d..cce2a2e23 100644 --- a/include/haproxy/action-t.h +++ b/include/haproxy/action-t.h @@ -106,6 +106,8 @@ enum act_normalize_uri { ACT_NORMALIZE_URI_DOTDOT, ACT_NORMALIZE_URI_DOTDOT_FULL, ACT_NORMALIZE_URI_SORT_QUERY, + ACT_NORMALIZE_URI_PERCENT_UPPER, + ACT_NORMALIZE_URI_PERCENT_UPPER_STRICT, }; /* NOTE: if <.action_ptr> is defined, the referenced function will always be diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h index c16dd3ffa..180936eae 100644 --- a/include/haproxy/uri_normalizer.h +++ b/include/haproxy/uri_normalizer.h @@ -18,6 +18,7 @@ #include +enum uri_normalizer_err uri_normalizer_percent_upper(const struct ist input, int strict, struct ist *dst); enum uri_normalizer_err uri_normalizer_path_dotdot(const struct ist path, int full, struct ist *dst); enum uri_normalizer_err uri_normalizer_path_merge_slashes(const struct ist path, struct ist *dst); enum uri_normalizer_err uri_normalizer_query_sort(const struct ist query, const char delim, struct ist *dst); diff --git a/reg-tests/http-rules/normalize_uri.vtc b/reg-tests/http-rules/normalize_uri.vtc index cb3fa2f63..e900677e9 100644 --- a/reg-tests/http-rules/normalize_uri.vtc +++ b/reg-tests/http-rules/normalize_uri.vtc @@ -8,7 +8,7 @@ feature ignore_unknown_macro server s1 { rxreq txresp -} -repeat 34 -start +} -repeat 43 -start haproxy h1 -conf { defaults @@ -58,6 +58,30 @@ haproxy h1 -conf { default_backend be +frontend fe_percent_upper +bind "fd@${fe_percent_upper}" + +http-request set-var(txn.before) url +http-request normalize-uri percent-upper +http-request set-var(txn.after) url + +http-response add-header before %[var(txn.before)] +http-response add-header after %[var(txn.after)] + +default_backend be + +frontend fe_percent_upper_strict +bind "fd@${fe_percent_upper_strict}" + +http-request set-var(txn.before) url +http-request normalize-uri percent-upper strict +http-request 
set-var(txn.after) url + +http-response add-header before %[var(txn.before)] +http-response add-header after %[var(txn.after)] + +default_backend be + backend be server s1 ${s1_addr}:${s1_port} @@ -249,3 +273,42 @@ client c3 -connect ${h1_fe_sort_query_sock} { expect resp.http.before == "*" expect resp.http.after == "*" } -run + +client c4 -connect ${h1_fe_percent_upper_sock} { +txreq -url "/a?a=a" +rxresp +expect resp.http.before == "/a?a=a" +expect resp.http.after == "/a?a=a" + +txreq -url "/%aa?a=%aa" +rxresp +expect resp.http.before == "/%aa?a=%aa" +expect resp.http.after == "/%AA?a=%AA" + +txreq -url "/%zz?a=%zz" +rxresp +expect resp.status == 200 +expect resp.http.befor
[PATCH v2 5/8] MINOR: uri_normalizer: Add a `dotdot` normalizer to http-request normalize-uri
Christopher, the general logic of the normalizer is unchanged, but the whole framework was refactored. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This normalizer merges `../` path segments with the predecing segment, removing both the preceding segment and the `../`. Empty segments do not receive special treatment. The `merge-slashes` normalizer should be executed first. See GitHub Issue #714. --- doc/configuration.txt | 13 include/haproxy/action-t.h | 1 + include/haproxy/uri_normalizer.h | 1 + reg-tests/http-rules/normalize_uri.vtc | 71 +- src/http_act.c | 22 +++ src/uri_normalizer.c | 82 ++ 6 files changed, 189 insertions(+), 1 deletion(-) diff --git a/doc/configuration.txt b/doc/configuration.txt index 61cb0b5ad..a6110eb19 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -6012,11 +6012,24 @@ http-request early-hint [ { if | unless } ] See RFC 8297 for more information. http-request normalize-uri [ { if | unless } ] +http-request normalize-uri dotdot [ { if | unless } ] http-request normalize-uri merge-slashes [ { if | unless } ] Performs normalization of the request's URI. The following normalizers are available: + - dotdot: Normalizes "/../" segments within the "path" component. This merges + segments that attempt to access the parent directory with their preceding + segment. Empty segments do not receive special treatment. Use the + "merge-slashes" normalizer first if this is undesired. + + Example: + - /foo/../ -> / + - /foo/../bar/ -> /bar/ + - /foo/bar/../ -> /foo/ + - /../bar/ -> /../bar/ + - /foo//../-> /foo/ + - merge-slashes: Merges adjacent slashes within the "path" component into a single slash. 
diff --git a/include/haproxy/action-t.h b/include/haproxy/action-t.h index 4a3e3f8bd..ac9399a6b 100644 --- a/include/haproxy/action-t.h +++ b/include/haproxy/action-t.h @@ -103,6 +103,7 @@ enum act_timeout_name { enum act_normalize_uri { ACT_NORMALIZE_URI_MERGE_SLASHES, + ACT_NORMALIZE_URI_DOTDOT, }; /* NOTE: if <.action_ptr> is defined, the referenced function will always be diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h index 416f5b7c5..9dbbe5826 100644 --- a/include/haproxy/uri_normalizer.h +++ b/include/haproxy/uri_normalizer.h @@ -18,6 +18,7 @@ #include +enum uri_normalizer_err uri_normalizer_path_dotdot(const struct ist path, struct ist *dst); enum uri_normalizer_err uri_normalizer_path_merge_slashes(const struct ist path, struct ist *dst); #endif /* _HAPROXY_URI_NORMALIZER_H */ diff --git a/reg-tests/http-rules/normalize_uri.vtc b/reg-tests/http-rules/normalize_uri.vtc index 3303760d4..e66bdc47b 100644 --- a/reg-tests/http-rules/normalize_uri.vtc +++ b/reg-tests/http-rules/normalize_uri.vtc @@ -8,7 +8,7 @@ feature ignore_unknown_macro server s1 { rxreq txresp -} -repeat 10 -start +} -repeat 21 -start haproxy h1 -conf { defaults @@ -29,6 +29,18 @@ haproxy h1 -conf { default_backend be +frontend fe_dotdot +bind "fd@${fe_dotdot}" + +http-request set-var(txn.before) url +http-request normalize-uri dotdot +http-request set-var(txn.after) url + +http-response add-header before %[var(txn.before)] +http-response add-header after %[var(txn.after)] + +default_backend be + backend be server s1 ${s1_addr}:${s1_port} @@ -85,3 +97,60 @@ client c1 -connect ${h1_fe_merge_slashes_sock} { expect resp.http.before == "*" expect resp.http.after == "*" } -run + +client c2 -connect ${h1_fe_dotdot_sock} { +txreq -url "/foo/bar" +rxresp +expect resp.http.before == "/foo/bar" +expect resp.http.after == "/foo/bar" + +txreq -url "/foo/.." +rxresp +expect resp.http.before == "/foo/.." 
+expect resp.http.after == "/" + +txreq -url "/foo/../" +rxresp +expect resp.http.before == "/foo/../" +expect resp.http.after == "/" + +txreq -url "/foo/bar/../" +rxresp +expect resp.http.before == "/foo/bar/../" +expect resp.http.after == "/foo/" + +txreq -url "/foo/../bar" +rxresp +expect resp.http.before == "/foo/../bar" +expect resp.http.after == "/bar" + +txreq -url "/foo/../bar/" +rxresp +expect resp.http.before == "/foo/../bar/" +expect resp.http.after == "/bar/" + +txreq -url "/foo/../../bar/" +rxresp +expect resp.http.before == "/foo/../../bar/" +expect resp.http.after == "/../bar/" + +txreq -url "/foo//../../bar/" +rxresp +expect resp.http.before == "/foo//../../bar/" +expect resp.http.after == "/bar/" + +txreq -url "/foo/?bar=/foo/../" +rxresp +expect resp.http.before == "/foo/?bar=/foo/../"
[PATCH v2 3/8] MINOR: uri_normalizer: Add `http-request normalize-uri`
Christopher, this one is completely new. It cleanly separates the patches for adding the action from the patches adding the normalizers. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This patch adds the `http-request normalize-uri` action that was requested in GitHub issue #714. Normalizers will be added in the next patches. --- include/haproxy/action-t.h | 4 ++ src/http_act.c | 96 ++ 2 files changed, 100 insertions(+) diff --git a/include/haproxy/action-t.h b/include/haproxy/action-t.h index 9009e4aae..2909b0da2 100644 --- a/include/haproxy/action-t.h +++ b/include/haproxy/action-t.h @@ -101,6 +101,10 @@ enum act_timeout_name { ACT_TIMEOUT_TUNNEL, }; +enum act_normalize_uri { + ACT_NORMALIZE_URI_PLACEHOLDER, +}; + /* NOTE: if <.action_ptr> is defined, the referenced function will always be * called regardless the action type. */ struct act_rule { diff --git a/src/http_act.c b/src/http_act.c index c699671a3..134c9037b 100644 --- a/src/http_act.c +++ b/src/http_act.c @@ -36,6 +36,7 @@ #include #include #include +#include #include @@ -194,6 +195,100 @@ static enum act_parse_ret parse_set_req_line(const char **args, int *orig_arg, s return ACT_RET_PRS_OK; } +/* This function executes the http-request normalize-uri action. + * `rule->action` is expected to be a value from `enum act_normalize_uri`. + * + * On success, it returns ACT_RET_CONT. If an error + * occurs while soft rewrites are enabled, the action is canceled, but the rule + * processing continue. Otherwsize ACT_RET_ERR is returned. 
+ */ +static enum act_return http_action_normalize_uri(struct act_rule *rule, struct proxy *px, + struct session *sess, struct stream *s, int flags) +{ + enum act_return ret = ACT_RET_CONT; + struct htx *htx = htxbuf(&s->req.buf); + const struct ist uri = htx_sl_req_uri(http_get_stline(htx)); + struct buffer *replace = alloc_trash_chunk(); + enum uri_normalizer_err err = URI_NORMALIZER_ERR_INTERNAL_ERROR; + + if (!replace) + goto fail_alloc; + + switch ((enum act_normalize_uri) rule->action) { + case ACT_NORMALIZE_URI_PLACEHOLDER: + (void) uri; + } + + switch (err) { + case URI_NORMALIZER_ERR_NONE: + break; + case URI_NORMALIZER_ERR_INTERNAL_ERROR: + ret = ACT_RET_ERR; + break; + case URI_NORMALIZER_ERR_INVALID_INPUT: + ret = ACT_RET_INV; + break; + case URI_NORMALIZER_ERR_ALLOC: + goto fail_alloc; + } + + leave: + free_trash_chunk(replace); + return ret; + + fail_alloc: + if (!(s->flags & SF_ERR_MASK)) + s->flags |= SF_ERR_RESOURCE; + ret = ACT_RET_ERR; + goto leave; + + fail_rewrite: + _HA_ATOMIC_ADD(&sess->fe->fe_counters.failed_rewrites, 1); + if (s->flags & SF_BE_ASSIGNED) + _HA_ATOMIC_ADD(&s->be->be_counters.failed_rewrites, 1); + if (sess->listener && sess->listener->counters) + _HA_ATOMIC_ADD(&sess->listener->counters->failed_rewrites, 1); + if (objt_server(s->target)) + _HA_ATOMIC_ADD(&__objt_server(s->target)->counters.failed_rewrites, 1); + + if (!(s->txn->req.flags & HTTP_MSGF_SOFT_RW)) { + ret = ACT_RET_ERR; + if (!(s->flags & SF_ERR_MASK)) + s->flags |= SF_ERR_PRXCOND; + } + goto leave; +} + +/* Parses the http-request normalize-uri action. It expects a single + * argument, corresponding too a value in `enum act_normalize_uri`. + * + * It returns ACT_RET_PRS_OK on success, ACT_RET_PRS_ERR on error. 
+ */ +static enum act_parse_ret parse_http_normalize_uri(const char **args, int *orig_arg, struct proxy *px, + struct act_rule *rule, char **err) +{ + int cur_arg = *orig_arg; + + rule->action_ptr = http_action_normalize_uri; + rule->release_ptr = NULL; + + if (!*args[cur_arg]) { + memprintf(err, "missing argument "); + return ACT_RET_PRS_ERR; + } + + if (0) { + + } + else { + memprintf(err, "unknown normalizer '%s'", args[cur_arg]); + return ACT_RET_PRS_ERR; + } + + *orig_arg = cur_arg; + return ACT_RET_PRS_OK; +} + /* This function executes a replace-uri action. It finds its arguments in * .arg.http. It builds a string in the trash from the format string * previously filled by function parse_replace_uri() and will execute the regex @@ -2194,6 +2289,7 @@ static struct action_kw_list http_req_actions = { { "deny", parse_http_deny, 0 }, { "disable-l7-retry", parse_http
[PATCH v2 4/8] MINOR: uri_normalizer: Add a `merge-slashes` normalizer to http-request normalize-uri
Christopher, the general logic of the normalizer is unchanged, but the whole framework was refactored. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This normalizer merges adjacent slashes into a single slash, thus removing empty path segments. See GitHub Issue #714. --- doc/configuration.txt | 13 include/haproxy/action-t.h | 2 +- include/haproxy/uri_normalizer.h | 4 ++ reg-tests/http-rules/normalize_uri.vtc | 87 ++ src/http_act.c | 23 ++- src/uri_normalizer.c | 39 6 files changed, 164 insertions(+), 4 deletions(-) create mode 100644 reg-tests/http-rules/normalize_uri.vtc diff --git a/doc/configuration.txt b/doc/configuration.txt index 61c2a6dd9..61cb0b5ad 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -6011,6 +6011,19 @@ http-request early-hint [ { if | unless } ] See RFC 8297 for more information. +http-request normalize-uri [ { if | unless } ] +http-request normalize-uri merge-slashes [ { if | unless } ] + + Performs normalization of the request's URI. The following normalizers are + available: + + - merge-slashes: Merges adjacent slashes within the "path" component into a + single slash. + + Example: + - //-> / + - /foo//bar -> /foo/bar + http-request redirect [ { if | unless } ] This performs an HTTP redirection based on a redirect rule. 
This is exactly diff --git a/include/haproxy/action-t.h b/include/haproxy/action-t.h index 2909b0da2..4a3e3f8bd 100644 --- a/include/haproxy/action-t.h +++ b/include/haproxy/action-t.h @@ -102,7 +102,7 @@ enum act_timeout_name { }; enum act_normalize_uri { - ACT_NORMALIZE_URI_PLACEHOLDER, + ACT_NORMALIZE_URI_MERGE_SLASHES, }; /* NOTE: if <.action_ptr> is defined, the referenced function will always be diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h index 20341a907..416f5b7c5 100644 --- a/include/haproxy/uri_normalizer.h +++ b/include/haproxy/uri_normalizer.h @@ -14,8 +14,12 @@ #ifndef _HAPROXY_URI_NORMALIZER_H #define _HAPROXY_URI_NORMALIZER_H +#include + #include +enum uri_normalizer_err uri_normalizer_path_merge_slashes(const struct ist path, struct ist *dst); + #endif /* _HAPROXY_URI_NORMALIZER_H */ /* diff --git a/reg-tests/http-rules/normalize_uri.vtc b/reg-tests/http-rules/normalize_uri.vtc new file mode 100644 index 0..3303760d4 --- /dev/null +++ b/reg-tests/http-rules/normalize_uri.vtc @@ -0,0 +1,87 @@ +varnishtest "normalize-uri tests" +#REQUIRE_VERSION=2.4 + +# This reg-test tests the http-request normalize-uri action. 
+ +feature ignore_unknown_macro + +server s1 { +rxreq +txresp +} -repeat 10 -start + +haproxy h1 -conf { +defaults +mode http +timeout connect 1s +timeout client 1s +timeout server 1s + +frontend fe_merge_slashes +bind "fd@${fe_merge_slashes}" + +http-request set-var(txn.before) url +http-request normalize-uri merge-slashes +http-request set-var(txn.after) url + +http-response add-header before %[var(txn.before)] +http-response add-header after %[var(txn.after)] + +default_backend be + +backend be +server s1 ${s1_addr}:${s1_port} + +} -start + +client c1 -connect ${h1_fe_merge_slashes_sock} { +txreq -url "/foo/bar" +rxresp +expect resp.http.before == "/foo/bar" +expect resp.http.after == "/foo/bar" + +txreq -url "/foo//bar" +rxresp +expect resp.http.before == "/foo//bar" +expect resp.http.after == "/foo/bar" + +txreq -url "/foo///bar" +rxresp +expect resp.http.before == "/foo///bar" +expect resp.http.after == "/foo/bar" + +txreq -url "///foo///bar" +rxresp +expect resp.http.before == "///foo///bar" +expect resp.http.after == "/foo/bar" + +txreq -url "///foo/bar" +rxresp +expect resp.http.before == "///foo/bar" +expect resp.http.after == "/foo/bar" + +txreq -url "///foo///bar///" +rxresp +expect resp.http.before == "///foo///bar///" +expect resp.http.after == "/foo/bar/" + +txreq -url "///" +rxresp +expect resp.http.before == "///" +expect resp.http.after == "/" + +txreq -url "/foo?bar=///" +rxresp +expect resp.http.before == "/foo?bar=///" +expect resp.http.after == "/foo?bar=///" + +txreq -url "//foo?bar=///" +rxresp +expect resp.http.before == "//foo?bar=///" +expect resp.http.after == "/foo?bar=///" + +txreq -req OPTIONS -url "*" +rxresp +expect resp.http.before == "*" +expect resp.http.after == "*" +} -run diff --git a/src/http_act.c b/src/http_act.c index 134c9037b..2af4d471a 100644 --- a/src/http_act.c +++ b/src/http_act.c @@ -215,8 +215,23 @@ static enum act_return http_action_normalize_uri(struct act_rule *rule, struc
[PATCH v2 2/8] MINOR: uri_normalizer: Add `enum uri_normalizer_err`
Christopher, in this one I cleaned up the values of the enum. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This enum will serve as the return type for each normalizer. --- include/haproxy/uri_normalizer-t.h | 31 ++ include/haproxy/uri_normalizer.h | 2 ++ 2 files changed, 33 insertions(+) create mode 100644 include/haproxy/uri_normalizer-t.h diff --git a/include/haproxy/uri_normalizer-t.h b/include/haproxy/uri_normalizer-t.h new file mode 100644 index 0..bcbcaef8e --- /dev/null +++ b/include/haproxy/uri_normalizer-t.h @@ -0,0 +1,31 @@ +/* + * include/haproxy/uri_normalizer.h + * HTTP request URI normalization. + * + * Copyright 2021 Tim Duesterhus + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + */ + +#ifndef _HAPROXY_URI_NORMALIZER_T_H +#define _HAPROXY_URI_NORMALIZER_T_H + +enum uri_normalizer_err { + URI_NORMALIZER_ERR_NONE = 0, + URI_NORMALIZER_ERR_ALLOC, + URI_NORMALIZER_ERR_INVALID_INPUT, + URI_NORMALIZER_ERR_INTERNAL_ERROR = 0xdead, +}; + +#endif /* _HAPROXY_URI_NORMALIZER_T_H */ + +/* + * Local variables: + * c-indent-level: 8 + * c-basic-offset: 8 + * End: + */ diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h index 82ef97324..20341a907 100644 --- a/include/haproxy/uri_normalizer.h +++ b/include/haproxy/uri_normalizer.h @@ -14,6 +14,8 @@ #ifndef _HAPROXY_URI_NORMALIZER_H #define _HAPROXY_URI_NORMALIZER_H +#include + #endif /* _HAPROXY_URI_NORMALIZER_H */ /* -- 2.31.1
[PATCH v2 1/8] MINOR: uri_normalizer: Add uri_normalizer module
Christopher, this one should be unchanged. I just fixed the conflict with Aleks' JSON patch. Best regards Tim Düsterhus Apply with `git am --scissors` to automatically cut the commit message. -- >8 -- This is in preparation for future patches. --- Makefile | 2 +- include/haproxy/uri_normalizer.h | 24 src/uri_normalizer.c | 21 + 3 files changed, 46 insertions(+), 1 deletion(-) create mode 100644 include/haproxy/uri_normalizer.h create mode 100644 src/uri_normalizer.c diff --git a/Makefile b/Makefile index 559248867..7b760ba51 100644 --- a/Makefile +++ b/Makefile @@ -884,7 +884,7 @@ OBJS += src/mux_h2.o src/mux_fcgi.o src/http_ana.o src/stream.o\ src/hpack-enc.o src/hpack-huff.o src/ebtree.o src/base64.o \ src/hash.o src/dgram.o src/version.o src/fix.o src/mqtt.o src/dns.o \ src/server_state.o src/proto_uxdg.o src/init.o src/cfgdiag.o \ -src/mjson.o +src/mjson.o src/uri_normalizer.o ifneq ($(TRACE),) OBJS += src/calltrace.o diff --git a/include/haproxy/uri_normalizer.h b/include/haproxy/uri_normalizer.h new file mode 100644 index 0..82ef97324 --- /dev/null +++ b/include/haproxy/uri_normalizer.h @@ -0,0 +1,24 @@ +/* + * include/haproxy/uri_normalizer.h + * HTTP request URI normalization. + * + * Copyright 2021 Tim Duesterhus + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + */ + +#ifndef _HAPROXY_URI_NORMALIZER_H +#define _HAPROXY_URI_NORMALIZER_H + +#endif /* _HAPROXY_URI_NORMALIZER_H */ + +/* + * Local variables: + * c-indent-level: 8 + * c-basic-offset: 8 + * End: + */ diff --git a/src/uri_normalizer.c b/src/uri_normalizer.c new file mode 100644 index 0..7db47d198 --- /dev/null +++ b/src/uri_normalizer.c @@ -0,0 +1,21 @@ +/* + * HTTP request URI normalization. 
+ * + * Copyright 2021 Tim Duesterhus + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + */ + +#include +#include + +/* + * Local variables: + * c-indent-level: 8 + * c-basic-offset: 8 + * End: + */ -- 2.31.1
Re: [PATCH] MINOR: sample: add json_string
Willy, On 4/15/21 5:09 PM, Willy Tarreau wrote: On Thu, Apr 15, 2021 at 04:49:00PM +0200, Aleksandar Lazic wrote: #define JSON_INT_MAX ((1ULL << 53) - 1) ^ Sorry I was not clear, please drop that 'U' here. I'm also sorry, I was in a tunnel :-/ Attached now the next patches. Thank you! Now applied. I just fixed this remaining double indent issue and that was all: You know that I'm obsessed with the correct use of data types, so please find three CLEANUP patches attached. Feel free to drop the ones you don't agree with :-) Aleks: Thanks for quickly addressing my feedback, even if you might think I was overly pedantic. The final version looks pretty good to me, my CLEANUP really is meant to be a final polishing! Best regards Tim Düsterhus >From 1af53a6e73e75f48793720017fe07d0f57e8c74a Mon Sep 17 00:00:00 2001 From: Tim Duesterhus Date: Thu, 15 Apr 2021 18:08:48 +0200 Subject: [PATCH 1/3] CLEANUP: sample: Improve local variables in sample_conv_json_query To: haproxy@formilux.org Cc: w...@1wt.eu This improves the use of local variables in sample_conv_json_query: - Use the enum type for the return value of `mjson_find`. - Do not use single letter variables. - Reduce the scope of variables that are only needed in a single branch. - Add missing newlines after variable declaration. 
--- src/sample.c | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/src/sample.c b/src/sample.c index 728c7c76d..a56d59245 100644 --- a/src/sample.c +++ b/src/sample.c @@ -3723,16 +3723,15 @@ static int sample_check_json_query(struct arg *arg, struct sample_conv *conv, static int sample_conv_json_query(const struct arg *args, struct sample *smp, void *private) { struct buffer *trash = get_trash_chunk(); - const char *p; /* holds the temporary string from mjson_find */ - int tok, n;/* holds the token enum and the length of the value */ - int rc;/* holds the return code from mjson_get_string */ + const char *token; /* holds the temporary string from mjson_find */ + int token_size;/* holds the length of */ - tok = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, &p, &n); + enum mjson_tok token_type = mjson_find(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, &token, &token_size); - switch(tok) { + switch (token_type) { case MJSON_TOK_NUMBER: if (args[1].type == ARGT_SINT) { -smp->data.u.sint = atoll(p); +smp->data.u.sint = atoll(token); if (smp->data.u.sint < JSON_INT_MIN || smp->data.u.sint > JSON_INT_MAX) return 0; @@ -3740,6 +3739,7 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo smp->data.type = SMP_T_SINT; } else { double double_val; + if (mjson_get_number(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, &double_val) == 0) { return 0; } else { @@ -3757,17 +3757,19 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo smp->data.type = SMP_T_BOOL; smp->data.u.sint = 0; break; - case MJSON_TOK_STRING: - rc = mjson_get_string(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size); - if (rc == -1) { + case MJSON_TOK_STRING: { + int len = mjson_get_string(smp->data.u.str.area, smp->data.u.str.data, args[0].data.str.area, trash->area, trash->size); + + if (len == -1) { /* invalid 
string */ return 0; } else { -trash->data = rc; +trash->data = len; smp->data.u.str = *trash; smp->data.type = SMP_T_STR; } break; + } default: /* no valid token found */ return 0; -- 2.31.1 >From 166ceb7c9a82f87f35580e3c379103d4d213fb18 Mon Sep 17 00:00:00 2001 From: Tim Duesterhus Date: Thu, 15 Apr 2021 18:14:32 +0200 Subject: [PATCH 2/3] CLEANUP: sample: Explicitly handle all possible enum values from mjson To: haproxy@formilux.org Cc: w...@1wt.eu This makes it easier to find bugs, because -Wswitch can help us. --- src/sample.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/src/sample.c b/src/sample.c index a56d59245..3cc571560 100644 --- a/src/sample.c +++ b/src/sample.c @@ -3770,8 +3770,19 @@ static int sample_conv_json_query(const struct arg *args, struct sample *smp, vo } break; } - default: - /* no valid token found */ + case MJSON_TOK_NULL: + case MJSON_TOK_ARRAY: + case MJSON_TOK_OBJECT: + /* We cannot handle these. */ + return 0; + case MJSON_TOK_INVALID: + /* Nothing matches the query. */ + return 0; + case MJSON_TOK_KEY: + /* This is not a valid return value according to the + * mjson documentation, but we handle it to benefit + * from '-Wswitch'. + */ return 0; } return 1; -- 2.31.1 >From 722b41673c0d31fa99583e75947451e47f506855 Mon Sep 17 00:00:00 2001 From: Tim Duesterhus Date: Thu, 15 Apr 2021 18:40:06 +0200 Subject: [PATCH 3/3] CLEANUP: sample
Re: [PATCH] MINOR: sample: add json_string
On 15.04.21 17:09, Willy Tarreau wrote: On Thu, Apr 15, 2021 at 04:49:00PM +0200, Aleksandar Lazic wrote: #define JSON_INT_MAX ((1ULL << 53) - 1) ^ Sorry I was not clear, please drop that 'U' here. I'm also sorry, I was in a tunnel :-/ Attached now the next patches. Thank you! Now applied. I just fixed this remaining double indent issue and that was all: + if (arg[1].data.str.data != 0) { + if (strcmp(arg[1].data.str.area, "int") != 0) { + memprintf(err, "output_type only supports \"int\" as argument"); + return 0; + } else { + arg[1].type = ARGT_SINT; + arg[1].data.sint = 0; + } + } Thanks Aleks! You see it wasn't that hard in the end :-) Cool ;-) :-) Now the statement I wanted to make ;-) HAProxy now has at least 4 possibilities to route traffic and catch some data: HTTP fields GRPC fields FIX fields JSON fields Have I missed something? I love this project and the community. Thanks Willy and Tim for your passion and precise reviews ;-) Willy Best regards Aleks
Re: [PATCH] CI: travis-ci: enable graviton2 builds
On Thu, Apr 15, 2021 at 07:19:25PM +0500, Ilya Shipitsin wrote: > Hello, > > as described in travis-ci docs: > https://blog.travis-ci.com/2020-09-11-arm-on-aws > > it is next generation armv8 available on AWS. Applied, thanks Ilya! Willy
Re: [PATCH] MINOR: sample: add json_string
On Thu, Apr 15, 2021 at 04:49:00PM +0200, Aleksandar Lazic wrote: > > > #define JSON_INT_MAX ((1ULL << 53) - 1) > >^ > > Sorry I was not clear, please drop that 'U' here. > > I'm also sorry, I was in a tunnel :-/ > > Attached now the next patches. Thank you! Now applied. I just fixed this remaining double indent issue and that was all: > + if (arg[1].data.str.data != 0) { > + if (strcmp(arg[1].data.str.area, "int") != 0) { > + memprintf(err, "output_type only supports > \"int\" as argument"); > + return 0; > + } else { > + arg[1].type = ARGT_SINT; > + arg[1].data.sint = 0; > + } > + } Thanks Aleks! You see it wasn't that hard in the end :-) Willy
Re: [PATCH] MINOR: sample: add json_string
On 15.04.21 16:09, Willy Tarreau wrote: On Thu, Apr 15, 2021 at 04:05:27PM +0200, Aleksandar Lazic wrote: Well I don't think so because 4 is still bigger than -9007199254740991 ;-) This is because *you* think it is -9007199254740991 but the reality is that it's not, due to ULL: #define JSON_INT_MAX ((1ULL << 53) - 1) #define JSON_INT_MIN (-JSON_INT_MAX) => it's -9007199254740991ULL hence 18437736874454810625 so 4 is definitely not larger than this. Nevertheless I have changed the defines and rerun the tests. Btw, this vtest is a great enhancement to haproxy ;-) Yes I totally agree. And you can't imagine how many times I'm angry at it when it detects an error after a tiny change I make, just to realize that I did really break something and that it was right :-) Like all tools it just needs to be reasonably used, not excessively trusted but used as a good hint that something unexpected changed, and it helps a lot! ``` #define JSON_INT_MAX ((1ULL << 53) - 1) ^ Sorry I was not clear, please drop that 'U' here. I'm also sorry, I was in a tunnel :-/ Attached now the next patches. Willy Regards Aleks >From 2f0673eb3e8a41e173221933021af2392d9a8ca4 Mon Sep 17 00:00:00 2001 From: Alex Date: Thu, 15 Apr 2021 16:45:15 +0200 Subject: [PATCH 2/2] MINOR: sample: converter: Add json_query converter With json_query, a JSON value can be extracted from a header or the body of the request and saved to a variable. This converter makes it possible to handle JSON workloads and route requests to different backends. 
--- doc/configuration.txt | 24 reg-tests/converter/json_query.vtc | 95 ++ src/sample.c | 88 +++ 3 files changed, 207 insertions(+) create mode 100644 reg-tests/converter/json_query.vtc diff --git a/doc/configuration.txt b/doc/configuration.txt index f242300e7..61c2a6dd9 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -15961,6 +15961,30 @@ json([]) Output log: {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"} +json_query(,[]) + The json_query converter supports the JSON types string, boolean and + number. Floating point numbers will be returned as a string. By + specifying the output_type 'int' the value will be converted to an + integer. If conversion is not possible, the json_query converter fails. + + must be a valid JSON Path string as defined in + https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/ + + Example: + # get an integer value from the request body + # "{"integer":4}" => 5 + http-request set-var(txn.pay_int) req.body,json_query('$.integer','int'),add(1) + + # get a key with '.' in the name + # {"my.key":"myvalue"} => myvalue + http-request set-var(txn.pay_mykey) req.body,json_query('$.my\\.key') + + # {"boolean-false":false} => 0 + http-request set-var(txn.pay_boolean_false) req.body,json_query('$.boolean-false') + + # get the value of the key 'iss' from a JWT Bearer token + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss') + language([,]) Returns the value with the highest q-factor from a list as extracted from the "accept-language" header using "req.fhdr". 
Values with no q-factor have a diff --git a/reg-tests/converter/json_query.vtc b/reg-tests/converter/json_query.vtc new file mode 100644 index 0..ade7b4ccb --- /dev/null +++ b/reg-tests/converter/json_query.vtc @@ -0,0 +1,95 @@ +varnishtest "JSON Query converters Test" +#REQUIRE_VERSION=2.4 + +feature ignore_unknown_macro + +server s1 { + rxreq + txresp +} -repeat 8 -start + +haproxy h1 -conf { +defaults + mode http + timeout connect 1s + timeout client 1s + timeout server 1s + option http-buffer-request + +frontend fe + bind "fd@${fe}" + tcp-request inspect-delay 1s + + http-request set-var(sess.header_json) req.hdr(Authorization),json_query('$.iss') + http-request set-var(sess.pay_json) req.body,json_query('$.iss') + http-request set-var(sess.pay_int) req.body,json_query('$.integer',"int"),add(1) + http-request set-var(sess.pay_neg_int) req.body,json_query('$.negativ-integer',"int"),add(1) + http-request set-var(sess.pay_double) req.body,json_query('$.double') + http-request set-var(sess.pay_boolean_true) req.body,json_query('$.boolean-true') + http-request set-var(sess.pay_boolean_false) req.body,json_query('$.boolean-false') + http-request set-var(sess.pay_mykey) req.body,json_query('$.my\\.key') + + http-response set-header x-var_header %[var(sess.header_json)] + http-response set-header x-var_body %[var(sess.pay_json)] + http-response set-header x-var_body_int %[var(sess.pay_int)] + http-response set-header x-var_body_neg_int %[var(sess.pay_neg_int)] + http-response set-header x-var_body_double %[var(sess.pay_double)] + http-response set-header x-var_body_boolean_true %[var(sess.pay_boolean_true)] + http-response se
Re: Opentracing span name from variable
On 04/15/2021 04:24 PM, Andrea Bonini wrote: Hi all, is it possible to use a variable as span name? for example, can i use uri as span name so in zipkin it's visible as service: Thanks Hello Andrea, the name of the span is defined statically and cannot be the content of a variable. However, you can add a tag, baggage or log to the span with whatever dynamic content you want. Best regards, -- Zaga What can change the nature of a man?
Opentracing span name from variable
Hi all, is it possible to use a variable as the span name? For example, can I use the URI as the span name so that in Zipkin it's visible as service: Thanks -- Bonini Andrea
[PATCH] CI: travis-ci: enable graviton2 builds
Hello, as described in travis-ci docs: https://blog.travis-ci.com/2020-09-11-arm-on-aws it is next generation armv8 available on AWS. cheers, Ilya From 4365c936396a5c7c4f710ed8267b06a73fd5bbcc Mon Sep 17 00:00:00 2001 From: Ilya Shipitsin Date: Thu, 15 Apr 2021 19:16:09 +0500 Subject: [PATCH] CI: travis-ci: enable weekly graviton2 builds as documented in https://blog.travis-ci.com/2020-09-11-arm-on-aws we can build on graviton2, let us expand travis-ci matrix to that machine type as well --- .travis.yml | 6 ++ 1 file changed, 6 insertions(+) diff --git a/.travis.yml b/.travis.yml index 2cb2d25a6..1aa415aa8 100644 --- a/.travis.yml +++ b/.travis.yml @@ -26,6 +26,12 @@ matrix: arch: arm64 compiler: gcc if: type == cron + - os: linux +arch: arm64-graviton2 +group: edge +virt: vm +compiler: gcc +if: type == cron - os: linux arch: s390x compiler: gcc -- 2.30.2
Re: [PATCH] MINOR: sample: add json_string
On Thu, Apr 15, 2021 at 04:05:27PM +0200, Aleksandar Lazic wrote: > Well I don't think so because 4 is still bigger then -9007199254740991 ;-) This is because *you* think it is -9007199254740991 but the reality is that it's not this.due to ULL: #define JSON_INT_MAX ((1ULL << 53) - 1) #define JSON_INT_MIN (-JSON_INT_MAX) => it's -9007199254740991ULL hence 18437736874454810625 so 4 is definitely not larger than this. > Never the less I have changed the defines and rerun the tests. > Btw, this vtest is a great enhancement to haproxy ;-) Yes I totally agree. And you can't imagine how many times I'm angry at it when it detects an error after a tiny change I make, just to realize that I did really break something and that it was right :-) Like all tools it just needs to be reasonably used, not excessively trusted but used as a good hint that something unexpected changed, and it helps a lot! > ``` > #define JSON_INT_MAX ((1ULL << 53) - 1) ^ Sorry I was not clear, please drop that 'U' here. Willy
Re: [PATCH] MINOR: sample: add json_string
On 15.04.21 15:55, Willy Tarreau wrote: On Thu, Apr 15, 2021 at 03:41:18PM +0200, Aleksandar Lazic wrote: Now when I remove the check "smp->data.u.sint < 0" every positive value is bigger then JSON INT_MIN and returns 0. But don't you agree that this test DOES nothing ? If it changes anything it means the issue is somewhere else and is a hidden bug waiting to strike and that we must address it. Look: if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) is exactly equivalent to: if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0) JSON_INT_MIN < 0 so the first part implies the second one. Said differently, there is no value of sint that validates I think it checks if the value is negative or positive and then verify if the value is bigger then the the max allowed value, +/-. Maybe I thing wrong, so let us work with concrete values. ``` printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: if-no-check:%d:<<\n", smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < JSON_INT_MIN)); printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: if-no-check:%d:<<\n\n", smp->data.u.sint,JSON_INT_MAX,(smp->data.u.sint > JSON_INT_MAX)); if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) return 0; else if (smp->data.u.sint > 0 && smp->data.u.sint > JSON_INT_MAX) return 0; ``` Input is here 4. smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<< smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991: if-no-check:0:<< Input is here -4. smp->data.u.sint :-4: < JSON_INT_MIN :-9007199254740991: if-no-check:0:<< smp->data.u.sint :-4: > JSON_INT_MAX :9007199254740991: if-no-check:1:<< OK I think I got it. It's just because your definitions of JSON_INT_MIN and JSON_INT_MAX are unsigned and the comparison is made in unsigned mode. So when you do "4 < JSON_INT_MIN" it's in fact "4 < 2^64-(1<53)-1" so it's true. And conversely for the other one. 
I'm pretty sure that if you change your constants to: #define JSON_INT_MAX ((1LL << 53) - 1) #define JSON_INT_MIN (-JSON_INT_MAX) It will work :-) Well I don't think so because 4 is still bigger then -9007199254740991 ;-) Never the less I have changed the defines and rerun the tests. Btw, this vtest is a great enhancement to haproxy ;-) ``` #define JSON_INT_MAX ((1ULL << 53) - 1) #define JSON_INT_MIN (-JSON_INT_MAX) printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: if-no-check:%d:<<\n", smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < JSON_INT_MIN)); printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: if-no-check:%d:<<\n\n", smp->data.u.sint,JSON_INT_MAX, (smp->data.u.sint > JSON_INT_MAX)); if (smp->data.u.sint < JSON_INT_MIN) return 0; else if (smp->data.u.sint > JSON_INT_MAX) return 0; ``` >> smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<< >> smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991: if-no-check:0:<< That's among the historical idiocies of the C language that considers the signedness as part of the variable instead of being the mode of the operation applied to the variable. This results in absurd combinations. Willy
Re: [PATCH] MINOR: sample: add json_string
On Thu, Apr 15, 2021 at 03:41:18PM +0200, Aleksandar Lazic wrote: > > > Now when I remove the check "smp->data.u.sint < 0" every positive value is > > > bigger then JSON INT_MIN and returns 0. > > > > But don't you agree that this test DOES nothing ? If it changes anything > > it means the issue is somewhere else and is a hidden bug waiting to strike > > and that we must address it. > > > > Look: > > > > if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) > > > > is exactly equivalent to: > > > > if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0) > > > > JSON_INT_MIN < 0 so the first part implies the second one. Said differently, > > there is no value of sint that validates > <0. > > I think it checks if the value is negative or positive and then verify if the > value is bigger then the the max allowed value, +/-. > > Maybe I thing wrong, so let us work with concrete values. > > ``` > printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: > if-no-check:%d:<<\n", smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < > JSON_INT_MIN)); > printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: > if-no-check:%d:<<\n\n", smp->data.u.sint,JSON_INT_MAX,(smp->data.u.sint > > JSON_INT_MAX)); > > if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) > return 0; > else if (smp->data.u.sint > 0 && smp->data.u.sint > JSON_INT_MAX) > return 0; > > ``` > > Input is here 4. > >> smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<< > >> smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991: if-no-check:0:<< > > Input is here -4. > > >> smp->data.u.sint :-4: < JSON_INT_MIN :-9007199254740991: if-no-check:0:<< > >> smp->data.u.sint :-4: > JSON_INT_MAX :9007199254740991: if-no-check:1:<< OK I think I got it. It's just because your definitions of JSON_INT_MIN and JSON_INT_MAX are unsigned and the comparison is made in unsigned mode. So when you do "4 < JSON_INT_MIN" it's in fact "4 < 2^64-(1<53)-1" so it's true. 
And conversely for the other one. I'm pretty sure that if you change your constants to: #define JSON_INT_MAX ((1LL << 53) - 1) #define JSON_INT_MIN (-JSON_INT_MAX) It will work :-) That's among the historical idiocies of the C language that considers the signedness as part of the variable instead of being the mode of the operation applied to the variable. This results in absurd combinations. Willy
Re: [PATCH] MINOR: sample: add json_string
On 15.04.21 14:48, Willy Tarreau wrote: On Thu, Apr 15, 2021 at 02:17:45PM +0200, Aleksandar Lazic wrote: I, by far, prefer Tim's proposal here, as I do not even understand the first one, sorry Aleks, please don't feel offended :-) Well you know my focus is to support HAProxy and therefore it's okay. The contribution was in the past much easier, but you know time changes. It's not getting harder, we've always had numerous round trips, however now there are more people participating and it's getting increasingly difficult to maintain a constant level of quality so it is important to take care about maintainability, which implies being carefull about the coding style (which is really not strict) and a good level of English in the doc (which remains achievable as most of the contributors are not native speakers so we're not using advanced English). In addition there's nothing wrong with saying "I need someone to reword this part where I don't feel at ease", it's just that nobody will force it on you as it would not be kind nor respectful of your work. In fact I'd say that it's got easier because most of the requirements have been formalized by now, or are not unique to this project but shared with other ones. Okay, got you. From my point of view is it necessary to check if the value is a negative value and only then should be checked if the max '-' range is reached. But the first one is implied by the second. It looks like a logical error when read like this, it makes one think the author had something different in mind. It's like writing "if (a < 0 && a < -2)". It is particularly confusing. Well then then this does not work anymore If so it precisely shows that a problem remains somewhere else. Hm, maybe. http-request set-var(sess.pay_int) req.body,json_query('$.integer',"int"),add(1) with the given defines. 
#define JSON_INT_MAX ((1ULL << 53) - 1) #define JSON_INT_MIN (0 - JSON_INT_MAX) Because "{"integer":4}" => 5" and 5 is bigger than JSON_INT_MIN which is (0-JSON_INT_MAX) This sequence works because I check if the value is negative "smp->data.u.sint < 0" and only then check if the negative max border "JSON_INT_MIN" is reached. I'm sorry but I don't get it. if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) The same applies to the positive max int. Now when I remove the check "smp->data.u.sint < 0" every positive value is bigger than JSON INT_MIN and returns 0. But don't you agree that this test DOES nothing ? If it changes anything it means the issue is somewhere else and is a hidden bug waiting to strike and that we must address it. Look: if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) is exactly equivalent to: if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0) JSON_INT_MIN < 0 so the first part implies the second one. Said differently, there is no value of sint that validates I think it checks if the value is negative or positive and then verifies if the value is bigger than the max allowed value, +/-. Maybe my thinking is wrong, so let us work with concrete values. ``` printf("\n\n>> smp->data.u.sint :%lld: < JSON_INT_MIN :%lld: if-no-check:%d:<<\n", smp->data.u.sint,JSON_INT_MIN,(smp->data.u.sint < JSON_INT_MIN)); printf(">> smp->data.u.sint :%lld: > JSON_INT_MAX :%lld: if-no-check:%d:<<\n\n", smp->data.u.sint,JSON_INT_MAX,(smp->data.u.sint > JSON_INT_MAX)); if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) return 0; else if (smp->data.u.sint > 0 && smp->data.u.sint > JSON_INT_MAX) return 0; ``` Input is here 4. >> smp->data.u.sint :4: < JSON_INT_MIN :-9007199254740991: if-no-check:1:<< >> smp->data.u.sint :4: > JSON_INT_MAX :9007199254740991: if-no-check:0:<< Input is here -4. 
>> smp->data.u.sint :-4: < JSON_INT_MIN :-9007199254740991: if-no-check:0:<< >> smp->data.u.sint :-4: > JSON_INT_MAX :9007199254740991: if-no-check:1:<< This looks to me when the comparing is done with a positive value then it will be true for the JSON_INT_MIN and when the comparing is done with a negative value then will it be true for JSON_INT_MAX. So the concrete question is how to check the value in the positive and negative range without the "smp->data.u.sint < 0" or "smp->data.u.sint > 0". I haven't found any other solution, I'm open for any suggestion? Willy Regards Aleks
Re: [PATCH] MINOR: sample: add json_string
On Thu, Apr 15, 2021 at 02:17:45PM +0200, Aleksandar Lazic wrote:
> > I, by far, prefer Tim's proposal here, as I do not even understand the
> > first one, sorry Aleks, please don't feel offended :-)
>
> Well you know my focus is to support HAProxy and therefore it's okay.
> The contribution was in the past much easier, but you know time changes.

It's not getting harder, we've always had numerous round trips, however
now there are more people participating and it's getting increasingly
difficult to maintain a constant level of quality, so it is important to
take care about maintainability, which implies being careful about the
coding style (which is really not strict) and a good level of English in
the doc (which remains achievable as most of the contributors are not
native speakers so we're not using advanced English).

In addition there's nothing wrong with saying "I need someone to reword
this part where I don't feel at ease", it's just that nobody will force
it on you as it would not be kind nor respectful of your work. In fact
I'd say that it's got easier because most of the requirements have been
formalized by now, or are not unique to this project but shared with
other ones.

> > > From my point of view it is necessary to check if the value is a
> > > negative value and only then should be checked if the max '-' range
> > > is reached.
> >
> > But the first one is implied by the second. It looks like a logical
> > error when read like this, it makes one think the author had something
> > different in mind. It's like writing "if (a < 0 && a < -2)". It is
> > particularly confusing.
>
> Well then this does not work anymore

If so it precisely shows that a problem remains somewhere else.

> http-request set-var(sess.pay_int)
>     req.body,json_query('$.integer',"int"),add(1)
>
> with the given defines.
>
> #define JSON_INT_MAX ((1ULL << 53) - 1)
> #define JSON_INT_MIN (0 - JSON_INT_MAX)
>
> Because "{"integer":4}" => 5 and 5 is bigger than JSON_INT_MIN which is
> (0-JSON_INT_MAX)
>
> This sequence works because I check if the value is negative
> "smp->data.u.sint < 0" and only then check if the negative max border
> "JSON_INT_MIN" is reached.

I'm sorry but I don't get it.

> if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)
>
> The same belongs to the positive max int.
>
> Now when I remove the check "smp->data.u.sint < 0" every positive value
> is bigger than JSON_INT_MIN and returns 0.

But don't you agree that this test DOES nothing ? If it changes anything
it means the issue is somewhere else and is a hidden bug waiting to
strike and that we must address it. Look:

    if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

is exactly equivalent to:

    if (smp->data.u.sint < JSON_INT_MIN && smp->data.u.sint < 0)

JSON_INT_MIN < 0 so the first part implies the second one. Said
differently, there is no value of sint that validates
Re: [PATCH] MINOR: sample: add json_string
On 15.04.21 09:08, Willy Tarreau wrote:
> On Wed, Apr 14, 2021 at 09:52:31PM +0200, Aleksandar Lazic wrote:
> > > > + - string  : This is the default search type and returns a String;
> > > > + - boolean : If the JSON value is not a String or a Number
> > > > + - number  : When the JSON value is a Number then will the value be
> > > > +             converted to a String. If its known that the value is a
> > > > +             integer then add 'int' to the which helps
> > > > +             haproxy to convert the value to a integer for further
> > > > +             usage;
> > >
> > > I'd probably completely rephrase this as:
> > >
> > > The json_query converter supports the JSON types string, boolean and
> > > number. Floating point numbers will be returned as a string. By
> > > specifying the output_type 'int' the value will be converted to an
> > > Integer. If conversion is not possible the json_query converter fails.
> >
> > Well I would like to hear also some other opinions about the wording.
>
> I, by far, prefer Tim's proposal here, as I do not even understand the
> first one, sorry Aleks, please don't feel offended :-)

Well you know my focus is to support HAProxy and therefore it's okay.
The contribution was in the past much easier, but you know time changes.

> > > > + switch(tok) {
> > > > + case MJSON_TOK_NUMBER:
> > > > +     if (args[1].type == ARGT_SINT) {
> > > > +         smp->data.u.sint = atoll(p);
> > > > +
> > > > +         if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) {
> > > > +             /* JSON integer too big negativ value */
> > >
> > > This comment appears to be useless. It is implied by the 'if'. I also
> > > believe that the 'sint < 0' check is not needed.
> >
> > Well I prefer to document in the comment what the if is doing.
>
> OK but then please be careful about spelling, or it will force Ilya to
> send yet another spell-checker patch.
>
> > From my point of view it is necessary to check if the value is a
> > negative value and only then should be checked if the max '-' range is
> > reached.
>
> But the first one is implied by the second. It looks like a logical
> error when read like this, it makes one think the author had something
> different in mind. It's like writing "if (a < 0 && a < -2)". It is
> particularly confusing.

Well then this does not work anymore

http-request set-var(sess.pay_int)
    req.body,json_query('$.integer',"int"),add(1)

with the given defines.

#define JSON_INT_MAX ((1ULL << 53) - 1)
#define JSON_INT_MIN (0 - JSON_INT_MAX)

Because "{"integer":4}" => 5 and 5 is bigger than JSON_INT_MIN which is
(0-JSON_INT_MAX)

This sequence works because I check if the value is negative
"smp->data.u.sint < 0" and only then check if the negative max border
"JSON_INT_MIN" is reached.

if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN)

The same belongs to the positive max int.

Now when I remove the check "smp->data.u.sint < 0" every positive value
is bigger than JSON_INT_MIN and returns 0.

How about to add this information into the comments?

> > Maybe there is a better solution, I'm open for suggestions.
> >
> > I can move the comment above the 'if'.
>
> You have the choice as long as it's clear:
>   - above the if, you describe what you're testing and why
>   - inside the if, you describe the condition you've validated.
>
> As it is now, it's best inside the if.
>
> Thanks!
> Willy
Re: [OT PATCH 0/2] opentracing: use of the context name without prefix
Great work, I tested it right now and it works like a charm! Thank you
Zaga, good work!

On Thu, Apr 15, 2021 at 08:40, Willy Tarreau wrote:
> On Wed, Apr 14, 2021 at 12:56:39PM +0200, Miroslav Zagorac wrote:
> > Hello all,
> >
> > the next two patches allow you to transfer the OpenTracing context via
> > HTTP headers without using a name prefix.
> >
> > This is very useful when transferring context from or to some other
> > process that cannot add or remove a prefix from that data.
>
> Merged, thanks Miroslav!
> Willy

--
Bonini Andrea
Re: changed IP messages overrunning /var/log ?
Hello,

On Thu, 2021-04-15 at 01:43 -0600, Jim Freeman wrote:
> This is puzzling, since haproxy.cfg directs all logs to local*
> After some investigation, it turns out that the daemon.log and syslog
> entries arrive via facility.level=daemon.info. I've made rsyslog cfg
> changes that now stop the haproxy msgs from overrunning daemon.log and
> syslog (and allow only a representative fraction to hit haproxy.log).
>
> Two questions :
> 1) What is different about 2.0 that "changed its IP" entries are so
> voluminous ?
> 2) Why is daemon.info involved in the logging, when the haproxy.cfg
> settings only designate local* facilities ?

Are you running haproxy as a systemd service ? Those logs could be
coming from systemd (haproxy stdout/stderr).

-Jarno

--
Jarno Huuskonen
changed IP messages overrunning /var/log ?
Migrating from stock stretch-backports+1.8 to Debian_10/Buster+2.0 (to
purge 'reqrep' en route to 2.2), /var/log/ is suddenly filling disk with
messages about changed IPs :
===
2021-04-14T01:08:40.123303+00:00 ip-10-36-217-169 haproxy[569]:
my_web_service changed its IP from 52.208.198.117 to 34.251.174.55 by
DNS cache.
===
daemon.log and syslog (plus the usual haproxy.log) all get hammered.

Many of the backends (200+) are of the flavor :
  server-template my_web_service 8 my_web_service.herokuapp.com:443 ...
    resolvers fs_resolvers resolve-prefer ipv4

The herokuapp.com addresses change constantly, but this has not been a
problem heretofore. This is puzzling, since haproxy.cfg directs all logs
to local*

After some investigation, it turns out that the daemon.log and syslog
entries arrive via facility.level=daemon.info. I've made rsyslog cfg
changes that now stop the haproxy msgs from overrunning daemon.log and
syslog (and allow only a representative fraction to hit haproxy.log).

Two questions :
1) What is different about 2.0 that "changed its IP" entries are so
voluminous ?
2) Why is daemon.info involved in the logging, when the haproxy.cfg
settings only designate local* facilities ?

Thanks for any insights (and for stupendous software) !

Running HAProxy 2.0 from :
https://haproxy.debian.net/#?distribution=Debian&release=buster&version=2.0
on stock Buster AWS AMI :
https://wiki.debian.org/Cloud/AmazonEC2Image
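The rsyslog changes mentioned above are not spelled out in the thread. A minimal sketch of one common approach, assuming haproxy sends to local0 and that the snippet lands in an assumed drop-in file (file name, facility, and log path are illustrative, not taken from the poster's setup):

```
# /etc/rsyslog.d/49-haproxy.conf  (hypothetical file name)
# Route everything haproxy emits on local0 to its own file...
local0.*    /var/log/haproxy.log

# ...and keep haproxy messages (including any arriving via
# daemon.info, e.g. from systemd-captured stdout/stderr) out of the
# catch-all rules that feed daemon.log and syslog:
if $programname == 'haproxy' then stop
```

The drop-in must sort before the distribution's default 50-default.conf for the `stop` to take effect, hence the 49- prefix in this sketch.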
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On Thu, Apr 15, 2021 at 07:13:53AM +0000, Robin H. Johnson wrote:
> Thanks; I will need to catch it faster or automate this, because the
> watchdog does a MUCH better job restarting it than before, less than 30
> seconds of 100% CPU before the watchdog reliably kills it.

I see. Then collecting the watchdog outputs can be instructive to see if
it always happens at the same place or not. And the core dumps will
indicate what all threads were doing (and if some were competing on a
lock for example).

> > - please also check if your machine is not swapping, as this can have
> >   terrible consequences and could explain why it only happens on certain
> >   machines dealing with certain workloads. I remember having spent several
> >   weeks many years ago chasing a response time issue happening only in the
> >   morning, which was in fact caused by the log upload batch having swapped
> >   and left a good part of the unused memory in the swap, making it very
> >   difficult for the network stack to allocate buffers during send() and
> >   recv(), thus causing losses and retransmits as the load grew. This was
> >   never reproducible in the lab because we were not syncing logs :-)
>
> 512GiB RAM and no swap configured on the system at all. :-)
> Varnish runs on the same host and is used to cache some of the backends.
> Plenty of free memory at the moment.

I'm now thinking about something. Do you have at least as many CPUs as
the total number of threads used by haproxy and varnish ? Otherwise
there will be some competition and migrations will happen. If neither is
bounded, you can even end up with two haproxy threads forced to run on
the same CPU, which is the worst situation as one could be scheduled out
with a lock held and the other one spinning waiting for this lock.

With half a TB of RAM I guess you have multiple sockets. Could you at
least pin haproxy to the CPUs of a single socket (that's the bare
minimum to do to preserve performance, as atomics and locks over UPI/QPI
are a disaster), and ideally pin varnish to another socket ? Or maybe
just enable less threads for haproxy if you don't need that many and
make sure the CPUs it's bound to are not used by varnish ?

In such a setup combining several high-performance processes, it's
really important to reserve the resources to them, and you must count on
the number of CPUs needed to deal with network interrupts as well (and
likely for disk if varnish uses it). Once your resources are cleanly
reserved, you'll get the maximum performance with the lowest latency.

Willy
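The pinning advice above can be expressed with haproxy's `cpu-map` directive. This is an illustrative sketch, not the poster's configuration: the thread count and CPU numbers are assumptions and must be adjusted to the actual topology (check which CPU IDs belong to which NUMA node with e.g. lscpu):

```
global
    # assumed: 16 threads, all pinned to the 16 cores of NUMA node 0
    # so that atomics/locks never have to cross the UPI/QPI link
    nbthread 16
    cpu-map auto:1/1-16 0-15
```

Varnish (and the NIC interrupt handling) would then be confined to the remaining CPUs by other means, e.g. systemd's CPUAffinity= for the varnish unit, so the two processes never compete for the same cores.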
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On Thu, Apr 15, 2021 at 08:59:35AM +0200, Willy Tarreau wrote:
> On Wed, Apr 14, 2021 at 01:53:06PM +0200, Christopher Faulet wrote:
> > > nbthread=64, nbproc=1 on both 1.8/2.x
> >
> > It is thus surprising, if it is really a contention issue, that you never
> > observed slow down on the 1.8. There is no watchdog, but the thread
> > implementation is a bit awkward on the 1.8. 2.X are better on this point,
> > the best being the 2.4.
>
> Agreed, I'd even say that 64 threads in 1.8 should be way slower than
> a single thread.
>
> A few things are worth having a look at, Robin:
>   - please run "perf top" when this happens, and when you see a function
>     reporting a high usage, do not hesitate to navigate through it,
>     pressing enter, and "annotate function". Then scrolling through it
>     will reveal some percentage of time certain points were met. It's
>     very possible you'll find that 100% of the CPU are used in 1, 2 or 3
>     functions and that there is a logic error somewhere. Usually if you
>     find a single one you'll end up spotting a lock.

Thanks; I will need to catch it faster or automate this, because the
watchdog does a MUCH better job restarting it than before, less than 30
seconds of 100% CPU before the watchdog reliably kills it.

>   - please also check if your machine is not swapping, as this can have
>     terrible consequences and could explain why it only happens on certain
>     machines dealing with certain workloads. I remember having spent several
>     weeks many years ago chasing a response time issue happening only in the
>     morning, which was in fact caused by the log upload batch having swapped
>     and left a good part of the unused memory in the swap, making it very
>     difficult for the network stack to allocate buffers during send() and
>     recv(), thus causing losses and retransmits as the load grew. This was
>     never reproducible in the lab because we were not syncing logs :-)

512GiB RAM and no swap configured on the system at all.
Varnish runs on the same host and is used to cache some of the backends.
Plenty of free memory at the moment.

--
Robin Hugh Johnson
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
Re: Is it possible: HTTP frontend connecting HTTPS endpoint using Client Certificate Authentication behind Squid
On Thu, Apr 15, 2021 at 09:02:18AM +0200, Ferry Bodijn wrote:
> L.S.
>
> Is it possible with HAProxy, to have a frontend which binds on HTTP that
> refers to a backend that connects to an HTTPS endpoint using Client
> Certificate Authentication while reaching it via a Squid forwarding proxy?

If this requires to send a CONNECT request to Squid, then no, as we have
not implemented a CONNECT encapsulation for outgoing connections yet.
But if you just need to connect to Squid using HTTPS and let it deal
with the request, then you should have no issue.

Note that I'm pretty sure that Squid supports listening to incoming SSL
connections, so even if that's not what you're doing, it probably is the
right way to proceed.

Hoping this helps,
Willy
Re: [PATCH] MINOR: sample: add json_string
On Wed, Apr 14, 2021 at 09:52:31PM +0200, Aleksandar Lazic wrote:
> > > + - string  : This is the default search type and returns a String;
> > > + - boolean : If the JSON value is not a String or a Number
> > > + - number  : When the JSON value is a Number then will the value be
> > > +             converted to a String. If its known that the value is a
> > > +             integer then add 'int' to the which helps
> > > +             haproxy to convert the value to a integer for further
> > > +             usage;
> >
> > I'd probably completely rephrase this as:
> >
> > The json_query converter supports the JSON types string, boolean and
> > number. Floating point numbers will be returned as a string. By
> > specifying the output_type 'int' the value will be converted to an
> > Integer. If conversion is not possible the json_query converter fails.
>
> Well I would like to hear also some other opinions about the wording.

I, by far, prefer Tim's proposal here, as I do not even understand the
first one, sorry Aleks, please don't feel offended :-)

> > > + Example:
> > > +     # get the value of the key 'iss' from a JWT Bearer token
> > > +     http-request set-var(txn.token_payload)
> > > +         req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss')
> > > +
> > > +     # get a integer value from the request body
> > > +     # "{"integer":4}" => 5
> > > +     http-request set-var(txn.pay_int)
> > > +         req.body,json_query('$.integer','int'),add(1)
> > > +
> > > +     # get a key with '.' in the name
> > > +     # {"my.key":"myvalue"} => myvalue
> > > +     http-request set-var(txn.pay_mykey)
> > > +         req.body,json_query('$.my\\.key')
> > > +
> > > +     # {"boolean-false":false} => 0
> > > +     http-request set-var(txn.pay_boolean_false)
> > > +         req.body,json_query('$.boolean-false')
> >
> > These examples look good to me. I'd just move the JWT example to the
> > bottom, so that the simple examples come first.
>
> I prefer to keep it like this.

I would also have preferred to start with the simpler ones but that's
not important as long as there are both simple ones and advanced ones.

> > > + switch(tok) {
> > > + case MJSON_TOK_NUMBER:
> > > +     if (args[1].type == ARGT_SINT) {
> > > +         smp->data.u.sint = atoll(p);
> > > +
> > > +         if (smp->data.u.sint < 0 && smp->data.u.sint < JSON_INT_MIN) {
> > > +             /* JSON integer too big negativ value */
> >
> > This comment appears to be useless. It is implied by the 'if'. I also
> > believe that the 'sint < 0' check is not needed.
>
> Well I prefer to document in the comment what the if is doing.

OK but then please be careful about spelling, or it will force Ilya to
send yet another spell-checker patch.

> From my point of view it is necessary to check if the value is a
> negative value and only then should be checked if the max '-' range is
> reached.

But the first one is implied by the second. It looks like a logical
error when read like this, it makes one think the author had something
different in mind. It's like writing "if (a < 0 && a < -2)". It is
particularly confusing.

> Maybe there is a better solution, I'm open for suggestions.
>
> I can move the comment above the 'if'.

You have the choice as long as it's clear:
  - above the if, you describe what you're testing and why
  - inside the if, you describe the condition you've validated.

As it is now, it's best inside the if.

Thanks!
Willy
Is it possible: HTTP frontend connecting HTTPS endpoint using Client Certificate Authentication behind Squid
L.S.

Is it possible with HAProxy, to have a frontend which binds on HTTP that
refers to a backend that connects to an HTTPS endpoint using Client
Certificate Authentication while reaching it via a Squid forwarding
proxy?

I have a working Apache HTTP server config which I want to replace
because Apache doesn't support OpenTracing.

Apache config:

ProxyRemote "*" "http://squid-internet:3128"
ServerName onloading-proxy-internet-endpoint
..
SSLProxyMachineCertificateFile keypair-internet.pem
SSLProxyMachineCertificateChainFile chain-internet.pem
ProxyPass / https://internet-endpoint
ProxyPassReverse / https://internet-endpoint

Regards
Ferry
Re: Still 100% CPU usage in 2.3.9 & 2.2.13 (Was: Re: [2.2.9] 100% CPU usage)
On Wed, Apr 14, 2021 at 01:53:06PM +0200, Christopher Faulet wrote:
> > nbthread=64, nbproc=1 on both 1.8/2.x
>
> It is thus surprising, if it is really a contention issue, that you never
> observed slow down on the 1.8. There is no watchdog, but the thread
> implementation is a bit awkward on the 1.8. 2.X are better on this point,
> the best being the 2.4.

Agreed, I'd even say that 64 threads in 1.8 should be way slower than
a single thread.

A few things are worth having a look at, Robin:

  - please run "perf top" when this happens, and when you see a function
    reporting a high usage, do not hesitate to navigate through it,
    pressing enter, and "annotate function". Then scrolling through it
    will reveal some percentage of time certain points were met. It's
    very possible you'll find that 100% of the CPU are used in 1, 2 or 3
    functions and that there is a logic error somewhere. Usually if you
    find a single one you'll end up spotting a lock.

  - please also check if your machine is not swapping, as this can have
    terrible consequences and could explain why it only happens on
    certain machines dealing with certain workloads. I remember having
    spent several weeks many years ago chasing a response time issue
    happening only in the morning, which was in fact caused by the log
    upload batch having swapped and left a good part of the unused
    memory in the swap, making it very difficult for the network stack
    to allocate buffers during send() and recv(), thus causing losses
    and retransmits as the load grew. This was never reproducible in the
    lab because we were not syncing logs :-)

Willy