Re: mworker: execvp failure depending on argv[0]
On Tue, Jan 09, 2018 at 11:31:51PM +0100, William Lallemand wrote:
> From ce9920d284e55600ef324a322a3aed92dd2af02f Mon Sep 17 00:00:00 2001
> From: William Lallemand
> Date: Tue, 9 Jan 2018 23:12:27 +0100
> Subject: [PATCH] BUG/MEDIUM: mworker: execvp failure depending on argv[0]
>
> The copy_argv() function lacks a check on '-' to remove the -x, -sf and
> -st parameters.
>
> When reloading a master process with a path starting by /st, /sf, or
> /x.. the copy_argv() function skipped argv[0] leading to an execvp()
> without the binary.

Wow, I love these ones :-)

Now merged in 1.9 and 1.8, thanks!
Willy
mworker: execvp failure depending on argv[0]
That one can really make you crazy if you are in this exact case :-)

--
William Lallemand

>From ce9920d284e55600ef324a322a3aed92dd2af02f Mon Sep 17 00:00:00 2001
From: William Lallemand
Date: Tue, 9 Jan 2018 23:12:27 +0100
Subject: [PATCH] BUG/MEDIUM: mworker: execvp failure depending on argv[0]

The copy_argv() function lacks a check on '-' to remove the -x, -sf and
-st parameters.

When reloading a master process with a path starting by /st, /sf, or
/x.. the copy_argv() function skipped argv[0] leading to an execvp()
without the binary.
---
 src/haproxy.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/haproxy.c b/src/haproxy.c
index e98420e2..20b18f85 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -1242,7 +1242,8 @@ static char **copy_argv(int argc, char **argv)
 
 	while (i < argc) {
 		/* -sf or -st or -x */
-		if ((argv[i][1] == 's' && (argv[i][2] == 'f' || argv[i][2] == 't')) || argv[i][1] == 'x' ) {
+		if (i > 0 && argv[i][0] == '-' &&
+		    ((argv[i][1] == 's' && (argv[i][2] == 'f' || argv[i][2] == 't')) || argv[i][1] == 'x' )) {
 			/* list of pids to finish ('f') or terminate ('t') or unix socket (-x) */
 			i++;
 			while (i < argc && argv[i][0] != '-') {
-- 
2.13.6
Re: [PATCH 1/2] BUG/MINOR: lua: Fix default value for pattern in Socket.receive
Willy,

On 09.01.2018 at 20:07, Willy Tarreau wrote:
> Indeed, we try to perform backports by batches because it requires a
> bit of concentration.

Understood.

> That's why it's important for the stable team
> that the commits are properly marked for backports, as you did. So
> thanks for that ;-)

It pays to read the CONTRIBUTING file :-)

Best regards
Tim Düsterhus
Re: [PATCH 1/2] BUG/MINOR: lua: Fix default value for pattern in Socket.receive
On Tue, Jan 09, 2018 at 06:55:59PM +0100, Tim Düsterhus wrote:
> Willy, I notice that you did not backport to earlier than 1.8, yet. Do
> you usually do this shortly before release or did you forget? At least
> the two MINOR ones should be backported to 1.6 also.

Indeed, we try to perform backports by batches because it requires a
bit of concentration. That's why it's important for the stable team
that the commits are properly marked for backports, as you did. So
thanks for that ;-)

Willy
Cache & ACLs issue
I'm experimenting with the small objects cache feature in 1.8. Maybe I'm doing something obviously wrong, but I don't see what... Here is my setup:

(...)
cache static_assets
    total-max-size 100
    max-age 60
(...)

frontend fe_main
    # HTTP(S) Service
    bind *:80 name http
    acl cached_service-acl hdr_dom(host) -i cached_service.localdomain
    use_backend be_cached_service if cached_service-acl

backend be_cached_service
    acl static_cached_paths path_end -i my/resource/path
    http-request cache-use static_assets if static_cached_paths
    http-response cache-store static_assets
    server srv0 127.0.0.1:8000

In that case I can request /my/resource/path and I'll have something stored in the cache:

$ curl -v -L http://127.0.0.1/my/resource/path -H "Host: cached_service.localdomain"
(...)
< HTTP/1.1 200 OK
< Content-Type: application/json
< Date: Tue, 09 Jan 2018 18:32:05 GMT
< Content-Length: 14
<
[ "tmp" ]

$ echo "show cache static_assets" | sudo socat stdio /var/lib/haproxy/stats
0x7fbfa94df03a: static_assets (shctx:0x7fbfa94df000, available blocks:102400)
0x7fbfa94e11ac hash:3952565486 size:190 (1 blocks), refcount:0, expire:54

But if I request something else, it overrides the first cached asset and will be served from then on...

$ curl -v -L http://127.0.0.1/my/other/resource/path -H "Host: cached_service.localdomain"
(...)
< HTTP/1.1 200 OK
< Content-Type: application/json
< X-Consul-Index: 77
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Tue, 09 Jan 2018 18:33:30 GMT
< Content-Length: 547
<
{ "something_more_verbose(...)" }

$ echo "show cache" | sudo socat stdio /var/lib/haproxy/stats
0x7f4c5ea5a03a: static_assets (shctx:0x7f4c5ea5a000, available blocks:102400)
0x7f4c5ea5a4cc hash:3952565486 size:797 (1 blocks), refcount:0, expire:55

The entry has been flushed and replaced by the new one, independently of the expiration state. In that case it's Consul that answers, which explains the X-Consul headers in the second response.

Does it ring a bell to someone?
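One thing that stands out in the config above is that `http-response cache-store` has no condition, so every response through the backend is offered to the cache. A sketch of one way to gate the store on the same path ACL (illustrative only — the variable name is invented, and whether this fully avoids the observed overwrite depends on the 1.8 cache implementation) is to mark matching requests with a transaction variable, since `path` itself is not available at response time:

```
backend be_cached_service
    acl static_cached_paths path_end -i my/resource/path
    http-request set-var(txn.do_cache) int(1) if static_cached_paths
    http-request cache-use static_assets if static_cached_paths
    http-response cache-store static_assets if { var(txn.do_cache) -m int 1 }
    server srv0 127.0.0.1:8000
```

This would at least keep non-matching responses like the Consul answer out of the cache; it does not by itself explain the identical hash values shown for two different URIs.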
Thanks,
Pierre
Re: [PATCH 1/2] BUG/MINOR: lua: Fix default value for pattern in Socket.receive
Hi,

On 09.01.2018 at 15:24, Willy Tarreau wrote:
> On Mon, Jan 08, 2018 at 11:35:47AM +0100, Thierry Fournier wrote:
>> Thanks for the patch. Good catch!
>> Willy, you can apply it.
>
> OK thanks for the review, all 4 patches applied now.

Thank you both.

Willy, I notice that you did not backport to earlier than 1.8, yet. Do
you usually do this shortly before release or did you forget? At least
the two MINOR ones should be backported to 1.6 also.

Best regards
Tim Düsterhus
Re: cannot bind socket - Need help with config file
Hi Lukas,

thanks again for your continued help and support! Here is my config file with updates now:

frontend main
    bind :2200
    default_backend sftp
    timeout client 5d

listen stats
    bind *:2200
    mode tcp
    maxconn 2000
    option redis-check
    retries 3
    option redispatch
    balance roundrobin

Please correct me if you see something that is not right.

You asked about my SSH/SFTP use-case. Basically, I have several SFTP servers that I would like to load-balance, and I was thinking about using HAProxy to load-balance SFTP connections between them. As I was testing my setup yesterday, sending sftp file transfers to the HAProxy node, I noticed that the HAProxy node's CPU usage was pretty high. I am beginning to wonder if this is the right setup for my environment. Is HAProxy the right solution for SFTP server load-balancing?

thanks

On Tue, Jan 9, 2018 at 2:12 AM, Lukas Tribus wrote:
> Hello Imam,
>
> On Tue, Jan 9, 2018 at 2:30 AM, Imam Toufique wrote:
> >
> > Hi Jonathan, and Lucas,
> >
> > Thanks for your replies. With your help, I was able to get it to work
> > partially.
>
> Please always CC the mailing list though.
>
> > frontend main *:2200
> >    #bind *:22
> >    default_backend sftp
> >    timeout client 1h
>
> While this works, it's causing a lot of confusion. Please do follow my
> advice and DON'T specify the port in the frontend/listen line. Use the
> bind directive instead. So in this case:
>
> frontend main
>    bind :2200
>    default_backend sftp
>    timeout client 1h
>
> It's much more readable like this.
>
> > listen stats
> > #bind *:22
>
> You disabled your stats section with this configuration. Either decide
> for a port, or remove it if you don't need it.
>
> > But haproxy starts and I was able to get ssh to one of the servers. Now I
> > have a different problem where I get an ssh key fingerprint error warning
> > and my connection drops.
> > I get the error below:
> >
> > [vagrant@db ~]$ ssh file -p 2200
> > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> > @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
> > @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> > IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> > Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> > It is also possible that a host key has just been changed.
> > The fingerprint for the RSA key sent by the remote host is
> > SHA256:MHkXThp4cSltDn0/mRsq7Se+qcDz6cz1dD+kCiyE9e0.
> > Please contact your system administrator.
> > Add correct host key in /home/vagrant/.ssh/known_hosts to get rid of this
> > message.
> > Offending ECDSA key in /home/vagrant/.ssh/known_hosts:4
> > RSA host key for [file]:2200 has changed and you have requested strict
> > checking.
> > Host key verification failed
> >
> > It looks like host keys are changing, and the host key becomes unknown to
> > both servers that are behind HAProxy. What do you recommend doing in a case
> > like this?
>
> That's what happens when you load-balance between 2 different SSH
> servers with a different private key. What is it that you want to
> achieve in the first place?
>
> cheers,
> lukas

--
Regards,
Imam Toufique
213-700-5485
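For reference, a minimal TCP-mode sketch of the setup being discussed (server names and addresses are hypothetical; note that `option redis-check` from the config above is a Redis health check and does not apply to SFTP — a plain TCP `check` is the usual choice for SSH backends):

```
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

frontend fe_sftp
    bind :2200
    default_backend be_sftp

backend be_sftp
    balance leastconn
    option redispatch
    server sftp1 192.0.2.11:22 check
    server sftp2 192.0.2.12:22 check
```

As Lukas points out, the host-key warning itself is not an HAProxy problem: clients will keep seeing a changing identity unless all backends present the same SSH host keys.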
Re: haproxy+QAT memory usage very high under busy traffic
Hi Julian,

On 01/09/2018 03:28 PM, Willy Tarreau wrote:
> Hi Julian,
>
> On Tue, Jan 09, 2018 at 08:50:48AM, Julian Zhu wrote:
> > We are testing haproxy+QAT card (Intel QuickAssist Technology) and find that
> > the memory usage of haproxy+QAT is much higher than that of haproxy alone.
> > Under the same traffic (about 120K connections), haproxy alone only takes 6G
> > memory while haproxy+QAT takes 36G.
> > The only difference in config is as below for haproxy+QAT:
> > ..
> > ssl-engine qat
> > ssl-mode-async
> > ..
> > and the memory is not released even if we terminate all haproxy processes,
> > until we reboot the server.
>
> Then this indicates that the leak (if any, but it sounds like this) is in
> the engine. You'd need to unload and reload the modules to see if it's a
> real leak or just some extreme memory usage.
>
> > OS:
> > Centos-7
> >
> > SW version:
> > haproxy-1.8.3
> > openssl-1.1.0e
> > QAT_Engine: 0.5.20
> > QAT1.7.Upstream.L.1.0.3_42
> >
> > Does anyone encounter the same issue? Is there any way to debug or fix it?
>
> Unfortunately at this point it will be entirely dependent on QAT, and I
> don't really know what can be done there. Maybe there's a debug mode. You
> should possibly check with other versions of the engine (higher or lower).
> If you're still experiencing the leak with an up to date version, you
> should contact your Intel support so that they can take a look at this.
>
> It's possible that haproxy triggers the leak (i.e. does something not cool
> triggering an unusual code path leading to a leak in the engine), so they
> may not be aware of it yet.
>
> Hoping this helps,
> Willy

There is no specific memory allocation in the code of haproxy to handle the async engines. In addition, we are also using another engine than QAT which uses the "async engine mode", and we didn't notice any abnormal memory consumption. So the issue seems related to the openssl ".so" used for the engine. As Willy told you, you should try a different QAT version.

R,
Emeric
Re: [PATCH] dns: Handle SRV record weights correctly
On Tue, Jan 09, 2018 at 03:39:29PM +0100, Olivier Houchard wrote:
> Updated patch attached.

Cool, now applied, thanks!

Willy
Re: [PATCH] dns: Handle SRV record weights correctly
Hi,

On Tue, Jan 09, 2018 at 03:28:22PM +0100, Olivier Houchard wrote:
> Hi Willy,
>
> On Tue, Jan 09, 2018 at 03:17:24PM +0100, Willy Tarreau wrote:
> > Hi Olivier,
> >
> > On Mon, Jan 08, 2018 at 04:35:35PM +0100, Olivier Houchard wrote:
> > > Hi,
> > >
> > > The attached patch attempts to map SRV record weight to haproxy weight
> > > correctly. SRV weight goes from 0 to 65536 while haproxy uses 0 to 256,
> > > so we have to divide it by 256, and a SRV weight of 0 doesn't mean the
> > > server shouldn't be used, so we use a minimum weight of 1.
> >
> > From what I'm seeing in the code, it's 0..65535 for the SRV record. And
> > that allows us to simplify it and use the full range of the weight like
> > this:
> >
> >    hap_weight = srv_weight / 256 + 1;
> >
> > => 0..255 return 1
> >    256..511 return 2
> >    ...
> >    65280..65535 return 256
> >
> > What do you think?
> >
> > Willy
>
> Sure, sounds good, for some reason I thought the max was 255, but it's
> actually 256, one day I'll learn how to read C.

Updated patch attached.

Regards,
Olivier

>From 7043d8da312605b2876286c95a2c0c53d6bd43e5 Mon Sep 17 00:00:00 2001
From: Olivier Houchard
Date: Mon, 8 Jan 2018 16:28:57 +0100
Subject: [PATCH] MINOR: dns: Handle SRV record weight correctly.

A SRV record weight can range from 0 to 65535, while haproxy weight
goes from 0 to 255, so we have to divide it by 256 before handing it
to haproxy. Also, a SRV record with a weight of 0 doesn't mean the
server shouldn't be used, so use a minimum weight of 1.

This should probably be backported to 1.8.
---
 include/types/dns.h |  2 +-
 src/dns.c           | 19 +++++++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/types/dns.h b/include/types/dns.h
index b1f068a61..9b1d08df7 100644
--- a/include/types/dns.h
+++ b/include/types/dns.h
@@ -143,7 +143,7 @@ struct dns_answer_item {
 	int16_t  class;          /* query class */
 	int32_t  ttl;            /* response TTL */
 	int16_t  priority;       /* SRV type priority */
-	int16_t  weight;         /* SRV type weight */
+	uint16_t weight;         /* SRV type weight */
 	int16_t  port;           /* SRV type port */
 	int16_t  data_len;       /* number of bytes in target below */
 	struct sockaddr address; /* IPv4 or IPv6, network format */
diff --git a/src/dns.c b/src/dns.c
index fceef2e48..a957710ed 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -522,10 +522,16 @@ static void dns_check_dns_response(struct dns_resolution *res)
 			if (srv->srvrq == srvrq && srv->svc_port == item->port &&
 			    item->data_len == srv->hostname_dn_len &&
 			    !memcmp(srv->hostname_dn, item->target, item->data_len)) {
+				int ha_weight;
+
+				/* Make sure weight is at least 1, so
+				 * that the server will be used.
+				 */
+				ha_weight = item->weight / 256 + 1;
-				if (srv->uweight != item->weight) {
+				if (srv->uweight != ha_weight) {
 					char weight[9];
 
-					snprintf(weight, sizeof(weight), "%d", item->weight);
+					snprintf(weight, sizeof(weight), "%d", ha_weight);
 					server_parse_weight_change_request(srv, weight);
 				}
 				HA_SPIN_UNLOCK(SERVER_LOCK, &srv->lock);
@@ -547,6 +553,7 @@ static void dns_check_dns_response(struct dns_resolution *res)
 			if (srv) {
 				const char *msg = NULL;
 				char weight[9];
+				int ha_weight;
 				char hostname[DNS_MAX_NAME_SIZE];
 
 				if (dns_dn_label_to_str(item->target, item->data_len+1,
@@ -563,7 +570,13 @@ static void dns_check_dns_response(struct dns_resolution *res)
 				if ((srv->check.state & CHK_ST_CONFIGURED) &&
 				    !(srv->flags & SRV_F_CHECKPORT))
 					srv->check.port = item->port;
-				snprintf(weight, sizeof(weight), "%d", item->weight);
+
+				/* Make sure weight is at least 1, so
+				 * that the server will be used.
+				 */
+				ha_weight = item->weight / 256
Re: [PATCH] dns: Handle SRV record weights correctly
On Tue, Jan 09, 2018 at 03:28:22PM +0100, Olivier Houchard wrote:
> Hi Willy,
>
> On Tue, Jan 09, 2018 at 03:17:24PM +0100, Willy Tarreau wrote:
> > Hi Olivier,
> >
> > On Mon, Jan 08, 2018 at 04:35:35PM +0100, Olivier Houchard wrote:
> > > Hi,
> > >
> > > The attached patch attempts to map SRV record weight to haproxy weight
> > > correctly. SRV weight goes from 0 to 65536 while haproxy uses 0 to 256,
> > > so we have to divide it by 256, and a SRV weight of 0 doesn't mean the
> > > server shouldn't be used, so we use a minimum weight of 1.
> >
> > From what I'm seeing in the code, it's 0..65535 for the SRV record. And
> > that allows us to simplify it and use the full range of the weight like
> > this:
> >
> >    hap_weight = srv_weight / 256 + 1;
> >
> > => 0..255 return 1
> >    256..511 return 2
> >    ...
> >    65280..65535 return 256
> >
> > What do you think?
> >
> > Willy
>
> Sure, sounds good, for some reason I thought the max was 255, but it's
> actually 256, one day I'll learn how to read C.

I can only recommend you this one, which I hope will also help me write
fewer bugs in haproxy once I finish it:

http://www.c-for-dummies.com/

Willy
Re: [PATCH] dns: Handle SRV record weights correctly
Hi Willy,

On Tue, Jan 09, 2018 at 03:17:24PM +0100, Willy Tarreau wrote:
> Hi Olivier,
>
> On Mon, Jan 08, 2018 at 04:35:35PM +0100, Olivier Houchard wrote:
> > Hi,
> >
> > The attached patch attempts to map SRV record weight to haproxy weight
> > correctly. SRV weight goes from 0 to 65536 while haproxy uses 0 to 256,
> > so we have to divide it by 256, and a SRV weight of 0 doesn't mean the
> > server shouldn't be used, so we use a minimum weight of 1.
>
> From what I'm seeing in the code, it's 0..65535 for the SRV record. And
> that allows us to simplify it and use the full range of the weight like
> this:
>
>    hap_weight = srv_weight / 256 + 1;
>
> => 0..255 return 1
>    256..511 return 2
>    ...
>    65280..65535 return 256
>
> What do you think?
>
> Willy

Sure, sounds good, for some reason I thought the max was 255, but it's
actually 256, one day I'll learn how to read C.

Regards,
Olivier
Re: haproxy+QAT memory usage very high under busy traffic
Hi Julian,

On Tue, Jan 09, 2018 at 08:50:48AM, Julian Zhu wrote:
> We are testing haproxy+QAT card (Intel QuickAssist Technology) and find that
> the memory usage of haproxy+QAT is much higher than that of haproxy alone.
> Under the same traffic (about 120K connections), haproxy alone only takes 6G
> memory while haproxy+QAT takes 36G.
> The only difference in config is as below for haproxy+QAT:
> ..
> ssl-engine qat
> ssl-mode-async
> ..
> and the memory is not released even if we terminate all haproxy processes,
> until we reboot the server.

Then this indicates that the leak (if any, but it sounds like this) is in
the engine. You'd need to unload and reload the modules to see if it's a
real leak or just some extreme memory usage.

> OS:
> Centos-7
>
> SW version:
> haproxy-1.8.3
> openssl-1.1.0e
> QAT_Engine: 0.5.20
> QAT1.7.Upstream.L.1.0.3_42
>
> Does anyone encounter the same issue? Is there any way to debug or fix it?

Unfortunately at this point it will be entirely dependent on QAT, and I
don't really know what can be done there. Maybe there's a debug mode. You
should possibly check with other versions of the engine (higher or lower).
If you're still experiencing the leak with an up to date version, you
should contact your Intel support so that they can take a look at this.

It's possible that haproxy triggers the leak (i.e. does something not cool
triggering an unusual code path leading to a leak in the engine), so they
may not be aware of it yet.

Hoping this helps,
Willy
Re: [PATCH 1/2] BUG/MINOR: lua: Fix default value for pattern in Socket.receive
Hi guys,

On Mon, Jan 08, 2018 at 11:35:47AM +0100, Thierry Fournier wrote:
> Hi Tim,
>
> Thanks for the patch. Good catch!
> Willy, you can apply it.

OK thanks for the review, all 4 patches applied now.

Willy
Re: [PATCH] dns: Handle SRV record weights correctly
Hi Olivier,

On Mon, Jan 08, 2018 at 04:35:35PM +0100, Olivier Houchard wrote:
> Hi,
>
> The attached patch attempts to map SRV record weight to haproxy weight
> correctly. SRV weight goes from 0 to 65536 while haproxy uses 0 to 256,
> so we have to divide it by 256, and a SRV weight of 0 doesn't mean the
> server shouldn't be used, so we use a minimum weight of 1.

From what I'm seeing in the code, it's 0..65535 for the SRV record. And
that allows us to simplify it and use the full range of the weight like
this:

   hap_weight = srv_weight / 256 + 1;

=> 0..255 return 1
   256..511 return 2
   ...
   65280..65535 return 256

What do you think?

Willy
Re: 1.8.3 dns resolver ipv4/ipv6 undesirable behaviour
Marc Fournier writes:
> Simply adding "resolve-prefer ipv4" makes the symptom go away, so no big
> deal. But I wanted to point this out, as it might bite others, and I'm
> pretty sure 1.7.x didn't have this issue.

It turns out that "resolve-prefer ipv4" wasn't what fixed my problem after
all. It was reordering the options on the server-template line which made
the difference. So basically changing:

server-template tutum 4 hello-world.tutum.rancher.internal:80 resolvers rancher check

to:

server-template tutum 4 hello-world.tutum.rancher.internal:80 check resolvers rancher

With this change, tcpdump shows DNS queries/responses every 10s. Previously
queries seemed to be sent only once, at startup/reload.

If that's the expected behaviour, this statement in the documentation is
misleading:

    The "server" and "default-server" keywords support a certain number of
    settings which are all passed as arguments on the server line. The order
    in which those arguments appear does not count, and they are all optional.

nb: this problem seems to also happen with standard "server" definitions,
not only with the new "server-template" one.

Marc
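For readers hitting the same symptom, a sketch of the working shape (the resolvers section and nameserver address are hypothetical — only the "check resolvers rancher" ordering on the server-template line comes from the report above):

```
resolvers rancher
    nameserver ns1 192.0.2.53:53
    hold valid 10s

backend be_hello
    server-template tutum 4 hello-world.tutum.rancher.internal:80 check resolvers rancher resolve-prefer ipv4
```

The symptom to watch for: with the options in the other order, tcpdump shows DNS queries only at startup/reload instead of every resolution interval.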
Re: cannot bind socket - Need help with config file
Hello Imam,

On Tue, Jan 9, 2018 at 2:30 AM, Imam Toufique wrote:
>
> Hi Jonathan, and Lucas,
>
> Thanks for your replies. With your help, I was able to get it to work
> partially.

Please always CC the mailing list though.

> frontend main *:2200
>    #bind *:22
>    default_backend sftp
>    timeout client 1h

While this works, it's causing a lot of confusion. Please do follow my
advice and DON'T specify the port in the frontend/listen line. Use the
bind directive instead. So in this case:

> frontend main
>    bind :2200
>    default_backend sftp
>    timeout client 1h

It's much more readable like this.

> listen stats
> #bind *:22

You disabled your stats section with this configuration. Either decide
for a port, or remove it if you don't need it.

> But haproxy starts and I was able to get ssh to one of the servers. Now I
> have a different problem where I get an ssh key fingerprint error warning
> and my connection drops.
>
> I get the error below:
>
> [vagrant@db ~]$ ssh file -p 2200
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> Someone could be eavesdropping on you right now (man-in-the-middle attack)!
> It is also possible that a host key has just been changed.
> The fingerprint for the RSA key sent by the remote host is
> SHA256:MHkXThp4cSltDn0/mRsq7Se+qcDz6cz1dD+kCiyE9e0.
> Please contact your system administrator.
> Add correct host key in /home/vagrant/.ssh/known_hosts to get rid of this
> message.
> Offending ECDSA key in /home/vagrant/.ssh/known_hosts:4
> RSA host key for [file]:2200 has changed and you have requested strict
> checking.
> Host key verification failed
>
> It looks like host keys are changing, and the host key becomes unknown to
> both servers that are behind HAProxy. What do you recommend doing in a case
> like this?

That's what happens when you load-balance between 2 different SSH
servers with a different private key. What is it that you want to
achieve in the first place?

cheers,
lukas
haproxy+QAT memory usage very high under busy traffic
We are testing haproxy+QAT card (Intel QuickAssist Technology) and find that the memory usage of haproxy+QAT is much higher than that of haproxy alone. Under the same traffic (about 120K connections), haproxy alone only takes 6G memory while haproxy+QAT takes 36G.

The only difference in config for haproxy+QAT is:

..
ssl-engine qat
ssl-mode-async
..

and the memory is not released even if we terminate all haproxy processes, until we reboot the server.

OS:
Centos-7

SW version:
haproxy-1.8.3
openssl-1.1.0e
QAT_Engine: 0.5.20
QAT1.7.Upstream.L.1.0.3_42

Does anyone encounter the same issue? Is there any way to debug or fix it?

Best Regards,
Julian