Re: Bug: DNS changes in 1.7.3+ break UNIX socket stats in daemon mode with resolvers on FreeBSD
On Fri, May 12, 2017 at 08:58:56AM +0200, Lukas Tribus wrote: > Hi, > > > Am 11.05.2017 um 21:13 schrieb Jim Pingle: > > On 05/11/2017 01:58 PM, Frederic Lecaille wrote: > >> I have reproduced (at home) the stats socket issue within a FreeBSD 9.3 VM. > >> > >> Replacing your call to close() by fd_delete() which removes the fd from > >> the fd set used by kevent *and close it* seems to fix at least the stats > >> socket issue. I do not know if there are remaining ones. > >> > >> I did not reproduced the kevent issue revealed by Lukas traces. But I > >> had other ones : ERR#57 'Socket is not connected' during sendto(). > >> > >> I attached a temporary patch to be validated and to let you perhaps > >> provide a better one as I have not double check everything. > > Fred, > > > > That seems to have fixed the problem for me. With that patch applied, > > web traffic passes and the UNIX socket responds. > > Confirmed, works for me too. Baptiste? Willy? Is this an acceptable fix? Yes definitely, not only an acceptable one, but the right fix. I understand why it happens to work on linux, by default close() unregisters FDs from epoll so it passed below the radar. I'm expecting to spend the day to dig through the ton of pending patches and fixes, so if the queue is not too long, I should reach this one as well today :-) Cheers, Willy
Re: Reloading maps?
On Thu, May 11, 2017 at 04:23:14PM -0700, James Brown wrote: > Is there any good way to reload a map, short of either (a) reloading > haproxy every time the map changes, or (b) feeding the entire map into the > control socket as a series of `set map` statements? > > I've got a map generated by an external program; we're currently doing (b) > and it feels a little fragile... We could possibly imagine implementing a "bulk import" mode on the CLI to address this. We could imagine bringing atomicity this way. There are also alternatives consisting in periodically retrieving them from a URL, as implemented in the enterprise version, but we don't have this here. Willy
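Pending such a bulk-import mode, approach (b) can at least be made less fragile by batching: the CLI accepts several commands in one request when separated by semicolons, so the whole map can be pushed in a single socket write. A rough sketch (the map name and file format are assumptions, not taken from James' setup):

```python
# Hypothetical helper: turn a haproxy map file into one semicolon-separated
# batch of "set map" commands, to be sent to the stats socket in a single
# write (e.g. piped through socat). This narrows, but does not remove, the
# window during which the map is only partially updated.

def map_to_cli_batch(map_text, map_name):
    cmds = []
    for line in map_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # haproxy map files allow blank lines and comments
        key, _, value = line.partition(" ")
        cmds.append("set map %s %s %s" % (map_name, key, value.strip()))
    return ";".join(cmds)

batch = map_to_cli_batch("example.com /new\n# comment\nold.example.com /legacy\n",
                         "/etc/haproxy/redirects.map")
print(batch)
```

The batch would then be delivered with something like `echo "$batch" | socat stdio /run/haproxy.sock`. Note that `set map` only updates existing entries; new keys still need `add map`, so a full replacement also involves `clear map`.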
Re: Bug: DNS changes in 1.7.3+ break UNIX socket stats in daemon mode with resolvers on FreeBSD
On 05/12/2017 09:37 AM, Willy Tarreau wrote: On Fri, May 12, 2017 at 08:58:56AM +0200, Lukas Tribus wrote: Hi, Am 11.05.2017 um 21:13 schrieb Jim Pingle: On 05/11/2017 01:58 PM, Frederic Lecaille wrote: I have reproduced (at home) the stats socket issue within a FreeBSD 9.3 VM. Replacing your call to close() by fd_delete() which removes the fd from the fd set used by kevent *and close it* seems to fix at least the stats socket issue. I do not know if there are remaining ones. I did not reproduced the kevent issue revealed by Lukas traces. But I had other ones : ERR#57 'Socket is not connected' during sendto(). I attached a temporary patch to be validated and to let you perhaps provide a better one as I have not double check everything. Fred, That seems to have fixed the problem for me. With that patch applied, web traffic passes and the UNIX socket responds. Confirmed, works for me too. Baptiste? Willy? Is this an acceptable fix? Yes definitely, not only an acceptable one, but the right fix. I understand why it happens to work on linux, by default close() unregisters FDs from epoll so it passed below the radar. Ok so Willy I will send a well-formed patch asap.
Re: haproxy not creating stick-table entries fast enough
On Tue, May 09, 2017 at 09:43:22PM -0700, redundantl y wrote:
> For example, I have tried with the latest versions of Firefox, Safari, and
> Chrome. With 30 elements on the page being loaded from the server they're
> all being loaded within 70ms of each other, the first 5 or so happening on
> the same millisecond. I'm seeing similar behaviour, being sent to
> alternating backend servers until it "settles" and sticks to just one.

That's only true after the browser starts to retrieve the main page, which gives it the indication that it needs to request such objects. You *always* have a first request before all other ones. The browser cannot guess it will have to retrieve many objects out of nowhere.

The principle of stickiness is to ensure that subsequent requests will go to the same server that served the previous ones. The main goal is to ensure that all requests carrying a session cookie will end up on the server which holds this session. Here, as Lukas explained, you're simulating a browser sending many totally independent requests in parallel. There's no reason (nor any way) that any equipment in the chain would guess they are related, since they could arrive in any order, and even end up on multiple nodes.

If despite this that's what you need (for a very obscure reason), then you'd rather use hashing for this. It will ensure that the same distribution algorithm is applied to all these requests regardless of their ordering. But let me tell you that it still makes me feel like you're trying to address the wrong problem.

Also, most people prefer not to apply stickiness for static objects so that they can be retrieved in parallel from all static servers instead of all hammering the same server. It might possibly not be your case based on your explanation, but this is what people usually do for a better user experience.

In conclusion, your expected use case still seems quite obscure to me :-/

Willy
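The hashing behaviour Willy suggests can be illustrated outside haproxy: a deterministic hash of some request property maps every request to the same server with no shared state and no dependency on request ordering. A toy sketch (server names invented; in haproxy this is roughly what `balance source` or `balance uri` with `hash-type consistent` does):

```python
import hashlib

SERVERS = ["web1", "web2", "web3"]  # hypothetical backend servers

def pick_server(src_ip, servers=SERVERS):
    # Hash-based mapping: the same input always lands on the same server,
    # with no stick-table entry to create first.
    h = int(hashlib.sha256(src_ip.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

# Thirty "parallel" requests from the same client all agree, even though
# no request had to arrive "first" to establish stickiness:
choices = {pick_server("203.0.113.7") for _ in range(30)}
print(len(choices))  # 1
```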
Re: Bug: DNS changes in 1.7.3+ break UNIX socket stats in daemon mode with resolvers on FreeBSD
On Fri, May 12, 2017 at 09:48:56AM +0200, Frederic Lecaille wrote: > On 05/12/2017 09:37 AM, Willy Tarreau wrote: > > On Fri, May 12, 2017 at 08:58:56AM +0200, Lukas Tribus wrote: > > > Hi, > > > > > > > > > Am 11.05.2017 um 21:13 schrieb Jim Pingle: > > > > On 05/11/2017 01:58 PM, Frederic Lecaille wrote: > > > > > I have reproduced (at home) the stats socket issue within a FreeBSD > > > > > 9.3 VM. > > > > > > > > > > Replacing your call to close() by fd_delete() which removes the fd > > > > > from > > > > > the fd set used by kevent *and close it* seems to fix at least the > > > > > stats > > > > > socket issue. I do not know if there are remaining ones. > > > > > > > > > > I did not reproduced the kevent issue revealed by Lukas traces. But I > > > > > had other ones : ERR#57 'Socket is not connected' during sendto(). > > > > > > > > > > I attached a temporary patch to be validated and to let you perhaps > > > > > provide a better one as I have not double check everything. > > > > Fred, > > > > > > > > That seems to have fixed the problem for me. With that patch applied, > > > > web traffic passes and the UNIX socket responds. > > > > > > Confirmed, works for me too. Baptiste? Willy? Is this an acceptable fix? > > > > Yes definitely, not only an acceptable one, but the right fix. I understand > > why it happens to work on linux, by default close() unregisters FDs from > > epoll so it passed below the radar. > > Ok so Willy I will send a well-formed patch asap. Thanks Fred! Willy
Re: Bug: DNS changes in 1.7.3+ break UNIX socket stats in daemon mode with resolvers on FreeBSD
On 05/12/2017 09:52 AM, Willy Tarreau wrote:
> On Fri, May 12, 2017 at 09:48:56AM +0200, Frederic Lecaille wrote:
> > [...]
> > Ok so Willy I will send a well-formed patch asap.
>
> Thanks Fred!
>
> Willy

Here is a better-formed patch. Feel free to amend the commit message if it is not clear enough ;)

Regards,
Fred.

From e6c4a93bbc8838046ab9737bbd5d4be075a72393 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20L=C3=A9caille?=
Date: Fri, 12 May 2017 09:57:15 +0200
Subject: [PATCH] BUG/MAJOR: dns: Broken kqueue events handling (BSD systems).

Some DNS related network sockets were closed without unregistering their
file descriptors from their underlying kqueue event sets. This patch
replaces calls to close() by fd_delete() calls so as to delete such
events attached to DNS network sockets from the kqueue before closing
the sockets.
---
 src/dns.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/dns.c b/src/dns.c
index a118598..cb0a9a9 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -1004,7 +1004,7 @@ int dns_init_resolvers(int close_socket)
 		if (close_socket == 1) {
 			if (curnameserver->dgram) {
-				close(curnameserver->dgram->t.sock.fd);
+				fd_delete(curnameserver->dgram->t.sock.fd);
 				memset(curnameserver->dgram, '\0', sizeof(*dgram));
 				dgram = curnameserver->dgram;
 			}
-- 
2.1.4
Re: haproxy + RDP
On 11/05/17 at 15:06, Aleksandar Lazic wrote:
> .../
> How about to activate the 'option tcp-check' as mentioned in the
> Warning?
> In the config below is it's commented, any reason why?
>
> It's also active in the doc which you maybe know.
>
> https://www.haproxy.com/doc/aloha/7.0/deployment_guides/microsoft_remote_desktop_services.html
>
> Does this changes anything?

OK, cleaning up a little, I tried:

frontend RDP
    mode tcp
    bind *:3389
    timeout client 1h
    tcp-request inspect-delay 5s
    tcp-request content accept if RDP_COOKIE
    default_backend bk_rdp
#
backend bk_rdp
    mode tcp
    balance leastconn
    #balance rdp_cookie
    timeout server 1h
    timeout connect 4s
    log global
    option tcplog
    stick-table type string len 32 size 10k expire 1h peers pares
    stick on rdp_cookie(msthash)
    # persist rdp-cookie
    option tcp-check
    # option ssl-hello-chk
    # option tcpka
    tcp-check connect port 3389 ssl
    # server gr43sterminal01 10.104.22.142:3389 weight 1 check verify none inter 2000 rise 2 fall 3
    # server gr43sterminal02 10.104.23.141:3389 weight 1 check verify none inter 2000 rise 2 fall 3
    # default-server inter 3s rise 2 fall 3
    server gr43sterminal01 10.104.22.142:3389 weight 1 check
    server gr43sterminal02 10.104.23.141:3389 weight 1 check

And I got:

[ALERT] 131/100222 (8564) : Proxy 'bk_rdp', server 'gr43sterminal01' [/etc/haproxy/haproxy.cfg:189] verify is enabled by default but no CA file specified. If you're running on a LAN where you're certain to trust the server's certificate, please set an explicit 'verify none' statement on the 'server' line, or use 'ssl-server-verify none' in the global section to disable server-side verifications by default.
[ALERT] 131/100222 (8564) : Proxy 'bk_rdp', server 'gr43sterminal02' [/etc/haproxy/haproxy.cfg:190] verify is enabled by default but no CA file specified. If you're running on a LAN where you're certain to trust the server's certificate, please set an explicit 'verify none' statement on the 'server' line, or use 'ssl-server-verify none' in the global section to disable server-side verifications by default.
[ALERT] 131/100222 (8564) : Fatal errors found in configuration.

So I tried adding verify none on the server line, and haproxy sees both servers as up (but one is down).

I tried without ssl:

    tcp-check connect port 3389
    server gr43sterminal01 10.104.22.142:3389 weight 1 check
    server gr43sterminal02 10.104.23.141:3389 weight 1 check

but the result is the same: haproxy sees both servers as up (but one is down).

Only if I leave just "option tcp-check" (or nothing) does it seem to work:

    # persist rdp-cookie
    option tcp-check
    # option ssl-hello-chk
    # option tcpka
    # tcp-check connect port 3389 ssl
    # tcp-check connect port 3389
    # server gr43sterminal01 10.104.22.142:3389 weight 1 check verify none inter 2000 rise 2 fall 3
    # server gr43sterminal02 10.104.23.141:3389 weight 1 check verify none inter 2000 rise 2 fall 3
    # default-server inter 3s rise 2 fall 3
    server gr43sterminal01 10.104.22.142:3389 weight 1 check
    server gr43sterminal02 10.104.23.141:3389 weight 1 check

Output:

[WARNING] 131/102105 (8773) : Server bk_rdp/gr43sterminal01 is DOWN, reason: Layer4 timeout, info: " at initial connection step of tcp-check", check duration: 3001ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

-- 
*Antonio Trujillo Carmona*
*Técnico de redes y sistemas.*
*Subdirección de Tecnologías de la Información y Comunicaciones*
Servicio Andaluz de Salud. Consejería de Salud de la Junta de Andalucía
_antonio.trujillo.sspa@juntadeandalucia.es_
Tel. +34 670947670 (747670)
Re: haproxy + RDP
Hi Antonio Trujillo Carmona. Antonio Trujillo Carmona have written on Fri, 12 May 2017 10:23:59 +0200: > El 11/05/17 a las 15:06, Aleksandar Lazic escribió: > > .../ > > How about to activate the 'option tcp-check' as mentioned in the > > Warning? > > In the config below is it's commented, any reason why? > > > > It's also active in the doc which you maybe know. > > > > https://www.haproxy.com/doc/aloha/7.0/deployment_guides/microsoft_remote_desktop_services.html > > > > Does this changes anything? > ok cleaing up a liter I try: > frontend RDP > mode tcp > bind *:3389 > timeout client 1h > tcp-request inspect-delay 5s > tcp-request content accept if RDP_COOKIE > default_backend bk_rdp > # > backend bk_rdp > mode tcp > balance leastconn > #balance rdp_coockie > timeout server 1h > timeout connect 4s > log global > option tcplog > stick-table type string len 32 size 10k expire 1h peers pares > stick on rdp_cookie(msthash) > # persist rdp-cookie > option tcp-check > # option ssl-hello-chk > # option tcpka > tcp-check connect port 3389 ssl > > # server gr43sterminal01 10.104.22.142:3389 weight 1 check > verify none inter 2000 rise 2 fall 3 > # server gr43sterminal02 10.104.23.141:3389 weight 1 check > verify none inter 2000 rise 2 fall 3 > # > default-server inter 3s rise 2 fall 3 > server gr43sterminal01 10.104.22.142:3389 weight 1 check > server gr43sterminal02 10.104.23.141:3389 weight 1 check > > And I got: > [ALERT] 131/100222 (8564) : Proxy 'bk_rdp', server 'gr43sterminal01' > [/etc/haproxy/haproxy.cfg:189] verify is enabled by default but no CA > file specified. If you're running on a LAN where you're certain to > trust the server's certificate, please set an explicit 'verify none' > statement on the 'server' line, or use 'ssl-server-verify none' in > the global section to disable server-side verifications by default. 
> [ALERT] 131/100222 (8564) : Proxy 'bk_rdp', server 'gr43sterminal02' > [/etc/haproxy/haproxy.cfg:190] verify is enabled by default but no CA > file specified. If you're running on a LAN where you're certain to > trust the server's certificate, please set an explicit 'verify none' > statement on the 'server' line, or use 'ssl-server-verify none' in > the global section to disable server-side verifications by default. > [ALERT] 131/100222 (8564) : Fatal errors found in configuration. > > So I try adding verify none in server line > > and haproxy see both server up (but one is down). > I try withou ssl: > > tcp-check connect port 3389 > server gr43sterminal01 10.104.22.142:3389 weight 1 check > server gr43sterminal02 10.104.23.141:3389 weight 1 check > > but the result is the same haproxy see both server up (but one is > down) > > only if I leve only option tcp-check (or none) it seem work > > > # > # persist rdp-cookie > option tcp-check > # option ssl-hello-chk > # option tcpka > # tcp-check connect port 3389 ssl > # tcp-check connect port 3389 > > # server gr43sterminal01 10.104.22.142:3389 weight 1 check > verify none inter 2000 rise 2 fall 3 > # server gr43sterminal02 10.104.23.141:3389 weight 1 check > verify none inter 2000 rise 2 fall 3 > # > default-server inter 3s rise 2 fall 3 > server gr43sterminal01 10.104.22.142:3389 weight 1 check > server gr43sterminal02 10.104.23.141:3389 weight 1 check > ## > > > output: > > [WARNING] 131/102105 (8773) : Server bk_rdp/gr43sterminal01 is DOWN, > reason: Layer4 timeout, info: " at initial connection step of > tcp-check", check duration: 3001ms. 1 active and 0 backup servers > left. 0 sessions active, 0 requeued, 0 remaining in queue. So finally it works. Regards Aleks
[PATCH] MINOR: ssl: support ssl-min-ver and ssl-max-ver with crt-list
Hi,

This patch depends on the "[Patches] TLS methods configuration reworked" series. Currently it will only work with BoringSSL, because haproxy uses a special ssl_sock_switchctx_cbk with a BoringSSL callback to select the certificate before any handshake negotiation. This feature (and others depending on this ssl_sock_switchctx_cbk) could work with OpenSSL 1.1.1 and the new callback https://www.openssl.org/docs/manmaster/man3/SSL_CTX_set_early_cb.html.

++
Manu

0001-MINOR-ssl-support-ssl-min-ver-and-ssl-max-ver-with-c.patch
Description: Binary data
Re: [Patches] TLS methods configuration reworked
Hi guys, On Tue, May 09, 2017 at 11:21:36AM +0200, Emeric Brun wrote: > It seems to do what we want, so we can merge it. So the good news is that this patch set now got merged :-) Thanks for your time and efforts back-and-forth on this one! Willy
Re: [PATCH v3] MINOR: ssl: add prefer-client-ciphers
On Thu, May 04, 2017 at 03:45:40PM +0000, Lukas Tribus wrote: > Currently we unconditionally set SSL_OP_CIPHER_SERVER_PREFERENCE [1], > which may not always be a good thing. (...) Now merged, thank you Lukas! Willy
Re: [PATCH]: CLEANUP/MINOR: retire obsoleted USE_GETSOCKNAME build option
On Thu, May 11, 2017 at 01:04:50PM +0300, Dmitry Sivachenko wrote: > Hello, > > this is a patch to nuke obsoleted USE_GETSOCKNAME build option. Applied, thanks Dmitry. BTW, your attached patch was strangely missing a header so I rewrote the commit message since this one was not too hard to guess. Willy
Re: Bug: DNS changes in 1.7.3+ break UNIX socket stats in daemon mode with resolvers on FreeBSD
On Fri, May 12, 2017 at 10:20:56AM +0200, Frederic Lecaille wrote: > Here is a more well-formed patch. > Feel free to amend the commit message if not enough clear ;) It was clear enough, thanks. I added the mention of the faulty commit, that helps tracking backports and credited Jim and Lukas for the investigations. Thanks, Willy
Re: Quick (hopefully) question about clearing stick table entry
Hi Franks, On Wed, May 10, 2017 at 10:29:08AM +, Franks Andy (IT Technical Architecture Manager) wrote: > Hi all, > Is there a way to clear a stick table entry (using socat obviously) by > referring to the individual 'reference' id given at the beginning of the > entry, e.g. "0x7faef417d3ec" ? > Looking at the manual it seems the clearing function is based on key (ip in > my case) or data field - server id etc. I could use the key, but I'm not sure > this will always be individual - I may not always use "stick on src". > Maybe I'm confused :) and IP key IS the best. It's not possible to kill by reference like this. However it would be a bad idea since the entry could be purged and reassigned while you're doing it, resulting in your operation to kill the wrong one. Killing by key remains better as it provides a form of atomicity in the operation. willy
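For reference, killing by key from the CLI looks like this (the table name and address are examples; the key must match the table's declared type):

```
echo "clear table bk_web key 192.0.2.42" | socat stdio /var/run/haproxy.sock
```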
Re: Failed to compile haproxy with lua on Solaris 10
Hi Benoît, On Thu, May 04, 2017 at 08:50:33AM +0200, Benoît GARNIER wrote: (...) > If you do the following operation : time_t => localtime() => struct tm > => timegm() => time_t, your result will be shift by the timezone time > offset (but without any DST applied). > > Technically, if you live in Great Britain, the operation will succeed > during winter (but will offset the result by 1 hour during summer, since > DST is applied here). So in short you're saying that we should always use mktime() instead ? Willy
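Benoît's point is easy to demonstrate in a few lines: pairing localtime() with timegm() shifts the result by the zone's UTC offset, while mktime() is the correct inverse of localtime(). A small sketch with the zone pinned so the offset is predictable (POSIX-only because of tzset()):

```python
import calendar
import os
import time

os.environ["TZ"] = "EST5"   # fixed zone, UTC-5, no DST rules
time.tzset()                # POSIX-only

t = 1_000_000_000
# Wrong pairing: localtime() produces local wall-clock fields, but
# timegm() interprets them as UTC, so the result is shifted by the
# zone's UTC offset.
shifted = calendar.timegm(time.localtime(t))
print(t - shifted)          # 18000 (5 hours)

# Correct pairing: mktime() expects local-time fields and round-trips.
print(int(time.mktime(time.localtime(t))))  # 1000000000
```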
Re: [PATCH] Add b64dec sample converter
Hi Holger, On Sat, May 06, 2017 at 02:08:29AM +0200, Holger Just wrote: > This patch against current master adds a new b64dec converter. It takes > a base64 encoded string and returns its decoded binary representation. > > This converter can be used to e.g. extract the username of a basic auth > header to add it to the log: > > acl BASIC_AUTH hdr_beg(Authorization) "Basic " > http-request capture hdr(Authorization),regsub(^Basic\ ,),b64dec if > BASIC_AUTH It's so obvious it doesn't even need a justification indeed! I even thought we already had it! > I'm open for suggestions for a better name for the converter. > base64_decode might work but doesn't suit the code formatting well and > is pretty long... I didn't find a better one either. > As a note to reviewers: please be aware that I'm not a C programmer at > all and I am way outside of my comfort zone here. As such, this function > might have unhandled edge-cases. I tried to model it according to the > existing base64 converter and my understanding of how the converters are > supposed to work but might have missed something. Thanks for the warning, much appreciated. It made me re-read it after applying it. But your code is fine, no problem detected! So you're becoming a C programmer ;-) > Once verified, I think this converter can be safely added to the > supported stable versions of HAProxy. Yes I think it can make sense to backport it at least to 1.7, it can help sometimes. Thanks! Willy
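For anyone wanting to check what the converter chain does to a Basic auth header, here is the same transformation in a few lines of Python (a sketch, not haproxy's implementation; the header value is just an example):

```python
import base64

def basic_auth_username(authorization_header):
    # Mirrors: regsub(^Basic\ ,),b64dec, then keeping the part before
    # the first ':' of the decoded "user:password" pair.
    prefix = "Basic "
    if not authorization_header.startswith(prefix):
        return None
    decoded = base64.b64decode(authorization_header[len(prefix):]).decode("utf-8")
    return decoded.split(":", 1)[0]

print(basic_auth_username("Basic YWxpY2U6czNjcmV0"))  # alice
```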
Re: Passing SNI value ( ssl_fc_sni ) to backend's verifyhost.
On Tue, May 09, 2017 at 12:12:42AM +0200, Lukas Tribus wrote: > Haproxy can verify the certificate of backend TLS servers since day 1. > > The only thing missing is client SNI based backend certificate > verification, which yes - since we can pass client SNI to the TLS server > - we need to consider for the certificate verification process as well. In fact the cert name is checked, it's just that it can only check against a constant in the configuration. I agree that it's a problem when using SNI. Furthermore it forces one to completely disable verifyhost in case SNI is used. I tend to think that the best approach would be to always enable it when SNI is involved in fact, because if SNI is used to the server, it really means we want to check what cert is provided. This could then possibly be explicitly turned off by the "verify none" directive. I have absolutely no idea how to do that however, I don't know if we can retrieve the previously configured SNI using openssl's API after the connection is established. Willy
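For context, this is the shape of today's configuration, where verifyhost can only check against one constant name even though the SNI forwarded with `sni` varies per request (addresses, names and paths below are examples):

```
backend be_secure
    # verifyhost takes a fixed string: it cannot follow req.hdr(host)
    server s1 192.0.2.10:443 ssl sni req.hdr(host) verify required ca-file /etc/ssl/certs/ca.pem verifyhost www.example.com
```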
Re: [RFC][PATCHES] seamless reload
Hi Pavlos, Olivier, On Mon, May 08, 2017 at 02:34:05PM +0200, Olivier Houchard wrote: > Hi Pavlos, > > On Sun, May 07, 2017 at 12:05:28AM +0200, Pavlos Parissis wrote: > [...] > > Ignore ignore what I wrote, I am an idiot I am an idiot as I forgot the most > > important bit of the test, to enable the seamless reload by suppling the > > HAPROXY_STATS_SOCKET environment variable:-( > > > > I added to the systemd overwrite file: > > [Service] > > Environment=CONFIG="/etc/lb_engine/haproxy.cfg" > > "HAPROXY_STATS_SOCKET=/run/lb_engine/process-1.sock" > > > > and wrk2 reports ZERO errors where with HAPEE reports ~49. > > > > I am terrible sorry for this stupid mistake. > > > > But, this mistake revealed something interesting. The fact that with the > > latest > > code we have more errors during reload. > > > > @Olivier, great work dude. I am waiting for this to be back-ported to > > HAPEE-1.7r1. > > > > Once again I am sorry for my mistake, > > Pavlos > > > > Thanks a lot for testing ! > This is interesting indeed. My patch may make it worse when not passing > fds via the unix socket, as all processes now keep all sockets opened, even > the one they're not using, maybe it make the window between the last > accept and the close bigger. That's very interesting indeed. In fact it's the window between the last accept and the *last* close, due to processes holding the socket while not being willing to do accept anything on it. > If that is so, then the global option "no-unused-socket" should provide > a comparable error rate. In fact William is currently working on the master-worker model to get rid of the systemd-wrapper and found some corner cases between this and your patchset. Nothing particularly difficult, just the fact that he'll need to pass the path to the previous socket to the new processes during reloads. 
During this investigation it was found that we'd need to be able to say that a process possibly has no stats socket and that the next one will not be able to retrieve the FDs. Such information cannot be passed from the command line since it's a consequence of the config parsing. Thus we thought it would make sense to have a per-socket option to say whether or not it would be usable for offering the listening file descriptors, just like we currently have an administrative level on them (I even seem to remember that Olivier first asked if we wouldn't need to do this). And suddenly a few benefits appear when doing this :
  - security freaks not willing to expose FDs over the socket would simply not enable them ;
  - we could restrict the number of processes susceptible of exposing the FDs simply by playing with the "process" directive on the socket ; that could also save some system-wide FDs ;
  - the master process could reliably find the socket's path in the conf (the first one with this new directive enabled), even if it's changed between reloads ;
  - in the default case (no specific option) we wouldn't change the existing behaviour so it would not make existing reloads worse.

Pavlos, regarding the backport to your beloved version, that's planned, but as you can see, while the main technical issues have already been sorted out, there will still be a few small integration-specific changes to come, which is why for now it's still on hold until all these details are sorted out once for all.

Best regards,
Willy
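To make the idea concrete, such a per-socket directive could look like the sketch below. The `expose-fd listeners` keyword shown here is the name this option eventually took in later haproxy versions, so treat it as indicative rather than as syntax available at the time of this thread:

```
global
    # Only this socket may pass the listening FDs to the next process;
    # binding it to process 1 also limits how many processes hold them.
    stats socket /run/haproxy/admin.sock mode 600 level admin expose-fd listeners process 1
```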
[PATCH] Lua medium bugfix
Hi,

A patch fixing a medium bug is attached. The backport to 1.6 and 1.7 is easy: it doesn't generate conflicts.

If a Lua sample-fetch or converter doesn't return any value, an access outside the Lua stack can be performed. This patch checks the stack size before converting the top value to an HAProxy internal sample. A workaround consists in checking that a value is always returned by sample fetches and converters.

This patch should be backported to versions 1.6 and 1.7.

Thierry

From cad53b6e6e2a35202f8086d3239dc2f8891d8944 Mon Sep 17 00:00:00 2001
From: Thierry FOURNIER
Date: Fri, 12 May 2017 16:32:20 +0200
Subject: [PATCH] BUG/MEDIUM: lua: segfault if a converter or a sample doesn't return anything

If a Lua sample-fetch or converter doesn't return any value, an access
outside the Lua stack can be performed. This patch checks the stack size
before converting the top value to a HAProxy internal sample.

A workaround consists in checking that a value is always returned by
sample fetches and converters.

This patch should be backported to versions 1.6 and 1.7.
---
 src/hlua.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/hlua.c b/src/hlua.c
index 643d3fc..b8d2c88 100644
--- a/src/hlua.c
+++ b/src/hlua.c
@@ -5496,6 +5496,10 @@ static int hlua_sample_conv_wrapper(const struct arg *arg_p, struct sample *smp,
 	switch (hlua_ctx_resume(stream->hlua, 0)) {
 	/* finished. */
 	case HLUA_E_OK:
+		/* If the stack is empty, the function fails. */
+		if (lua_gettop(stream->hlua->T) <= 0)
+			return 0;
+
 		/* Convert the returned value in sample. */
 		hlua_lua2smp(stream->hlua->T, -1, smp);
 		lua_pop(stream->hlua->T, 1);
@@ -5617,6 +5621,10 @@ static int hlua_sample_fetch_wrapper(const struct arg *arg_p, struct sample *smp
 		stream_int_retnclose(&stream->si[0], &msg);
 		return 0;
 	}
+	/* If the stack is empty, the function fails. */
+	if (lua_gettop(stream->hlua->T) <= 0)
+		return 0;
+
 	/* Convert the returned value in sample. */
 	hlua_lua2smp(stream->hlua->T, -1, smp);
 	lua_pop(stream->hlua->T, 1);
-- 
1.7.10.4
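Until the fix is in place, the workaround mentioned in the commit message (always return a value from Lua sample fetches and converters) can be applied in the script itself. A hypothetical converter, loaded with lua-load:

```lua
-- Hypothetical converter illustrating the workaround: never fall
-- through without a return value, since an empty Lua stack is what
-- triggered the crash before this fix.
core.register_converters("upper_safe", function(value)
    if value == nil then
        return ""      -- always leave something on the stack
    end
    return string.upper(value)
end)
```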
Re: Automatic Certificate Switching Idea
Hi,

On Tue, May 09, 2017 at 07:04:01PM +0200, Daniel Schneller wrote:
> Hi!
>
> > On 9. May. 2017, at 00:30, Lukas Tribus wrote:
> >
> > [...]
> > I'm opposed to heavy feature-bloating for provisioning use-cases, that
> > can quite easily fixed where the fix belongs - the provisioning layer.
>
> You are right, that this can be handled outside / in the provisioning layer.
> And I have no problem implementing it there, if it is considered too narrow a
> niche feature. However, I was curious to see, if this is something that other
> people also need constantly -- sometimes you believe you are in a specific
> bubble, but aren't. But from the amount of feedback the original post
> generated, I think I know my answer already ;-)

In fact I'm less opposed than Lukas here given that I have no idea of the possible impacts nor complexity (but I don't want to have the complete MS Office suite merged in, just Word, Excel and PowerPoint :-)).

I'd tend to say that since we're progressively evolving in a more dynamic world where users want the ability to perform *some* updates without reloading, the day we realize that 90% of the haproxy reloads are only caused by cert updates, we need to think about a way to address this. I remember that Thierry started to look at how to feed a cert from the CLI but apparently it was everything but obvious.

Loading multiple certs could be nice in theory, but there are a few shortcomings to keep in mind :
  - for embedded users you don't want haproxy's date check to become strict because it's frequent that such devices have a totally wrong date. Or at least you want to ensure that you always keep the most recent cert and never kill any outdated one.
  - renewed certs can and will sometimes provide extra alt names, so they are not always 100% equivalent.
  - renewed certs will also change the key size once in a while, and sometimes the algorithm. Technically speaking it might cause difficulties to change this on the fly, or at least some verifications have to be performed at load time.
  - I think that most of the crt-list config is per-certificate file and not per-name. That might also make certain things more complicated to configure.

That said, given that we can already look up a cert based on a name, maybe in fact we could load all of them and just try to find a more recent one if the first one reported by the SNI is outdated. I don't know if that solves everything there.

In any case, this will not provide any benefit regarding let's encrypt or such solutions, because the next cert would have to be known in advance and loaded already, so reloads will have to be performed to take it into account. So I think that the approach making it possible to feed them over the CLI would still be more interesting (and possibly complementary). It could be interesting to study what it would require to implement a "strict-date" option or something like this per certificate to enable checking of their validity during the pick-up.

Still, one point has to be kept in mind. Daniel, I'm pretty sure that most users would prefer the approach consisting in picking the most recent valid cert instead of the last one as you'd like. I don't really know if it's common to issue a cert with a "not-before" date in the future. And that might be the whole point in the end.

Hoping this helps,
Willy
Re: [PATCH] Lua medium bugfix
On Fri, May 12, 2017 at 04:41:48PM +0200, Thierry Fournier wrote: > Hi, > > A patch fixing a medium bugfix in attachment. > The backport in 1.6 and 1.7 is easy: it doesn't generate conflicts. > >In the case of a Lua sample-fetch or converter doesn't return any >value, an acces outside the Lua stack can be performed. This patch >check the stack size before converting the top value to a HAProxy >internal sample. (...) Applied, thank you Thierry. Willy
Re: Limiting bandwidth of connections
Hi Robin, On Wed, May 10, 2017 at 09:15:44PM +, Robin H. Johnson wrote: > Hi, > > I'm wondering about the status of bandwidth limiting that was originally > planned for 1.6. > > In the archives I see discussions in 2012 & 2013; Willy's responses: > 2012-04-17 planned for 1.6: > https://www.mail-archive.com/haproxy@formilux.org/msg07096.html > 2013-05-01 planned for 1.6: > https://www.mail-archive.com/haproxy@formilux.org/msg09812.html

Several of us would like to get it done, and with filters it should be easy and even fun. The most difficult part, I guess, is to define the different limiting classes we want so that we don't change the configuration every other day. What is sure is that the following have already been requested :

 - per connection limits (ie: no user can take more than X MBps)
 - per frontend limits (ie: no hosted service can take more than X Mbps)
 - per backend limits (ie: no hosted customer of a virtual server can take more than X)
 - per process limits (ie: limit total outgoing bandwidth)
 - per track-sc limits (ie: track an entry in a table and the bandwidth there is shared) so that more complex criteria can be set

The 1st and 4th ones have been the most demanded. The first one to limit hungry users and save bandwidth (eg: don't let them preload too much data that they're going to drop when clicking stop on a sound or video player). The fourth one to avoid network drops when the external bandwidth is itself capped on network equipment. You may want to take a stab at this; the filters API is well documented and that may give you some hints about some points we have to think about. We definitely need to make progress on this stuff that has been promised since 1.4.x but never considered urgent. Willy
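Whichever of the limiting classes above gets implemented first, the per-entity accounting underneath is essentially a token bucket. A rough standalone sketch follows; the struct and function names are purely illustrative, this is neither haproxy code nor its filters API:

```c
#include <stdint.h>

/* Token-bucket accounting for one limited entity (connection, frontend,
 * backend, process, or a tracked table entry). */
struct bwlim {
    uint64_t rate;      /* allowed bytes per second */
    uint64_t burst;     /* bucket capacity in bytes */
    uint64_t tokens;    /* currently available bytes */
    uint64_t last_ms;   /* last refill time, in milliseconds */
};

/* Refill the bucket for the elapsed time, then return how many of the
 * <wanted> bytes may be forwarded right now; the remainder has to wait
 * until the next refill. */
static uint64_t bwlim_account(struct bwlim *b, uint64_t now_ms, uint64_t wanted)
{
    uint64_t refill = (now_ms - b->last_ms) * b->rate / 1000;

    /* cap the refill at the bucket capacity to bound bursts */
    b->tokens = (b->tokens + refill > b->burst) ? b->burst : b->tokens + refill;
    b->last_ms = now_ms;

    if (wanted > b->tokens)
        wanted = b->tokens;
    b->tokens -= wanted;
    return wanted;
}
```

The same structure works for all five classes; only where the struct lives (per connection, per proxy, per process, or in a stick-table entry) changes.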
Re: [PATCH] Add b64dec sample converter
Hi Willy, thanks for applying the patch! Willy Tarreau wrote: > Thanks for the warning, much appreciated. It made me re-read it after > applying it. But your code is fine, no problem detected! So you're > becoming a C programmer ;-) Yeah, we will see about that :) >> Once verified, I think this converter can be safely added to the >> supported stable versions of HAProxy. > > Yes I think it can make sense to backport it at least to 1.7, it can > help sometimes. That would be much appreciated. I think a backport even down to 1.6 is pretty risk-free given that the structure there hasn't changed much lately and the patch applies cleanly even on 1.6.0. Cheers, Holger
Re: [PATCH] Add b64dec sample converter
On Fri, May 12, 2017 at 05:39:28PM +0200, Holger Just wrote: > >> Once verified, I think this converter can be safely added to the > >> supported stable versions of HAProxy. > > > > Yes I think it can make sense to backport it at least to 1.7, it can > > help sometimes. > > That would be much appreciated. I think a backport even down to 1.6 is > pretty risk-free given that the structure there hasn't changed much > lately and the patch applies cleanly even on 1.6.0. I have no fear about it being backported to 1.6, the thing is that we normally don't backport any feature anymore to stable branches due to the terrible experience in 1.4 where too much riskless stuff was backported, then fixed, then removed etc... making each subsequent version a pain for certain users. In practice we tend to be a bit flexible and to backport very small stuff that makes people's lives easier or the whole process more reliable (eg: config warnings, ability to quit after a delay on reload), but clearly I don't want to do it past the last release. The reason is simple : if some users are still on 1.6 instead of 1.7, it's precisely because they don't want to get a single change unless it's a real bug. There are some places where changelogs and patches are all read one by one (and non-reg tests run for some time) before deciding to upgrade. And this gives incentive to users of older releases to start to consider new ones :-) Cheers, Willy
Re: [PATCH] Add b64dec sample converter
Hi Willy, Willy Tarreau wrote: > The thing is that we normally don't backport any feature anymore to > stable branches due to the terrible experience in 1.4 where too much > riskless stuff was backported, then fixed, then removed etc... making > each subsequent version a pain for certain users. > > [...] > > And this gives incentive to users of older releases to start to > consider new ones :-) Those are all very good reasons for not backporting the patch. I hadn't considered that an exception is usually not important enough to justify breaking the default of having rock-solid stable versions. That is indeed a very good hard rule to have in a software maintainer's handbook. Thanks for taking the time to explain your reasoning and also for saying no. Cheers, Holger
Re: Automatic Certificate Switching Idea
Willy, thanks for your elaborate reply! See my remarks below. > possible impacts nor complexity (but I don't want to have the complete MS > Office suite merged in, just Word, Excel and PowerPoint :-)). :-D > - renewed certs can and will sometimes provide extra alt names, so >they are not always 100% equivalent. > […] > That said, given that we can already look up a cert based on a name, > maybe in fact we could load all of them and just try to find a more > recent one if the first one reported by the SNI is outdated. I don't > know if that solves everything there. It actually might. In the end it would be something like a map, with the key being the domain, and the value a list of pointers to the actual certificates, sorted by remaining validity, having shortest first. > In any case, this will not provide any benefit regarding let's encrypt > or such solutions, because the next cert would have to be known in > advance and loaded already, so reloads will have to be performed to > take it into account. So I think that the approach making it possible > to feed them over the CLI would still be more interesting (and possibly > complementary). I think it would benefit Let’s Encrypt and similar scenarios. It would still require reloads to pick up newly added certificates. But as renewed certificates overlap their predecessors’ validity period, dropping them into a directory and just doing a reload maybe once a day would work. Clients would still get the older one, until it finally expired, but that should not matter, as we are not talking about revocations where switching to a new cert is wanted quickly. > Daniel I'm pretty sure that most users > would prefer the approach consisting in picking the most recent > valid cert instead of the last one as you'd like. I don't really > know if it's common to issue a cert with a "not-before" date in the > future. And that might be the whole point in the end. Well, I was just thinking about the not-after date. 
In general, from a client perspective it shouldn’t matter to get an older one, until it really expires. And the case where you have a new certificate already, and you want it handed out to clients ASAP is already taken care of today — just replace the file and reload :-) Unless I misunderstood what you meant when referring to the “not-before” date. Daniel PS: This is an interesting discussion, and I am happy to continue it, if anyone feels the same. As I said, I will try to solve this via provisioning scripts in the meantime, so there is no time pressure. -- Daniel Schneller Principal Cloud Engineer CenterDevice GmbH | Hochstraße 11 | 42697 Solingen tel: +49 1754155711| Deutschland daniel.schnel...@centerdevice.de | www.centerdevice.de Geschäftsführung: Dr. Patrick Peschlow, Dr. Lukas Pustina, Michael Rosbach, Handelsregister-Nr.: HRB 18655, HR-Gericht: Bonn, USt-IdNr.: DE-815299431
hostname to IP converter possible?
Hi list, Is there now a converter from hostname to IPv4 available in haproxy? Regards, Igor
Re: Automatic Certificate Switching Idea
On Fri, May 12, 2017 at 06:42:20PM +0200, Daniel Schneller wrote: > > That said, given that we can already look up a cert based on a name, > > maybe in fact we could load all of them and just try to find a more > > recent one if the first one reported by the SNI is outdated. I don't > > know if that solves everything there. > > > It actually might. In the end it would be something like a map, with the > key being the domain, and the value a list of pointers to the actual > certificates, sorted by remaining validity, having shortest first. That's already what is done in the SNI trees, except that the validity date is not considered, the first one matching is retrieved. > I think it would benefit Let's Encrypt and similar scenarios. It would > still require reloads to pick up newly added certificates. But as renewed > certificates overlap their predecessors' validity period, dropping them > into a directory and just doing a reload maybe once a day would work. > Clients would still get the older one, until it finally expired, but that > should not matter, as we are not talking about revocations where > switching to a new cert is wanted quickly. Using the old one "until it expires" is what really causes me a problem (and I understand that in your case that's what you need). There are several reasons for preferring the latest one instead :

 - it might provide stronger algorithms

 - it might use a CA which is not being blacklisted (remember that people started to complain about haproxy.org causing them some warnings because the CA was considered unsafe)

 - it was issued in the past (minutes, hours, days) so is likely already valid regardless of any small time shift. Using the old one even one minute past its validity date will be a big problem.
- the change will be effective at the moment of reload, meaning that any surprise such as an incomplete chain, incorrect OCSP, or a key size incompatible with certain browsers, will be identified at an expected moment and when it's not too late to fix it. By using the oldest one as long as possible, it would break at any time in the middle of the night and would do it once you cannot roll back. And that's the point. Users praise haproxy's reliability but in fact it's not (just) the code's reliability (git log --grep BUG shames us), but the fact that it has always been designed to be used by humans, who make mistakes and who want to spot them very quickly and to fix them before they become big trouble. Config warnings/errors, checks for suspicious constructs and logs are directly involved here. And we do know that our users occasionally fail and we must help them recover, and even possibly cover their mistakes before the boss or the customer has any chance to notice. So creating something designed to fail by default behind their back without prior notice and without the ability to quickly stop before anyone notices is contrary to the philosophy here. That doesn't mean that what you need must not be implemented, it means that under no circumstance it should be the default nor happen to be enabled by default. Thus I think that at a minimum, if we ever go in that direction, the default behaviour must be the expected one (ie: use the most recent valid cert), and maybe there could be an option to prefer the old one instead and to apply a date margin (eg: avoid using this one if there's less than a day left). (...) > PS: This is an interesting discussion, and I am happy to continue > it, if anyone feels the same. I would not be surprised if we get some followups in either direction. Over the mid term, more and more people will be affected by related situations and the whole aspect of cert renewal will eventually become hot. 
But I strongly doubt we'll do anything for this in 1.8, though collecting views, ideas and constraints can be useful to try to serve everyone the best later. > As I said, I will try to solve this via > provisioning scripts in the meantime, so there is no time pressure. That's perfect! Your feedback and possible trouble in doing this will also definitely help! thanks, Willy
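To make the proposed default concrete, the rule "use the most recent currently-valid cert, and fall back to the one expiring last rather than serving nothing" could be sketched as follows. The struct and function names are illustrative only, not haproxy code:

```c
#include <stddef.h>
#include <time.h>

/* Minimal view of a loaded certificate: its validity window. */
struct cert_info {
    time_t not_before;
    time_t not_after;
};

/* Among the certs matching one SNI name, pick the most recently issued
 * one that is valid at <now>; if none is valid, fall back to the one
 * expiring last so that we still serve something. */
static const struct cert_info *pick_cert(const struct cert_info *certs,
                                         size_t n, time_t now)
{
    const struct cert_info *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (now < certs[i].not_before || now > certs[i].not_after)
            continue;                          /* not yet / no longer valid */
        if (!best || certs[i].not_before > best->not_before)
            best = &certs[i];                  /* prefer most recently issued */
    }
    if (!best)                                 /* nothing valid: degrade gracefully */
        for (i = 0; i < n; i++)
            if (!best || certs[i].not_after > best->not_after)
                best = &certs[i];
    return best;
}
```

The option discussed above (prefer the old cert, with a date margin) would only change the comparison in the first loop, not the overall shape.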
Re: haproxy not creating stick-table entries fast enough
On Fri, May 12, 2017 at 12:51 AM, Willy Tarreau wrote: > On Tue, May 09, 2017 at 09:43:22PM -0700, redundantl y wrote: > > For example, I have tried with the latest versions of Firefox, Safari, > and > > Chrome. With 30 elements on the page being loaded from the server > they're > > all being loaded within 70ms of each other, the first 5 or so happening > on > > the same millisecond. I'm seeing similar behaviour, being sent to > > alternating backend servers until it "settles" and sticks to just one. > > That's only true after the browser starts to retrieve the main page which > gives it the indication that it needs to request such objects. You *always* > have a first request before all other ones. The browser cannot guess it > will have to retrieve many objects out of nowhere. > > As I've said before, the issue here is these objects aren't hosted on the same server that they're being called from. "A separately hosted application will generate HTML with several (20-30) elements that will be loaded simultaneously by the end user's browser." So a user might go to www.example.com and that page will load the objects from assets.example.com, which is a wholly separate server. > The principle of stickiness is to ensure that subsequent requests will go > to the same server that served the previous ones. The main goal is to > ensure that all requests carrying a session cookie will end up on the > server which holds this session. > > Here as Lukas explained, you're simulating a browser sending many totally > independant requests in parallel. There's no reason (nor any way) that > any equipment in the chain would guess they are related since they could > arrive in any order, and even end up on multiple nodes. > > Well, all of these requests will have the url_param email=, so the load balancer has the ability to know they are related. 
The issue here, at least how it appears to me, is since they come in so fast the stick-table entry doesn't get generated quickly enough and the requests get distributed to multiple backend servers and eventually stick to just one. > If despite this that's what you need (for a very obscure reason), then > you'd rather use hashing for this. It will ensure that the same > distribution > algorithm is applied to all these requests regardless of their ordering. > But > let me tell you that it still makes me feel like you're trying to address > the wrong problem. > > Since changing to load balancing on the url_param our issue has been resolved. > Also, most people prefer not to apply stickiness for static objects so that > they can be retrieved in parallel from all static servers instead of all > hammering the same server. It might possibly not be your case based on your > explanation, but this is what people usually do for a better user > experience. > > The objects aren't static. When they're loaded the application makes some calls to external services (3rd party application, database server) to produce the desired objects and links. > In conclusion, your expected use case still seem quite obscure to me :-/ > > Willy > I agree, our use case is fairly unique.
Re: haproxy not creating stick-table entries fast enough
On Fri, May 12, 2017 at 10:20:02AM -0700, redundantl y wrote: > As I've said before, the issue here is these objects aren't hosted on the > same server that they're being called from. > > "A separately hosted application will generate HTML with several (20-30) > elements that will be loaded simultaneously by the end user's browser." > > So a user might go to www.example.com and that page will load the objects > from assets.example.com, which is a wholly separate server. OK but *normally* if there's parallelism when downloading objects from assets.example.com, then there's no dependency between them. > > The principle of stickiness is to ensure that subsequent requests will go > > to the same server that served the previous ones. The main goal is to > > ensure that all requests carrying a session cookie will end up on the > > server which holds this session. > > > > Here as Lukas explained, you're simulating a browser sending many totally > > independent requests in parallel. There's no reason (nor any way) that > > any equipment in the chain would guess they are related since they could > > arrive in any order, and even end up on multiple nodes. > > > > > Well, all of these requests will have the url_param email=, so the load > balancer has the ability to know they are related. The issue here, at > least how it appears to me, is since they come in so fast the stick-table > entry doesn't get generated quickly enough and the requests get distributed > to multiple backend servers and eventually stick to just one. It's not fast WRT the stick table but WRT the time to connect to the server. As I mentioned, the principle of stickiness is to send subsequent requests to the same server which *served* the previous ones. So if the first request is sent to server 1, the connection fails several times, then it's redispatched to server 2 and succeeds, it will be server 2 which will be put into the table so that next connections will go there as well. 
In your workload, there isn't even the time to validate the connection to the server, and *this* is what causes the problem you're seeing. > Since changing to load balancing on the url_param our issue has been > resolved. So indeed you're facing the type of workloads requiring a hash. > > Also, most people prefer not to apply stickiness for static objects so that > > they can be retrieved in parallel from all static servers instead of all > > hammering the same server. It might possibly not be your case based on your > > explanation, but this is what people usually do for a better user > > experience. > > > > > The objects aren't static. When they're loaded the application makes some > calls to external services (3rd party application, database server) to > produce the desired objects and links. OK I see. Then better stick to the hash using url_param. You can improve this by combining it with stick anyway if your url_params are frequently reused (eg: many requests per client). This will avoid redistributing innocent connections in the event a server is added or removed due to the hash being recomputed. That can be especially true if your 3rd party application sometimes has long response times and the probability of a server outage between the first and the last request for a client becomes high. > > In conclusion, your expected use case still seem quite obscure to me :-/ > > Willy > I agree, our use case is fairly unique.
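For the archives, the hash+stick combination suggested above could look roughly like the following config sketch (the backend name, server addresses and table sizing are made up, and it is untested):

```
# Hash on the email parameter for distribution, but also record the
# chosen server in a stick table so that clients already mapped to a
# server survive a hash redistribution when a server is added/removed.
backend app
    balance url_param email
    stick-table type string len 64 size 100k expire 30m
    stick on url_param(email)
    server s1 10.0.0.1:80 check
    server s2 10.0.0.2:80 check
```

The stick rule takes precedence when an entry exists; the url_param hash only decides the server for parameters not yet in the table.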
Re: haproxy not creating stick-table entries fast enough
On Fri, May 12, 2017 at 10:46 AM, Willy Tarreau wrote: > On Fri, May 12, 2017 at 10:20:02AM -0700, redundantl y wrote: > > As I've said before, the issue here is these objects aren't hosted on the > > same server that they're being called from. > > > > "A separately hosted application will generate HTML with several (20-30) > > elements that will be loaded simultaneously by the end user's browser." > > > > So a user might go to www.example.com and that page will load the > objects > > from assets.example.com, which is a wholly separate server. > > OK but *normally* if there's parallelism when downloading objects from > assets.example.com, then there's no dependency between them. > > > > The principle of stickiness is to ensure that subsequent requests will > go > > > to the same server that served the previous ones. The main goal is to > > > ensure that all requests carrying a session cookie will end up on the > > > server which holds this session. > > > > > > Here as Lukas explained, you're simulating a browser sending many > totally > > > independent requests in parallel. There's no reason (nor any way) that > > > any equipment in the chain would guess they are related since they > could > > > arrive in any order, and even end up on multiple nodes. > > > > > > > > Well, all of these requests will have the url_param email=, so the load > > balancer has the ability to know they are related. The issue here, at > > least how it appears to me, is since they come in so fast the stick-table > > entry doesn't get generated quickly enough and the requests get > distributed > > to multiple backend servers and eventually stick to just one. > > It's not fast WRT the stick table but WRT the time to connect to the > server. > As I mentioned, the principle of stickiness is to send subsequent requests > to the same server which *served* the previous ones. 
So if the first > request > is sent to server 1, the connection fails several times, then it's > redispatched > to server 2 and succeeds, it will be server 2 which will be put into the > table > so that next connections will go there as well. > > In your workload, there isn't even the time to validate the connection to > the > server, and *this* is what causes the problem you're seeing. > > Thank you for explaining what I'm seeing. This makes a lot of sense. > > Since changing to load balancing on the url_param our issue has been > > resolved. > > So indeed you're facing the type of workloads requiring a hash. > > > > Also, most people prefer not to apply stickiness for static objects so > that > > > they can be retrieved in parallel from all static servers instead of > all > > > hammering the same server. It might possibly not be your case based on > your > > > explanation, but this is what people usually do for a better user > > > experience. > > > > > > > > The objects aren't static. When they're loaded the application makes > some > > calls to external services (3rd party application, database server) to > > produce the desired objects and links. > > OK I see. Then better stick to the hash using url_param. You can improve > this by combining it with stick anyway if your url_params are frequently > reused (eg: many requests per client). This will avoid redistributing > innocent connections in the event a server is added or removed due to > the hash being recomputed. That can be especially true if your 3rd party > application sometimes has long response times and the probability of a > server outage between the first and the last request for a client becomes > high. > > Thank you for pointing this out, we hadn't considered this scenario. > > > In conclusion, your expected use case still seem quite obscure to me > :-/ > > > > > > Willy > > > > > > > I agree, our use case is fairly unique. 
> > It looks so :-) > > Willy > Thanks for taking the time to read and respond. It was very informative and helpful.
Re: haproxy
> On May 11, 2017, at May 11, 7:51 AM, Jose Alarcon > wrote: > > Hello, > > excuse me, my English is very bad. I need to know how to change the haproxy configuration from passive to active manually, without using keepalived. > There is no standard way because that is not a feature of haproxy. High availability of the proxy is managed by an external tool like keepalived. -Bryan > I need this information for high school homework. > > Thanks. > > My native language is Spanish.
Secondary load balancing method (fallback)
Is it possible to configure a secondary load balancing method, something to fall back on if the first method's criterion isn't met? For example, if I balance on the url_param email:

    balance url_param email

Can it instead balance on another url_param:

    balance url_param id

Or have it balance based on source address? I tried setting the following:

    balance url_param email
    balance url_param id

But it only balanced on the second one, id. I haven't found anything saying this is possible, but I'd just like to make sure it isn't. Thanks.
Re: OpenSSL engine and async support
> On May 10, 2017, at 04:51, Emeric Brun wrote: > >> It looks like the main process stalls at DH_free(local_dh_1024) (part of >> __ssl_sock_deinit). Not sure why but I will debug and report back. >> >> Thanks, > > I experienced the same issue (stalled on a futex) if i run haproxy in > foreground and trying to kill it with kill -USR1. > > With this conf (dh param and ssl-async are disabled) > global > # tune.ssl.default-dh-param 2048 >ssl-engine qat > # ssl-async >nbproc 1 It looks like that the stall on futex issue is related to DH_free() calling ENGINE_finish in openssl 1.1: https://github.com/openssl/openssl/blob/master/crypto/dh/dh_lib.c#L109 (gdb) bt #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132 #1 0x7fa1582c5571 in pthread_rwlock_wrlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_wrlock.S:118 #2 0x7fa158a58559 in CRYPTO_THREAD_write_lock () from /tmp/openssl_1.1.0_install/lib/libcrypto.so.1.1 #3 0x7fa1589d8800 in ENGINE_finish () from /tmp/openssl_1.1.0_install/lib/libcrypto.so.1.1 #4 0x7fa158975e76 in DH_free () from /tmp/openssl_1.1.0_install/lib/libcrypto.so.1.1 #5 0x00417c78 in free_dh () at src/ssl_sock.c:7905 #6 0x7fa1591d58ce in _dl_fini () at dl-fini.c:254 #7 0x7fa158512511 in __run_exit_handlers (status=0, listp=0x7fa15888e688, run_list_atexit=true) at exit.c:78 #8 0x7fa158512595 in __GI_exit (status=) at exit.c:100 #9 0x00408814 in main (argc=4, argv=0x7ffe72188548) at src/haproxy.c:2235 Openssl 1.1 has changed the way ENGINE_cleanup works: https://www.openssl.org/docs/man1.1.0/crypto/ENGINE_cleanup.html "From OpenSSL 1.1.0 it is no longer necessary to explicitly call ENGINE_cleanup and this function is deprecated. Cleanup automatically takes place at program exit." I suppose by the time the destructor __ssl_sock_deinit is called, engine-related cleanup are already done by openssl and ENGINE_finish (from DH_free) stalls on a non-existing write lock. 
I have a workaround which moves the DH_free logic out of the destructor __ssl_sock_deinit and calls it right before process exit. With the workaround I no longer see the stall issue. I am not sure whether it is the optimal solution though. Let me know. Thanks, Grant
Re: Secondary load balancing method (fallback)
On Fri, May 12, 2017 at 02:00:24PM -0700, redundantl y wrote: > Is it possible to configure a secondary load balancing method, something to > fall back on if the first method isn't met? > > For example, if I balance on the url_param email: > > balance url_param email > > Can it instead balance on another url_param: > > balance url_param id > > Or have it balance based on source address? > > I tried setting the following: > > balance url_param email > balance url_param id > > But it only balanced on the second one, id. > > I haven't found anything saying this is possible, but I'd just like to make > sure it isn't. You can't do this, and there is already a fallback on algorithms involving hashes, but the fallback is to round robin, as indicated in the doc. What you can do however is to check in your frontend if you have this parameter and use a specific backend when it is present, or another one when it is not present. Eg:

    frontend blah
        use_backend lb-email if { url_param(email) -m found }
        use_backend lb-id if { url_param(id) -m found }
        default_backend lb-src

    backend lb-email
        balance url_param email
        server s1 1.1.1.1 track lb-src/s1
        server s2 1.1.1.2 track lb-src/s2

    backend lb-id
        balance url_param id
        server s1 1.1.1.1 track lb-src/s1
        server s2 1.1.1.2 track lb-src/s2

    backend lb-src
        balance src
        server s1 1.1.1.1 check
        server s2 1.1.1.2 check

Willy
Re: hostname to IP converter possible?
Hi Igor, On Sat, May 13, 2017 at 12:58:19AM +0800, Igor Pav wrote: > Hi list, > > Is now there's a converter for hostname to IPv4 available in haproxy? Funny that you asked the same question one year ago, but you didn't get a response, you are patient :-) Server addresses can be dynamically resolved now, but that's all. Baptiste is improving the DNS infrastructure so that it depends less on the servers, and maybe in the future it might become possible to use it for other features (eg: a converter) but for now it is not. Regards, Willy
Re: Failed to compile haproxy with lua on Solaris 10
Le 12/05/2017 à 15:54, Willy Tarreau a écrit : > Hi Benoît, > > On Thu, May 04, 2017 at 08:50:33AM +0200, Benoît GARNIER wrote: > (...) >> If you do the following operation : time_t => localtime() => struct tm >> => timegm() => time_t, your result will be shifted by the timezone time >> offset (but without any DST applied). >> >> Technically, if you live in Great Britain, the operation will succeed >> during winter (but will offset the result by 1 hour during summer, since >> DST is applied here). > So in short you're saying that we should always use mktime() instead ? > > Willy No, not at all !!! To sum up, these are the basic functions to work with time :

 - time() returns a time_t, which is timezone agnostic since it's just a precise moment in time (it represents the same moment for everybody)

 - localtime() takes this time_t and builds a representation of this time in the current time zone (struct tm)

 - mktime() takes a struct tm representing a specific time in the current timezone and returns a time_t

gmtime() and timegm() are the same as localtime() and mktime() but ignore the timezone and DST: they only work with UTC time. So you can use timegm() on a struct tm only if you know that struct tm represents a GMT time (for example if it was built with gmtime()). Similarly, using mktime() is only valid if this struct tm represents the time in the current time zone (i.e. if it was built with localtime() with the same timezone). For example if you parse a log file with GMT time in it you'll use timegm() to build a time_t representing the precise time of the log. If you parse a file with local time in it, you'll use mktime(), but you'll also have to know what was the timezone used to build it.

1) Time zone agnostic: time()
2) Current time zone: localtime() and mktime()
3) UTC time: gmtime() and timegm()

As a rule of thumb, you cannot mix functions in categories 2 and 3. Benoît
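Benoît's three categories can be illustrated with a small C sketch. Note that timegm() is a nonstandard extension (glibc/BSD), which is exactly the portability issue discussed later in this thread:

```c
#define _DEFAULT_SOURCE     /* for timegm() on glibc */
#include <time.h>

/* Category 2 round-trip: local struct tm back through mktime(). */
static int roundtrip_local(time_t t)
{
    struct tm loc = *localtime(&t);   /* category 2 */
    return mktime(&loc) == t;         /* category 2: valid, returns 1 */
}

/* Category 3 round-trip: UTC struct tm back through timegm(). */
static int roundtrip_utc(time_t t)
{
    struct tm utc = *gmtime(&t);      /* category 3 */
    return timegm(&utc) == t;         /* category 3: valid, returns 1 */
}

/* Mixing categories 3 and 2: mktime() applied to a UTC struct tm is
 * shifted by the local UTC offset (and possibly DST), which is the bug
 * described for the localtime() => timegm() chain above, in reverse. */
static long mix_error(time_t t)
{
    struct tm utc = *gmtime(&t);      /* category 3 */
    return (long)(mktime(&utc) - t);  /* nonzero unless TZ is UTC */
}
```

On a machine with TZ set to a non-UTC zone, mix_error() returns the offset in seconds; both round-trip helpers return 1 everywhere.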
Re: hostname to IP converter possible?
Thanks, Willy. I found the DNS infrastructure improved a lot this year, so I asked again; I hope it was not too stupid :-) On Sat, May 13, 2017 at 7:19 AM, Willy Tarreau wrote: > Hi Igor, > > On Sat, May 13, 2017 at 12:58:19AM +0800, Igor Pav wrote: >> Hi list, >> >> Is now there's a converter for hostname to IPv4 available in haproxy? > > Funny that you asked the same question one year ago, but you didn't get > a response, you are patient :-) > > Server addresses can be dynamically resolved now, but that's all. Baptiste > is improving the DNS infrastructure so that it depends less on the servers, > and maybe in the future it might become possible to use it for other features > (eg: a converter) but for now it is not. > > Regards, > Willy
Re: Failed to compile haproxy with lua on Solaris 10
Hi Benoît, On Sat, May 13, 2017 at 07:32:10AM +0200, Benoît GARNIER wrote: > Le 12/05/2017 à 15:54, Willy Tarreau a écrit : > > Hi Benoît, > > > > On Thu, May 04, 2017 at 08:50:33AM +0200, Benoît GARNIER wrote: > > (...) > >> If you do the following operation : time_t => localtime() => struct tm > >> => timegm() => time_t, your result will be shift by the timezone time > >> offset (but without any DST applied). > >> > >> Technically, if you live in Great Britain, the operation will succeed > >> during winter (but will offset the result by 1 hour during summer, since > >> DST is applied here). > > So in short you're saying that we should always use mktime() instead ? > > > > Willy > > No, not at all !!! To sum up, these are the basic functions to work with > time : > > - time() return a time_t which is timezone agnostic since it's just a > precise moment in time (it represents the same moment for everybody) > > - localtime() takes this time_t and build a representation of this time > in the current time zone (struct tm) > > - mktime() take a struct tm representing a specific time in the current > timezone et return a time_t > > gmtime() and timegm() are the same as localtime() and mktime() but will > ignore the timezone and DST: they only work with UTC time. > > So you can use timegm() on a struct tm only if you know that struct tm > represents a GMT time (for example if it was build with gmtime()). > > Similarly, using mktime() is only valid if this struct tm represents the > time in the current time zone (i.e. if it was build with localtime() > with the same timezone). > > For example if you parse a log file with GMT time in it you'll use > timegm() to build a time_t representing the precise time of the log. > > If you parse a file with local time in it, you'll use mktime() but > you'll also have to know what was the timezone used to build it. 
> > 1) Time zone agnostic: time() > 2) Current time zone: localtime() and mktime() > 3) UTC time: gmtime() and timegm() > > As a rule of thumb, you cannot mix functions from categories 2 and 3. OK, thank you, that's perfectly clear now and it makes sense. Thierry told me that he purposely used timegm() because he wanted UTC. Man pages recommend not using it because it's obsolete and suggest setenv(TZ)+tzset()+mktime() instead!!! It's amazing to read something that stupid in man pages written 10 years after everyone started to write threaded applications! I found that in practice many people use a hand-written timegm() function which does all the computation by hand, just like those of us who knew MS-DOS used to do 30 years ago, so I think we'll have to go down that route for a more portable implementation :-/ At least this will also save us from accidentally using implementations where timegm() is a wrapper around setenv+tzset+mktime()... Thanks for your explanations! Willy
Re: Failed to compile haproxy with lua on Solaris 10
On 13/05/2017 at 08:09, Willy Tarreau wrote: > Hi Benoît, > > On Sat, May 13, 2017 at 07:32:10AM +0200, Benoît GARNIER wrote: >> On 12/05/2017 at 15:54, Willy Tarreau wrote: >>> Hi Benoît, >>> >>> On Thu, May 04, 2017 at 08:50:33AM +0200, Benoît GARNIER wrote: >>> (...) If you do the following operation: time_t => localtime() => struct tm => timegm() => time_t, your result will be shifted by the timezone offset (but without any DST applied). Technically, if you live in Great Britain, the operation will succeed during winter (but will offset the result by 1 hour during summer, since DST is applied there). >>> So in short you're saying that we should always use mktime() instead? >>> >>> Willy >> No, not at all!!! To sum up, these are the basic functions to work with >> time: >> >> - time() returns a time_t which is timezone agnostic, since it's just a >> precise moment in time (it represents the same moment for everybody) >> >> - localtime() takes this time_t and builds a representation of this time >> in the current time zone (struct tm) >> >> - mktime() takes a struct tm representing a specific time in the current >> timezone and returns a time_t >> >> gmtime() and timegm() are the same as localtime() and mktime() but >> ignore the timezone and DST: they only work with UTC time. >> >> So you can use timegm() on a struct tm only if you know that struct tm >> represents a GMT time (for example if it was built with gmtime()). >> >> Similarly, using mktime() is only valid if this struct tm represents the >> time in the current time zone (i.e. if it was built with localtime() >> with the same timezone). >> >> For example, if you parse a log file with GMT times in it, you'll use >> timegm() to build a time_t representing the precise time of the log. >> >> If you parse a file with local times in it, you'll use mktime(), but >> you'll also have to know which timezone was used to build it. 
>> >> 1) Time zone agnostic: time() >> 2) Current time zone: localtime() and mktime() >> 3) UTC time: gmtime() and timegm() >> >> As a rule of thumb, you cannot mix functions from categories 2 and 3. > OK, thank you, that's perfectly clear now and it makes sense. Thierry > told me that he purposely used timegm() because he wanted UTC. Man > pages recommend not using it because it's obsolete and suggest > setenv(TZ)+tzset()+mktime() instead!!! It's amazing to read something > that stupid in man pages written 10 years after everyone started to > write threaded applications! I found that in practice many people > use a hand-written timegm() function which does all the computation > by hand, just like those of us who knew MS-DOS used to do 30 > years ago, so I think we'll have to go down that route for a more > portable implementation :-/ At least this will also save us from > accidentally using implementations where timegm() is a wrapper around > setenv+tzset+mktime()... Time handling is not easy. I hate to say it, but POSIX and glibc manage to make it even harder, especially the global timezone handling, which is not thread-safe, as you pointed out. Anyway, free-coding a simple timegm() is not very hard, since you don't have to take any timezone into account, only leap years. But beware that the real timegm() (and mktime()) perform some normalization on tm_mday (day of the month) and tm_mon (month). For example, they will happily accept fake dates like "March 32nd 2017", "February 60th 2017" or even "the 1st day of the 16th month of 2016", and will convert them all to "April 1st 2017" internally. It's very handy when you want to compute a date in the future or in the past: you just add/subtract values to the corresponding field (day or month) and let mktime() or timegm() do their magic trick. Benoît