Hi Veiko,

[ CCing Baptiste at the same time ]

On Wed, Jun 14, 2017 at 02:59:16PM +0300, Veiko Kukk wrote:
> Possible regression in 1.6.12
> 
> I might have discovered a haproxy bug. It occurs when all of the following
> configuration conditions are satisfied:
> * haproxy version 1.6.12
> * multiple processes
> * resolvers section with more than one server configured (not even used
> anywhere)
> * haproxy is either reloaded or restarted
> * request is made against freshly reloaded/restarted haproxy or haproxy
> backend server health check is made. Both cases requests do not get
> response.
> 
> When accessing haproxy, requests time out. Backends will fail checks and are
> marked as down with timeout error. Happens with browsers, curl, wget. When
> downgrading to 1.6.11, timeouts don't happen.
> 
> How I tested:
> 1) reload haproxy with the minimal config below
> 2) then run: for i in {1..100}; do date --utc; echo $i; curl
> https://tsthost.tld/haproxy?stats -o /dev/null -s -m 50; done
> Wed 14 Jun 11:45:44 UTC 2017
> 1
> Wed 14 Jun 11:46:34 UTC 2017
> 2
> Wed 14 Jun 11:47:24 UTC 2017
> 3
> Wed 14 Jun 11:48:14 UTC 2017
> 4
> Wed 14 Jun 11:48:14 UTC 2017
> 5
> Wed 14 Jun 11:49:04 UTC 2017
> 6
> Wed 14 Jun 11:49:05 UTC 2017
> 7
> Wed 14 Jun 11:49:55 UTC 2017
> 8
> Wed 14 Jun 11:49:55 UTC 2017
> 9
> Wed 14 Jun 11:50:45 UTC 2017
> 10
> Wed 14 Jun 11:50:46 UTC 2017
> 11
> Wed 14 Jun 11:50:46 UTC 2017
> 12
> Wed 14 Jun 11:50:46 UTC 2017
> 
> When removing either multiprocess configuration or resolvers section, no
> requests time out.
> 
> Following is trimmed down minimal config:
> global
>   daemon
>   nbproc 3
>   maxconn 500
>   user haproxy
>   tune.ssl.default-dh-param 2048
>   ssl-default-bind-options no-sslv3 no-tls-tickets
>   ssl-default-bind-ciphers 
> AES128+EECDH:AES128+EDH:!ADH:!AECDH:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!3DES:!MD5:!PSK
>   ssl-default-server-options no-sslv3 no-tls-tickets
>   ssl-default-server-ciphers 
> AES128+EECDH:AES128+EDH:!ADH:!AECDH:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!3DES:!MD5:!PSK
>   stats socket /var/run/haproxy1.sock mode 600 process 1
>   stats socket /var/run/haproxy2.sock mode 600 process 2
>   stats socket /var/run/haproxy3.sock mode 600 process 3
> 
> defaults
>   bind-process 3
>   log /dev/log local0
>   option log-health-checks
>   option contstats
>   timeout connect 10s
>   timeout client 60s
>   timeout server 60s
> 
> resolvers dns_resolvers
>   # local caching named
>   nameserver dns0 127.0.0.1:53
>   # remote servers
>   nameserver dns1 8.8.8.8:53
>   nameserver dns2 8.8.4.4:53
> 
> listen ssl-frontend
>   bind-process 1-2
>   bind *:443 ssl crt /path/to/certificate.pem
>   server http-frontend 127.0.0.1:666 send-proxy check
> 
> frontend http-frontend
>   mode http
>   stats enable
>   option forwardfor
>   option httplog
>   bind *:80
>   bind 127.0.0.1:666 accept-proxy
> 
> backend ssl_backend
>   mode http
>   option httplog
>   server ssl_server google.com:443 check ssl verify none fall 2 inter 5s
> fastinter 3s rise 3
> 
> 
> HA-Proxy version 1.6.12 2017/04/04
> Copyright 2000-2017 Willy Tarreau <[email protected]>
> 
> Build options :
>   TARGET  = linux2628
>   CPU     = generic
>   CC      = gcc
>   CFLAGS  = -m64 -march=x86-64 -O2 -g -fno-strict-aliasing
> -Wdeclaration-after-statement
>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
> 
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
> 
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.3
> Running on zlib version : 1.2.7
> Compression algorithms supported : identity("identity"), deflate("deflate"),
> raw-deflate("deflate"), gzip("gzip")
> Built with OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> Running on OpenSSL version : OpenSSL 1.0.1e-fips 11 Feb 2013
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 7.8 2008-09-05
> Running on PCRE version : 7.8 2008-09-05
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with Lua version : Lua 5.3.3
> Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT
> IP_FREEBIND
> 
> Available polling systems :
>       epoll : pref=300,  test result OK
>        poll : pref=200,  test result OK
>      select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.

Could you try reverting the attached patch, which was backported to 1.6
to fix an issue where nbproc and resolvers were incompatible? To do
that, please run "patch -Rp1 < foo.patch" from the top of the source
tree and rebuild.

There was a real problem with this situation where a DNS response could
randomly be delivered to any process, leading to tons of DNS timeouts.
But in your config, your backend is bound to a single process and was
not subject to this issue. Thus I suspect that the fix triggered an
unexpected side effect which is yet to be determined.
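
To illustrate the original problem, here is a minimal standalone sketch
(not haproxy code; open_udp_socket(), local_port() and
child_port_after_fork() are hypothetical helper names made up for this
example). It assumes a POSIX system. A UDP socket opened before fork()
is the very same socket in the child, so both processes read from one
receive queue. If the child closes it and opens its own, it gets a
distinct local port, which is the idea behind the workaround:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open a UDP socket bound to an ephemeral port on 127.0.0.1; -1 on error. */
static int open_udp_socket(void)
{
	struct sockaddr_in addr = { 0 };
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Return the local port of a bound socket in host order, or -1 on error. */
static int local_port(int fd)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);

	if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
		return -1;
	return ntohs(addr.sin_port);
}

/*
 * Fork a child and return the local port the child ends up using, or -1.
 * If reopen is non-zero, the child closes the inherited socket and opens
 * its own one instead of reusing the parent's.
 */
static int child_port_after_fork(int inherited_fd, int reopen)
{
	int pipefd[2];
	int port = -1;
	pid_t pid;

	if (pipe(pipefd) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: report the port of the socket it will use */
		int fd = inherited_fd;

		close(pipefd[0]);
		if (reopen) {
			close(inherited_fd);
			fd = open_udp_socket();
		}
		port = local_port(fd);
		if (write(pipefd[1], &port, sizeof(port)) != sizeof(port))
			_exit(1);
		_exit(0);
	}
	/* parent: collect the child's answer and reap it */
	close(pipefd[1]);
	if (read(pipefd[0], &port, sizeof(port)) != sizeof(port))
		port = -1;
	close(pipefd[0]);
	waitpid(pid, NULL, 0);
	return port;
}
```

With reopen set to 0 the child reports the parent's port (shared
socket); with reopen set to 1 it reports a different one, since the
parent still holds the original port.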

Also, have you noticed whether your haproxy continues to work, or whether
it loops at 100% CPU, for example? We've had such a report lately, and
the two could be related, given that in your case requests get no
response.

Thanks,
Willy

From 8fb1a4649d722766219726093d5789f995921cfe Mon Sep 17 00:00:00 2001
From: Baptiste Assmann <[email protected]>
Date: Thu, 2 Feb 2017 23:14:51 +0100
Subject: BUG/MAJOR: dns: restart sockets after fork()

UDP sockets used to send DNS queries are created before fork happens and
this is a big problem because all the processes (in case of a
configuration starting multiple processes) share the same socket. Some
processes may consume responses intended for another one, some servers
may be disabled, some IPs changed, etc.

As a workaround, this patch closes the existing socket and creates a new
one after the fork() has happened.

[wt: backport this to 1.7]
(cherry picked from commit 26c6eb838311c31db0002c7d3c93a81297012d44)
[wt: needed in 1.6 as well]
(cherry picked from commit eaf96d7a0849b2883e98459f52489d555b6b013c)
---
 src/haproxy.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/haproxy.c b/src/haproxy.c
index 8caffb6..2778819 100644
--- a/src/haproxy.c
+++ b/src/haproxy.c
@@ -1955,6 +1955,10 @@ int main(int argc, char **argv)
                fork_poller();
        }
 
+       /* initialize structures for name resolution */
+       if (!dns_init_resolvers(1))
+               exit(1);
+
        protocol_enable_all();
        /*
         * That's it : the central polling loop. Run until we stop.
-- 
1.7.12.1
