William Lallemand <wlallem...@haproxy.com> wrote on 02/05/2019 20:56:32:
> From: William Lallemand <wlallem...@haproxy.com> > To: Robert Allen1 <robert.all...@ibm.com> > Cc: haproxy@formilux.org > Date: 02/05/2019 20:56 > Subject: Re: leak of handle to /dev/urandom since 1.8? > > On Thu, May 02, 2019 at 03:34:22PM +0100, Robert Allen1 wrote: > > Hi, > > > > I spent some time digging into the FD leak and I've pinpointed it: it's an > > interaction between HAProxy re-exec and a change to keep the random > > devices open by default in OpenSSL 1.1.1 -- specifically > > https://urldefense.proofpoint.com/v2/url? > u=https-3A__github.com_openssl_openssl_commit_c7504aeb640a88949dfe3146f7e0f275f517464c&d=DwIBAg&c=jf_iaSHvJObTbx- > siA1ZOg&r=mL0dJce__JV7umP9p1PfZytbXkr-VrmKaDyuaum4y- > M&m=VlTkwp0lruBUlXxGhxs-3CGI2Q019JikpqAlU4Lv8SU&s=2so7ELjkXftWUz9tSJgJer- > eibQJsr5jR0cgq7NV5Ns&e= > > > > Since a lot of different RAND_* family functions cause initialisation, > > that explains why hacking out ssl_initialize_random didn't work for us; > > some other part of reloading just ended up opening the random devices and > > then leaking the handles on the next exec(). > > > > Anyway, from experimentation, we can close the leak with a patch like (but > > obviously better abstracted than) this one against 1.9.6: > > > > diff --git a/src/haproxy.c b/src/haproxy.c > > index 1cb10391..d3482f46 100644 > > --- a/src/haproxy.c > > +++ b/src/haproxy.c > > @@ -737,6 +737,10 @@ void mworker_reload() > > ptdf->fct(); > > if (fdtab) > > deinit_pollers(); > > + /* Close OpenSSL random devices, if open */ > > + /* Note: If not already initialised, this may cause OpenSSL > > rand_lib initialisation... */ > > + void RAND_keep_random_devices_open(int keep); > > + RAND_keep_random_devices_open(0); > > /* restore the initial FD limits */ > > limit.rlim_cur = rlim_fd_cur_at_boot; > > > > From reading in the OpenSSL code, I assume that the use of O_RDONLY > > without O_CLOEXEC or a fcntl is intentional so that it doesn't have to > > re-open the devices on fork(). > > > > So it looks like the best place to solve this is around there ^^, although > > it needs to be wrapped up in ssl_sock.c with appropriate version guards on > > the OpenSSL version. > > > > Any thoughts on the above and how to proceed? > > > > Rob > > > > Hi Robert, > > I think that the right thing to do, we should probably just call > RAND_keep_random_devices_open(0); in the mworker_reload function. > > You just need to check the SSL macros for the build and the minimum openssl > version (1.1.1), so probably something like this: > > #if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L) > RAND_keep_random_devices_open(0); > #endif > > I don't think we need anything in ssl_sock.c though. > > Regards, > > -- > William Lallemand > Hi William, Thanks for the your input. I've included a patch below against current master that I hope conforms to the contribution guidelines well enough. :) A couple of thoughts on my work: * Having to include a file directly from OpenSSL seems unfortunate, but OK in the context of the preprocessor guard * The comment is perhaps redundant, but I don't think the side effect of the OpenSSL function is obvious from its name otherwise * My reading of RAND_keep_random_devices_open is that it expects OpenSSL rand_lib initialisation to have occurred already, and it will do it if not. So it seems possible that this function call could incur some delays if rand_lib is not yet initialised and the entropy sources cause delay, etc. However, I don't know how big a concern that is. Any thoughts? Rob >From 7f432956cd7a11837c2657944b5e037c510645c7 Mon Sep 17 00:00:00 2001 From: Rob Allen <robert.all...@ibm.com> Date: Fri, 3 May 2019 09:11:32 +0100 Subject: [PATCH] BUG/MINOR: mworker: close OpenSSL FDs on reload >From OpenSSL 1.1.1, the default behaviour is to maintain open FDs to any random devices that get used by the random number library. As a result, those FDs leak when the master re-execs on reload; since those FDs are not marked FD_CLOEXEC or O_CLOEXEC, they also get inherited by children. Eventually both master and children run out of FDs. OpenSSL 1.1.1 introduces a new function to control whether the random devices are kept open. When clearing the keep-open flag, it also closes any currently open FDs, so it can be used to clean-up open FDs too. Therefore, a call to this function is made in mworker_reload prior to re-exec. --- src/haproxy.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/haproxy.c b/src/haproxy.c index 603f084c..cc689b62 100644 --- a/src/haproxy.c +++ b/src/haproxy.c @@ -127,6 +127,7 @@ #include <proto/vars.h> #ifdef USE_OPENSSL #include <proto/ssl_sock.h> +#include <openssl/rand.h> #endif /* array of init calls for older platforms */ @@ -587,6 +588,10 @@ void mworker_reload() ptdf->fct(); if (fdtab) deinit_pollers(); +#if defined(USE_OPENSSL) && (OPENSSL_VERSION_NUMBER >= 0x10101000L) + /* close random device FDs */ + RAND_keep_random_devices_open(0); +#endif /* restore the initial FD limits */ limit.rlim_cur = rlim_fd_cur_at_boot; -- 2.20.1 (Apple Git-117) Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU