Bug#906276: chrony: System startup is blocked for minutes long
Hey Paul,

On Sun, Aug 19, 2018 at 08:07:37AM +0200, Paul Gevers wrote:
> Hi Vincent,
>
> On 18-08-18 17:49, Vincent Blut wrote:
>>> Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
>>> (without GRND_NONBLOCK) block if insufficient entropy is available. This
>>> causes applications to hang.
>>>
>>> Maybe this is the reason.
>>
>> Absolutely Paul, this is the root cause of our issue. I pushed a fix¹ on
>> salsa (plus a few more things), that would be great if you could upload
>> that.
>
> Do you know if this issue is going to appear in stretch as well? Is
> CVE-2018-1108 going to be fixed there?

We should be safe. Stretch has been released with chrony 3.0 which does not use getrandom(2) nor does it try to call this system call through syscall(2). getrandom(2) usage appeared in chrony 3.2.

> Build and uploading now.

Awesome, thanks a lot!

> Paul

Have a good day,
Vincent
Bug#906276: chrony: System startup is blocked for minutes long
Hi Vincent,

On 18-08-18 17:49, Vincent Blut wrote:
>> Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
>> (without GRND_NONBLOCK) block if insufficient entropy is available. This
>> causes applications to hang.
>>
>> Maybe this is the reason.
>
> Absolutely Paul, this is the root cause of our issue. I pushed a fix¹ on
> salsa (plus a few more things), that would be great if you could upload
> that.

Do you know if this issue is going to appear in stretch as well? Is CVE-2018-1108 going to be fixed there?

Build and uploading now.

Paul
Bug#906276: chrony: System startup is blocked for minutes long
Control: tags -1 pending

Hello,

On Fri, Aug 17, 2018 at 08:03:41PM +0200, Paul Gevers wrote:
> Hi Vincent,
>
> On 16-08-18 16:52, Vincent Blut wrote:
>> I was aware of this issue but I refrained from backporting 7c5bd948bb7e
>> (util: fall back to reading /dev/urandom when getrandom() blocks) as
>> like you said, nobody sent me a bug report, neither publicly nor
>> privately, about this issue. However your bug report clearly shows that
>> this problem has to be taken seriously. I’ll work on this on the
>> weekend, hope that’s ok?
>
> Quote I read today:
> Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
> (without GRND_NONBLOCK) block if insufficient entropy is available. This
> causes applications to hang.
>
> Maybe this is the reason.

Absolutely Paul, this is the root cause of our issue. I pushed a fix¹ on salsa (plus a few more things), that would be great if you could upload that.

> Paul

Thanks,
Vincent

¹ https://salsa.debian.org/debian/chrony/commit/c7b83da8d07cb0021b17502b520c04914a62b1af
Bug#906276: chrony: System startup is blocked for minutes long
On 16/08/2018 11:52, Vincent Blut wrote:
> However your bug report clearly shows that this problem has to be taken
> seriously. I’ll work on this on the weekend, hope that’s ok?

Of course, no problems. Thanks!

--
Gustavo Serra Scalet
Bug#906276: chrony: System startup is blocked for minutes long
Hi Vincent,

On 16-08-18 16:52, Vincent Blut wrote:
> I was aware of this issue but I refrained from backporting 7c5bd948bb7e
> (util: fall back to reading /dev/urandom when getrandom() blocks) as
> like you said, nobody sent me a bug report, neither publicly nor
> privately, about this issue. However your bug report clearly shows that
> this problem has to be taken seriously. I’ll work on this on the
> weekend, hope that’s ok?

Quote I read today:
Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call (without GRND_NONBLOCK) block if insufficient entropy is available. This causes applications to hang.

Maybe this is the reason.

Paul
Bug#906276: chrony: System startup is blocked for minutes long
Hi Gustavo,

On Thu, Aug 16, 2018 at 01:09:39PM +, Gustavo Scalet wrote:
> Package: chrony
> Version: 3.3-2
> Severity: important
> Tags: patch
>
> Dear Maintainer,
>
> When trying out buster using fai-cloud-image scripts on Google cloud I
> noticed that system took around 180 seconds to boot rather than 15
> seconds (stretch).

By the way, in the systemd world, the default timeout for starting a unit is 90s. So I assume that chrony just fails to start before enough entropy has been gathered‽

> After investigating, I detected it was a lack of entropy early on system
> startup that caused chrony to be blocked when calling getrandom(). That
> is an issue being reported on different projects[1][2] but I didn't see
> anyone reporting it for chrony at the moment. (Maybe the lack of entropy
> was not spotted when using buster outside of cloud providers?)

Sure, generating entropy on a VM can be quite problematic. To prevent entropy starvation in guests, I make use of VirtIO RNG which is probably the reason why I’m not affected by this issue.

> The upstream project is patched already[3], but there is no new release
> for the moment. I contacted the maintainer[4] and there should be a new
> release in the following month that would contain that fix[5]. I chose
> to report this bug and provide a patch in order to avoid others facing
> this issue which is not so trivial to understand what is going on.

I was aware of this issue but I refrained from backporting 7c5bd948bb7e (util: fall back to reading /dev/urandom when getrandom() blocks) as, like you said, nobody sent me a bug report, neither publicly nor privately, about this issue. However your bug report clearly shows that this problem has to be taken seriously. I’ll work on this on the weekend, hope that’s ok?

Cheers,
Vincent
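For anyone trying to confirm the entropy starvation discussed above on their own guest, here is a minimal sketch of a helper that reads the kernel's entropy estimate. The function name `read_entropy_avail` is made up for illustration and is not part of chrony; the path `/proc/sys/kernel/random/entropy_avail` is the usual Linux location, passed in as a parameter so the helper can be exercised against any file.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not chrony code): parse the integer stored in
 * `path`, typically /proc/sys/kernel/random/entropy_avail on Linux.
 * Returns the parsed value, or -1 if the file cannot be opened or
 * does not start with an integer.
 */
long read_entropy_avail(const char *path)
{
  FILE *f = fopen(path, "r");
  long bits = -1;

  if (f == NULL)
    return -1;

  if (fscanf(f, "%ld", &bits) != 1)
    bits = -1;

  fclose(f);
  return bits;
}
```

Calling `read_entropy_avail("/proc/sys/kernel/random/entropy_avail")` early in boot, with and without a VirtIO RNG device attached, should give a rough idea of whether the guest is starved. Treat the number as a hint only: how the kernel computes and reports it varies between kernel versions.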
Bug#906276: chrony: System startup is blocked for minutes long
Package: chrony
Version: 3.3-2
Severity: important
Tags: patch

Dear Maintainer,

When trying out buster using fai-cloud-image scripts on Google cloud I noticed that system took around 180 seconds to boot rather than 15 seconds (stretch).

After investigating, I detected it was a lack of entropy early on system startup that caused chrony to be blocked when calling getrandom(). That is an issue being reported on different projects[1][2] but I didn't see anyone reporting it for chrony at the moment. (Maybe the lack of entropy was not spotted when using buster outside of cloud providers?)

The upstream project is patched already[3], but there is no new release for the moment. I contacted the maintainer[4] and there should be a new release in the following month that would contain that fix[5]. I chose to report this bug and provide a patch in order to avoid others facing this issue which is not so trivial to understand what is going on.

Also this kind of bug is lately being discussed by debian community[6]

[1] https://github.com/libressl-portable/portable/issues/274
[2] https://github.com/openbsd/src/commit/edb2eeb7da8494998d0073f8aaeb8478cee5e00b
[3] https://git.tuxfamily.org/chrony/chrony.git/commit/?id=7c5bd948bb7e21fa0ee22f29e97748b2d0360319
[4] https://www.mail-archive.com/chrony-dev@chrony.tuxfamily.org/msg01898.html
[5] https://www.mail-archive.com/chrony-dev@chrony.tuxfamily.org/msg01899.html
[6] https://lists.debian.org/debian-release/2018/05/msg00130.html

-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.17.0-1-amd64 (SMP w/1 CPU core)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages chrony depends on:
ii  adduser      3.117
ii  iproute2     4.17.0-2
ii  libc6        2.27-5
ii  libcap2      1:2.25-1.2
ii  libedit2     3.1-20180525-1
ii  libnettle6   3.4-1
ii  libseccomp2  2.3.3-3
ii  lsb-base     9.20170808
ii  ucf          3.0038

chrony recommends no packages.

Versions of packages chrony suggests:
pn  dnsutils

-- debconf information excluded

--- chrony-3.3.orig/util.c
+++ chrony-3.3/util.c
@@ -1224,7 +1224,7 @@ get_random_bytes_getrandom(char *buf, un
     if (disabled)
       break;
 
-    if (getrandom(rand_buf, sizeof (rand_buf), 0) != sizeof (rand_buf)) {
+    if (getrandom(rand_buf, sizeof (rand_buf), GRND_NONBLOCK) != sizeof (rand_buf)) {
       disabled = 1;
       break;
     }
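
To make the idea behind the patch easier to follow outside of chrony's source tree, here is a standalone sketch of the same pattern: ask getrandom() for bytes with GRND_NONBLOCK and, if it cannot deliver (for example EAGAIN while the pool is not yet initialised), read the remainder from /dev/urandom instead. This is not the upstream code; the names `fill_random_nonblocking` and `read_urandom` are invented for illustration, and it assumes a glibc that provides <sys/random.h> (2.25 or later).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `len` bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure. */
static int read_urandom(unsigned char *buf, size_t len)
{
  size_t done = 0;
  int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);

  if (fd < 0)
    return -1;

  while (done < len) {
    ssize_t n = read(fd, buf + done, len - done);
    if (n < 0 && errno == EINTR)
      continue;               /* interrupted by a signal, retry */
    if (n <= 0)
      break;                  /* real error or unexpected EOF */
    done += (size_t)n;
  }

  close(fd);
  return done == len ? 0 : -1;
}

/* Hypothetical helper: fill `buf` without ever blocking on the entropy
 * pool. Tries getrandom(GRND_NONBLOCK) first and falls back to
 * /dev/urandom for whatever is still missing.
 * Returns 0 on success, -1 if both sources failed. */
int fill_random_nonblocking(unsigned char *buf, size_t len)
{
  size_t done = 0;

  while (done < len) {
    ssize_t n = getrandom(buf + done, len - done, GRND_NONBLOCK);
    if (n > 0) {
      done += (size_t)n;
      continue;
    }
    if (n < 0 && errno == EINTR)
      continue;               /* interrupted by a signal, retry */
    /* EAGAIN (pool not ready), ENOSYS (old kernel) or anything else:
     * do not wait, use /dev/urandom for the rest. */
    return read_urandom(buf + done, len - done);
  }

  return 0;
}
```

The trade-off is the one implied by the discussion: bytes read from /dev/urandom before the kernel pool is initialised may be of lower quality than what a blocking getrandom() would eventually return, so this only fits callers that prefer starting promptly over waiting for full entropy.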