Bug#906276: chrony: System startup is blocked for minutes long

2018-08-19 Thread Vincent Blut

Hey Paul,

On Sun, Aug 19, 2018 at 08:07:37AM +0200, Paul Gevers wrote:
> Hi Vincent,
>
> On 18-08-18 17:49, Vincent Blut wrote:
>>> Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
>>> (without GRND_NONBLOCK) block if insufficient entropy is available. This
>>> causes applications to hang.
>>>
>>> Maybe this is the reason.
>>
>> Absolutely Paul, this is the root cause of our issue. I pushed a fix¹ on
>> salsa (plus a few more things), that would be great if you could upload
>> that.
>
> Do you know if this issue is going to appear in stretch as well? Is
> CVE-2018-1108 going to be fixed there?

We should be safe. Stretch was released with chrony 3.0, which neither
uses getrandom(2) nor tries to call this system call through syscall(2).
getrandom(2) usage appeared in chrony 3.2.

> Building and uploading now.

Awesome, thanks a lot!

> Paul


Have a good day,
Vincent
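
For readers unfamiliar with the syscall(2) route mentioned above: a program built against a C library that has no getrandom() wrapper (glibc only gained one in 2.25) can still reach the system call directly. The following is a minimal sketch of that pattern, not code taken from chrony; the helper name raw_getrandom is hypothetical, and the fallback definition of GRND_NONBLOCK assumes the value used by the Linux UAPI headers.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef GRND_NONBLOCK
#define GRND_NONBLOCK 0x0001  /* assumed: value from the Linux UAPI header linux/random.h */
#endif

/* Hypothetical helper: invoke getrandom through syscall(2).
 * Returns the number of bytes written, or -1 with errno set
 * (EAGAIN when GRND_NONBLOCK is given and the pool is not ready,
 * ENOSYS when the kernel or the headers do not know the call). */
static long raw_getrandom(void *buf, size_t len, unsigned int flags)
{
#ifdef SYS_getrandom
  return syscall(SYS_getrandom, buf, len, flags);
#else
  (void)buf;
  (void)len;
  (void)flags;
  errno = ENOSYS;
  return -1;
#endif
}
```

A caller would treat a return value of -1 as "no bytes available right now" and decide for itself whether to retry or to use another source.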




Bug#906276: chrony: System startup is blocked for minutes long

2018-08-19 Thread Paul Gevers
Hi Vincent,

On 18-08-18 17:49, Vincent Blut wrote:
>> Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
>> (without GRND_NONBLOCK) block if insufficient entropy is available. This
>> causes applications to hang.
>>
>> Maybe this is the reason.
> 
> Absolutely Paul, this is the root cause of our issue. I pushed a fix¹ on
> salsa (plus a few more things), that would be great if you could upload
> that.

Do you know if this issue is going to appear in stretch as well? Is
CVE-2018-1108 going to be fixed there?

Building and uploading now.

Paul





Bug#906276: chrony: System startup is blocked for minutes long

2018-08-18 Thread Vincent Blut

Control: tags -1 pending

Hello,

On Fri, Aug 17, 2018 at 08:03:41PM +0200, Paul Gevers wrote:
> Hi Vincent,
>
> On 16-08-18 16:52, Vincent Blut wrote:
>> I was aware of this issue but I refrained from backporting 7c5bd948bb7e
>> (util: fall back to reading /dev/urandom when getrandom() blocks) as
>> like you said, nobody sent me a bug report, neither publicly nor
>> privately, about this issue. However your bug report clearly shows that
>> this problem has to be taken seriously. I’ll work on this on the
>> weekend, hope that’s ok?
>
> Quote I read today:
>
> Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
> (without GRND_NONBLOCK) block if insufficient entropy is available. This
> causes applications to hang.
>
> Maybe this is the reason.

Absolutely Paul, this is the root cause of our issue. I pushed a fix¹ on
salsa (plus a few more things), that would be great if you could upload
that.

> Paul


Thanks,
Vincent


¹ https://salsa.debian.org/debian/chrony/commit/c7b83da8d07cb0021b17502b520c04914a62b1af




Bug#906276: chrony: System startup is blocked for minutes long

2018-08-17 Thread Gustavo Serra Scalet

On 16/08/2018 11:52, Vincent Blut wrote:
> However your bug report clearly shows that this problem has to be taken
> seriously. I’ll work on this on the weekend, hope that’s ok?


Of course, no problems. Thanks!

--
Gustavo Serra Scalet



Bug#906276: chrony: System startup is blocked for minutes long

2018-08-17 Thread Paul Gevers
Hi Vincent,

On 16-08-18 16:52, Vincent Blut wrote:
> I was aware of this issue but I refrained from backporting 7c5bd948bb7e
> (util: fall back to reading /dev/urandom when getrandom() blocks) as
> like you said, nobody sent me a bug report, neither publicly nor
> privately, about this issue. However your bug report clearly shows that
> this problem has to be taken seriously. I’ll work on this on the
> weekend, hope that’s ok?

Quote I read today:

Linux 4.16 fixed CVE-2018-1108 by making the getrandom system call
(without GRND_NONBLOCK) block if insufficient entropy is available. This
causes applications to hang.

Maybe this is the reason.

Paul





Bug#906276: chrony: System startup is blocked for minutes long

2018-08-16 Thread Vincent Blut

Hi Gustavo,

On Thu, Aug 16, 2018 at 01:09:39PM +, Gustavo Scalet wrote:
> Package: chrony
> Version: 3.3-2
> Severity: important
> Tags: patch
>
> Dear Maintainer,
>
> When trying out buster using fai-cloud-image scripts on Google cloud I
> noticed that the system took around 180 seconds to boot rather than 15
> seconds (stretch).

By the way, in the systemd world, the default timeout for starting a
unit is 90s. So I assume that chrony just fails to start before enough
entropy has been gathered‽

> After investigating, I detected it was a lack of entropy early on
> system startup that caused chrony to be blocked when calling
> getrandom(). That is an issue being reported on different
> projects[1][2] but I didn't see anyone reporting it for chrony at the
> moment. (Maybe the lack of entropy was not spotted when using buster
> outside of cloud providers?)

Sure, generating entropy on a VM can be quite problematic. To prevent
entropy starvation in guests, I make use of VirtIO RNG which is probably
the reason why I’m not affected by this issue.

> The upstream project is patched already[3], but there is no new release
> for the moment. I contacted the maintainer[4] and there should be a new
> release in the following month that would contain that fix[5]. I chose
> to report this bug and provide a patch so that others do not have to
> face this issue, whose cause is not trivial to diagnose.

I was aware of this issue but I refrained from backporting 7c5bd948bb7e
(util: fall back to reading /dev/urandom when getrandom() blocks) as
like you said, nobody sent me a bug report, neither publicly nor
privately, about this issue. However your bug report clearly shows that
this problem has to be taken seriously. I’ll work on this on the
weekend, hope that’s ok?


Cheers,
Vincent
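
To check whether a guest is short on entropy, as discussed above, one option is to read the kernel's own estimate from /proc/sys/kernel/random/entropy_avail. The sketch below assumes a Linux system with procfs mounted; the function name entropy_avail is made up for illustration and is not part of chrony.

```c
#include <stdio.h>

/* Hypothetical helper: return the kernel's entropy estimate in bits,
 * or -1 if the procfs file cannot be opened or parsed. */
static int entropy_avail(void)
{
  FILE *f = fopen("/proc/sys/kernel/random/entropy_avail", "r");
  int bits = -1;

  if (!f)
    return -1;

  /* The file holds a single decimal integer followed by a newline. */
  if (fscanf(f, "%d", &bits) != 1)
    bits = -1;

  fclose(f);
  return bits;
}
```

Logging this value early in boot, before and after chrony is started, is one way to confirm that a slow start coincides with a nearly empty pool.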




Bug#906276: chrony: System startup is blocked for minutes long

2018-08-16 Thread Gustavo Scalet
Package: chrony
Version: 3.3-2
Severity: important
Tags: patch

Dear Maintainer,

When trying out buster using fai-cloud-image scripts on Google cloud I
noticed that the system took around 180 seconds to boot rather than 15
seconds (stretch).

After investigating, I detected it was a lack of entropy early on system
startup that caused chrony to be blocked when calling getrandom(). That
is an issue being reported on different projects[1][2] but I didn't see
anyone reporting it for chrony at the moment. (Maybe the lack of entropy
was not spotted when using buster outside of cloud providers?)

The upstream project is patched already[3], but there is no new release
for the moment. I contacted the maintainer[4] and there should be a new
release in the following month that would contain that fix[5]. I chose
to report this bug and provide a patch so that others do not have to
face this issue, whose cause is not trivial to diagnose.

This kind of bug has also been discussed recently by the Debian
community[6].

[1] https://github.com/libressl-portable/portable/issues/274
[2] https://github.com/openbsd/src/commit/edb2eeb7da8494998d0073f8aaeb8478cee5e00b
[3] https://git.tuxfamily.org/chrony/chrony.git/commit/?id=7c5bd948bb7e21fa0ee22f29e97748b2d0360319
[4] https://www.mail-archive.com/chrony-dev@chrony.tuxfamily.org/msg01898.html
[5] https://www.mail-archive.com/chrony-dev@chrony.tuxfamily.org/msg01899.html
[6] https://lists.debian.org/debian-release/2018/05/msg00130.html

-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 4.17.0-1-amd64 (SMP w/1 CPU core)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages chrony depends on:
ii  adduser      3.117
ii  iproute2     4.17.0-2
ii  libc6        2.27-5
ii  libcap2      1:2.25-1.2
ii  libedit2     3.1-20180525-1
ii  libnettle6   3.4-1
ii  libseccomp2  2.3.3-3
ii  lsb-base     9.20170808
ii  ucf          3.0038

chrony recommends no packages.

Versions of packages chrony suggests:
pn  dnsutils  

-- debconf information excluded
--- chrony-3.3.orig/util.c
+++ chrony-3.3/util.c
@@ -1224,7 +1224,7 @@ get_random_bytes_getrandom(char *buf, un
       if (disabled)
         break;
 
-      if (getrandom(rand_buf, sizeof (rand_buf), 0) != sizeof (rand_buf)) {
+      if (getrandom(rand_buf, sizeof (rand_buf), GRND_NONBLOCK) != sizeof (rand_buf)) {
         disabled = 1;
         break;
       }
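
The patch above relies on the caller falling back to another source once getrandom() reports that it would block. As a rough, standalone illustration of that idea (not the chrony implementation; the helper names fill_random and read_urandom are hypothetical), a non-blocking getrandom() call with a /dev/urandom fallback could look like this, assuming glibc 2.25 or later for <sys/random.h>:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>  /* getrandom(), GRND_NONBLOCK (glibc >= 2.25) */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from /dev/urandom.
 * Returns 0 on success, -1 on error. Reads from /dev/urandom do not
 * block, even before the kernel pool is initialized, so bytes obtained
 * very early in boot may be of lower quality. */
static int read_urandom(unsigned char *buf, size_t len)
{
  size_t done = 0;
  int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);

  if (fd < 0)
    return -1;

  while (done < len) {
    ssize_t r = read(fd, buf + done, len - done);
    if (r < 0 && errno == EINTR)
      continue;
    if (r <= 0) {
      close(fd);
      return -1;
    }
    done += (size_t)r;
  }

  close(fd);
  return 0;
}

/* Hypothetical helper: fill buf without ever blocking on the entropy pool.
 * Tries getrandom() with GRND_NONBLOCK first; on any failure other than
 * EINTR (for example EAGAIN when the pool is not ready, or ENOSYS on an
 * old kernel) it reads the remaining bytes from /dev/urandom. */
static int fill_random(unsigned char *buf, size_t len)
{
  size_t done = 0;

  while (done < len) {
    ssize_t r = getrandom(buf + done, len - done, GRND_NONBLOCK);
    if (r < 0) {
      if (errno == EINTR)
        continue;
      return read_urandom(buf + done, len - done);
    }
    done += (size_t)r;
  }

  return 0;
}
```

The trade-off is the one this thread is about: the caller never hangs at boot, at the cost of possibly using bytes drawn before the pool is fully seeded.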