Hi Marius, Marius Bakke <mba...@fastmail.com> skribis:
> Ludovic Courtès <l...@gnu.org> writes: [...] >> I read some of these, and our ‘urandom-seed-service-type’ has the same >> bug as <https://github.com/systemd/systemd/issues/4271>. Namely, we >> write the previous seed to /dev/urandom but we don’t credit the >> entropy. >> >> The attached patch fixes that, and I think it should fix the problem you >> reported. Could people give it a try? > > Good catch, LGTM. Unfortunately it does not fix the problem. > >> I’m interested in seeing the value of >> /proc/sys/kernel/random/entropy_avail with and without this patch right >> after boot (don’t try it in ‘guix system vm’ because there’s no seed >> there.) > > before - 243 > after - 2419 Is that with or without sshd running? Do we have strong evidence that sshd is stuck in getrandom(2)? That seems weird to me because we use #:pid-file for sshd, and thus either sshd produces its PID file and we’re done (‘ssh-daemon’ is considered started and life goes on), or sshd fails to produce its PID file within 15 seconds and we kill it and consider that ‘ssh-daemon’ failed to start. This only way this can hang is if sshd calls getrandom(2) before daemonizing. Looking at ‘main’ in sshd.c, I see: seed_rng(); […] already_daemon = daemonized(); which I think means sshd indeed calls getrandom(2) before it has forked. That explains the situation. :-/ (‘seed_rng’ uses ‘RAND_status’ from OpenSSL, which supports several methods but presumably defaults to getrandom(2)?) > I don't know why this change was insufficient. Perhaps the kernel > does not consider such a seed alone trustworthy enough? I also tried to > increase the seed size to no avail. Can you try to do: (add-to-entropy-count urandom (expt 2 17)) to see if that changes anything at all? I checked with strace and the RNDADDTOENTCNT binding seems to be passing its argument as expected. > I found this patch in the 5.4 kernel tree after reading the commit log > of random.c: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55 > > ...which *does* solve the problem. > > The comments in the merge commit suggests that it is not necessarily a > good solution, so I think we should let it "settle" a bit upstream > before pushing it. It does look rather sledgehammer-y... > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55 > > Thoughts? If it has to be that way, we can use this patch and we can always remove it later if we have a better solution. At any rate, I’d rather not block ‘core-updates’ any longer. Thoughts? Thanks for investigating! Ludo’.