> On 12 Jun 2016, at 07:11, Theodore Ts'o <ty...@mit.edu> wrote:
> 
> On Sat, Jun 11, 2016 at 05:46:29PM -0400, Donald Stufft wrote:
>> 
>> It was a RaspberryPI that ran a shell script on boot that called
>> ssh-keygen.  That shell script could have just as easily been a
>> Python script that called os.urandom via
>> https://github.com/sybrenstuvel/python-rsa instead of a shell script
>> that called ssh-keygen.
> 
> So I'm going to argue that the primary bug was in how the systemd
> init scripts were configured.  In general, creating keypairs at boot
> time is just a bad idea.  They should be created lazily, in a
> just-in-time paradigm.

Agreed. I hope that if there is only one thing every participant has learned 
from this (extremely painful for all concerned) discussion, it’s that doing 
anything that requires really good random numbers should be delayed as long as 
possible on all systems, and should absolutely not be done during the boot 
process on Linux. Don’t generate key pairs, don’t make TLS connections, just 
don’t perform any action that requires really good randomness at all.
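
To make the "just-in-time" idea concrete, here is a minimal sketch of deferring 
secret generation until first use. The LazySecret class and its source hook are 
hypothetical names invented for illustration, and secrets.token_bytes stands in 
for whatever real key generation a program would do.

```python
import secrets
import threading


class LazySecret:
    """Generate secret material on first use instead of at start-up."""

    def __init__(self, nbytes=32, source=secrets.token_bytes):
        self._nbytes = nbytes
        # Hypothetical hook: stands in for a real key generator.
        self._source = source
        self._value = None
        self._lock = threading.Lock()

    def get(self):
        # The lock ensures the secret is generated only once, even if
        # several threads ask for it at the same time.
        with self._lock:
            if self._value is None:
                self._value = self._source(self._nbytes)
            return self._value
```

Constructing a LazySecret at import or boot time consumes no randomness; the 
first call to get() does, by which point the system has usually been running 
for a while.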

> So some people will freak out when the keygen systemd unit hangs,
> blocking the boot --- and other people will freak out if the systemd
> unit doesn't hang, and you get predictable SSH keys --- and some wiser
> folks will be asking the question, why the *heck* is it not
> openssh/systemd's fault for trying to generate keys this early,
> instead of after the first time sshd needs host ssh keys?  If you wait
> until the first time the host ssh keys are needed, then the system is
> fully booted, so it's likely that the entropy will be collected -- and
> even if it isn't, networking will already be brought up, and the
> system will be in multi-user mode, so entropy will be collected very
> quickly.

As far as I know we still only have three programs that were encountering this 
problem: Debian’s autopkgtest (which was patched to set PYTHONHASHSEED=0), 
systemd-cron (which is moving from Python to Rust anyway), and cloud-init (not 
formally reported but mentioned to me by a third party). It remains unclear to 
me why the systemd-cron service files can’t simply request to be delayed until 
the kernel CSPRNG is seeded: I guess systemd doesn’t have any way to express 
that constraint? Perhaps it should.
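
In the absence of such a constraint, a service could do the waiting itself. The 
sketch below assumes Python 3.6 or later on Linux 3.17 or later, where 
os.getrandom() called without flags blocks until the kernel CSPRNG has been 
initialised. The helper name wait_for_kernel_csprng and its getrandom parameter 
are hypothetical and exist only for illustration and testing.

```python
import os


def wait_for_kernel_csprng(getrandom=None):
    """Block until the kernel CSPRNG is seeded.

    Returns True once a byte has been read, or False when the platform
    offers no getrandom() and the seeding state cannot be determined.
    """
    if getrandom is None:
        getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: without getrandom() we report "unknown" rather
        # than guess.
        return False
    # With no flags, this call blocks until the pool is initialised.
    getrandom(1)
    return True
```

A service entry point could call this once before doing anything that depends 
on good randomness, accepting that the call may stall early in boot.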

Of this set, only cloud-init worries me, and for the *opposite* reason to the 
one that worries Guido and Larry. They are worried that programs like 
cloud-init will be delayed by two minutes while they wait for entropy: that’s 
an understandable concern. I’m much more worried that programs like cloud-init 
may attempt to establish TLS connections or create keys during this two-minute 
window, leaving them staring down the possibility of performing “secure” 
actions with insecure keys.

This is why I advocate, like Donald does, for having *some* tool in Python that 
allows Python programs to crash if they attempt to generate cryptographically 
secure random bytes on a system that is incapable of providing them (which, in 
practice, can only happen on Linux systems). I don’t care how it’s spelled, I 
just care that programs that want to use a properly-seeded CSPRNG can error out 
effectively when one is not available. That allows us to ensure that Python 
programs that want to do TLS or build key pairs correctly refuse to do so when 
used in this state, *and* that they provide a clearly debuggable reason for why 
they refused. That allows the savvy application developers that Ted talked 
about to make their own decisions about whether their rapid startup is 
sufficiently important to take the risk.
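
As one possible spelling, and only a sketch: on Python 3.6 or later running on 
Linux 3.17 or later, os.getrandom() with os.GRND_NONBLOCK raises 
BlockingIOError when the kernel CSPRNG is not yet seeded. The names 
EntropyUnavailable and strict_random_bytes below are hypothetical, as is the 
getrandom parameter, and the fallback to os.urandom() on platforms without 
getrandom() is an assumption rather than a recommendation.

```python
import os


class EntropyUnavailable(RuntimeError):
    """Raised when the kernel CSPRNG has not been seeded yet."""


def strict_random_bytes(n, getrandom=None):
    """Return n random bytes, or fail loudly instead of blocking."""
    if getrandom is None:
        getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: platforms without getrandom() are trusted as-is.
        return os.urandom(n)
    flags = getattr(os, "GRND_NONBLOCK", 0)
    buf = b""
    try:
        # getrandom() may return fewer bytes than requested, so loop.
        while len(buf) < n:
            buf += getrandom(n - len(buf), flags)
    except BlockingIOError as exc:
        raise EntropyUnavailable(
            "kernel CSPRNG is not seeded; refusing to produce random bytes"
        ) from exc
    return buf
```

A program that wants keys or TLS can call strict_random_bytes() first and let 
the exception surface, which gives the clearly debuggable failure described 
above; a program that prefers fast start-up can catch it and decide for itself.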

Cory


[0]: https://github.com/systemd-cron/systemd-cron/issues/43#issuecomment-160343989


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev