On 07/05/2016 06:42 AM, Etienne Champetier wrote:
>> You've implied that this new API can block until it's initialized, which
>> reading from /dev/random can already do, and presumably
>> select/poll/inotify could do on /dev/random without consuming entropy.
> 
> As shown in my mail from 29 june 2016 at 15:54
> [   14.321536] ### getrandom ###
> [   42.603677] ### dd ###
> 
> getrandom() wait until /dev/urandom is initialized, which can be way
> before /dev/random is initialized

Ok, this is a new assertion. How does that work? How would /dev/urandom
get random data to seed it without /dev/random having enough entropy for
one byte of output?

>> I gave a trivial command line example allowing an init script to block
>> waiting for the entropy pool to be initialized, with a timeout that
>> continues on after a reasonable time period if it isn't.
> 
> I don't want to timeout, I want to wait until /dev/urandom is
> initialized to be "crypto safe"

Then don't use "timeout", although on a headless embedded system you're
potentially waiting _forever_. Do a blocking single byte read from
/dev/random.
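In init-script terms, something like this (a sketch: the one-byte read
returns only once the pool has been initialized; it does consume that
byte, unlike feeding it back with dd as discussed further down):

```shell
# Block until the kernel entropy pool is initialized: a read from
# /dev/random only returns after the pool has been seeded.
dd if=/dev/random of=/dev/null bs=1 count=1 2>/dev/null && echo "pool ready"
```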

>> Yes it read one
>> byte from the entropy pool because it was off the top of my head and I
>> didn't want to look at the inotifyd --help output and experiment to make
>> sure that's wired up
> 
> You don't want inotify but select() or poll(), is there such utility
> in busybox? (real question, haven't found)

Hmmm, not seeing one.

>> because in the past 15 years it's never been a
>> _problem_ for me.
> 
> See Bastian mail again, this can be a problem for other

I've read his mail. I'm aware of the problem he refers to. I don't see
how your solution to it is better than the existing one.

>> You've _also_ implied that this new API can provide "good enough" random
>> information before the random pool is initialized,
> 
> To be sure we are on the same page here, getrandom() provides
> cryptographically secure random numbers
> from fully initialized /dev/urandom pool, before /dev/random pool is
> initialized, but they are still cryptographically secure.

How? /dev/urandom and /dev/random pull from the same pool. If it's
ignoring the entropy accounting, your definition of "cryptographically
secure" differs from the conventional one.

> If you think /dev/random is more secure than getrandom() (ie
> /dev/urandom properly initialized),
> here is my mail from 30 june 2016 at 10:12:
> 
> a good read
> http://www.2uo.de/myths-about-urandom/
> 
> Also
> https://en.wikipedia.org/wiki//dev/random
> 
> Is there any serious argument that adding new entropy all the time is
> a good thing?

Apparently, you. Everybody else agrees it's a good thing.

> The Linux /dev/urandom manual page claims that without
> new entropy the user is "theoretically vulnerable to a cryptographic
> attack",[16] but (as I've mentioned in various venues) this is a
> ludicrous argument—how can anyone simultaneously believe that
> 
> we can't figure out how to deterministically expand one 256-bit secret
> into an endless stream of unpredictable keys (this is what we need
> from urandom), but
> we can figure out how to use a single key to safely encrypt many
> messages (this is what we need from SSL, PGP, etc.)?

There are a number of people out there who believe the various black
budget agencies _can_ crack PGP, but that's a separate argument. :)

These days people don't use SSL, they use TLS to encrypt things:

  https://en.wikipedia.org/wiki/POODLE

The reason? People studied the old algorithms long enough to take them
apart and make them insecure. There's a drinking game for this from the
lady who did this "lifetime of crypto hash functions" table:

  http://valerieaurora.org/hash.html

The theory is if you use one algorithm to generate persistent keys for
another algorithm, you're stacking potential vulnerabilities.

Any time you're using a rotating hash, people can look at multiple
outputs to get more info about the state of the secret stuff. A "known
plaintext attack" is a common way of breaking encryption algorithms
where you get a copy of both the encrypted and the unencrypted data and
use that to _hugely_ speed up the amount of time it takes to guess the
key. In your case, your known plaintext is basically all zeroes because
you're not mixing any message in with it! Yes, used in that context any
algorithm will be MUCH weaker than using it to encrypt things like
tarballs. (And compressing plaintext before encrypting it is fairly
common because you want to raise the entropy of the message: "high bit
is always zero" is a _little_ bit of known plaintext. :)

Also, a problem with cryptographic screwups is it's really hard for
people to NOTICE them:

  https://lists.debian.org/debian-security-announce/2008/msg00152.html


https://threatpost.com/audit-of-github-ssh-keys-finds-many-still-vulnerable-to-old-debian-bug/113117/

Pretty much by definition crypto exploits are something the developer
didn't think of, so cryptographers tend to be paranoid.

>> which is what
>> /dev/urandom is for. You've implied that your new function call is
>> somehow superior to /dev/urandom because you're magically generating
>> entropy when the entropy pool is empty, in userspace, somehow.
> 
> no no and no, this syscall block until /dev/urandom is initialized
> with 128bits of entropy

dd if=/dev/random of=/dev/random bs=16 count=1

There, you've blocked until there were 128 bits of entropy and you
haven't actually consumed any entropy because you fed it right back.

If the entropy pool contains 128 bits but /dev/random won't give it to
userspace, file a bug with the kernel guys.

> good luck bruteforcing 2^128. Please read my 2 previous links again,
> and google for "myth about /dev/random".

Brute forcing symmetric keys is different from breaking algorithmic
output. If you don't understand that much, I'm not very interested in
your opinions on crypto.

>> (This is
>> probably what set people going "you can cat /proc/stuff to /dev/random
>> to feed crap into it and make it think it's got entropy which is at
>> least machine-local". Ways of dealing with an empty entropy pool.)
> 
> cat into /dev/random mix what you give it but doesn't change the
> entropy estimation

You're talking about reading from /dev/urandom which isn't going to
block based on entropy accounting.

>> You have NOT explained what the point of your new interface is, what it
>> does that the existing one CAN'T do,
> 
> get random bytes from /dev/urandom once it's properly seeded,
> and don't wait another 30s for /dev/random to unblock

If /dev/urandom is "properly seeded" but /dev/random won't give you 128
bits of data when you read that much from it, file a kernel bug.

>> or why it demands a new command line utility.
> 
> getrandom() is a syscall and not a /proc file because they (kernel
> maintainers) wanted to fix another problem,
> file descriptor exhaustion.
>
> They didn't change /dev/urandom to block
> because this would have break userspace
> Please read https://lwn.net/Articles/605828/

I did, back when it came out.

That version is full of typos, you want
http://man7.org/linux/man-pages/man2/getrandom.2.html these days. But
let's cut and paste from the page you linked:

> The getrandom(2) system call was requested by the LibreSSL Portable
> developers.  It is analoguous to the getentropy(2) system call in
> OpenBSD.
> 
> The rationale of this system call is to provide resiliance against
> file descriptor exhaustion attacks, where the attacker consumes all
> available file descriptors, forcing the use of the fallback code where
> /dev/[u]random is not available.  Since the fallback code is often not
> well-tested, it is better to eliminate this potential failure mode
> entirely.

So to save a file descriptor, you're making an applet.

When you launch a new process, it's got stdin/stdout/stderr plus a
hidden file descriptor for its executable mmap. So if the file
descriptor table is exhausted, how exactly do you launch a new
executable? (in kernel/fork.c copy_files() is calling dup_fd() in
fs/file.c which sets error to -ENOMEM if it can't do the thing...)

The rationale you just linked to CANNOT APPLY to the context you're
talking about here. (Unless you mean ulimit exhaustion but not global
system fd exhaustion...?)

The rest of the stuff about blocking until you have enough entropy
doesn't say that reading one byte from /dev/random would return later
than getrandom(buf, 1, 0). (If you're waiting for 128 bits and you've
got 128 bits, why is reading from /dev/random different than this
function call?)

> This can be an applet or a new var in busybox shell ($GETRANDOM ?),
> but i can't right now from the shell access getrandom() or a similar
> behaviour

Let's look at what ash actually does, shall we?

//config:config ASH_RANDOM_SUPPORT
//config:  bool "Pseudorandom generator and $RANDOM variable"
//config:  default y
//config:  depends on ASH
//config:  help
//config:    Enable pseudorandom generator and dynamic variable "$RANDOM".
//config:    Each read of "$RANDOM" will generate a new pseudorandom value.
//config:    You can reset the generator by using a specified start value.
//config:    After "unset RANDOM" the generator will switch off and this
//config:    variable will no longer have special treatment.

It doesn't look at /dev/urandom at all; instead it has a builtin
pseudo-random number generator that isn't affected by any of this.

Sigh.

>> You haven't even justified the lower bar of modifying
>> existing C users (such as the shell's $RANDOM) to use the new API
>> instead.

Modifying busybox ash to use getrandom() instead of the nonsense it's
currently doing would TOTALLY be a good idea. Ick.

That doesn't mean your new proposed applet has a purpose.

> mail from 28 june 2016 at 23:56:
> getrandom() is blocking until properly initialized,
> i don't think we want such API change for $RANDOM

It's Bernhard's call, not mine. I'm busy with
http://lists.landley.net/pipermail/toybox-landley.net/2016-June/008484.html
these days...

>> What is the POINT of your requested change?
> 
> be able to read from /dev/urandom as soon as it is safe, not 30sec
> later (waiting for /dev/random)

You have no idea how long it will block. If there are no entropy
sources, it could block indefinitely. Your "enough entropy for the
/dev/urandom pool to be good enough" and "enough entropy for /dev/random
to give you 128 bits to write back into it" metrics are not defined by
anything you've posted.

>> Your objection here seems to be that the unix "everything is a file"
>> interface is inefficient. That Plan 9 had the wrong idea and we should
>> all be using ioclts the way windows intended.
> 
> no, not at all, kernel dev made it a syscall to prevent file
> descriptor exhaustion,

Because openbsd was written in a very specific way that had legacy problems.

> i would have prefer to also have a /proc/getrandom file

For years I've wanted to split mount -t proc into procfs, sysfs, and
crapfs. Unfortunately, procfs and crapfs need union mounts to coexist.
(Which we _sort_ of have now, I should revisit it...)

Please do not add more crap to crapfs. (We have /sys to add crap to now.)

>> A similar move over in the ps space is https://lwn.net/Articles/633622/
>> but they aren't proposing providing a "newps" command with a different
>> command line interface because API. They aren't expecting _users_ (such
>> as $RANDOM in the shell) to make this difference visible to end users.
>>
>> Why are you doing so? You may _have_ a reason, but you have yet to
>> communicate it.
> 
> My goal is to read from /dev/urandom once it's properly seeded,
> if someone find me a solution in less than 207K i will take it, but I
> myself haven't found one
> (not a joke, i'm really open and ready to recognize a better solution)
> 
> We can't make $RANDOM block else it will break many system
> Also we don't expect $RANDOM to be cryptographically secure, so no
> need to slow it down

Sigh, let's actually try this...

Oh hey, Ubuntu 14.04 doesn't have SYS_getrandom defined in
sys/syscall.h, so your new applet won't build on that. Good to know.

Let's hardwire in the x86-64 syscall number...

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(int argc, char **argv)
{
  char buf[1];

  /* 318 is the getrandom() syscall number on x86-64 */
  printf("%ld\n", syscall(318, buf, 1, 0));
  printf("%d\n", *buf);
}

Ok, works on the host and gives me 1 byte. Now let's scp the static
binary into a freshly booted aboriginal linux image under qemu and:

Freeing unused kernel memory: 536K (ffffffff813c1000 - ffffffff81447000)
Aboriginal Linux 1.4.2
mount: '/dev/hdc'->'/mnt': No medium found
Type exit when done.
(x86_64:1) / $ wget http://landley.net/a.out
Connecting to landley.net (208.113.171.142:80)
a.out                100% |*******************************| 24005
0:00:00 ETA
(x86_64:1) / $ chmod +x a.out
(x86_64:1) / $ time ./a.out
1
-64
real 164.844162
user 0.012000
sys 0.004000
(x86_64:1) / $

It blocked for 2 minutes and 40 seconds waiting for the random pool to
initialize. So the answer to whether it's faster than /dev/random
initializing is at the very least "not significantly".

>>> Maybe kernel maintainers have good reasons to not mix them ...
>>
>> This is called an "appeal to authority", and you're doing it really
>> badly. You haven't got an argument so you say 4 out of 5 doctors prefer
>> your brand of gum, or it was endorsed by a celebrity spokesmodel. That
>> is not a REASON.
> 
> If you have some time, there is a 2 months discussion on LKML to
> rework /dev/random,
> /dev/urandom, with 3 differents code proposals (and multiple versions)

I wander in and out of those, but the point of reworking /dev/random is
that /dev/random then works better.

> I can't sum it up (and i can't find the third one right now)
> https://lwn.net/Articles/684568/
> https://lwn.net/Articles/686398/

I do read lwn.net, thanks.

>> The kernel maintainers added a vdso page with nonblocking userspace
>> gettimeofday() because it was lower latency than the system call. They
>> did not change the libc API for it, nor did anyone change the userspace
>> utilities because of this. Maybe they cared about latency here, I don't
>> know. That would be an example reason for them to merge it which would
>> NOT be a reason to create a new command line utility to use it.
> 
> totally agree, but in getrandom() case it's a new behaviour, blocking,
> before we modify anything we have to ensure we will not break anything,
> that is why i'm introducing a new applet, not modifying existing code

Above I actually _tried_ it, and it did not help.

>> It is not OUR job to figure out this "maybe". You are arguing for a
>> change. If the kernel maintainers had good reasons, link us to their
>> explanation here.
> 
> Also agree that it's my job to explain why busybox need it
> 
> Link to the introduction to getrandom()
> https://lwn.net/Articles/605828/

I opened the first couple of articles you linked, read around 20
screenfuls of text, and was not convinced. Asking me to do extensive
research on your behalf is not helpful.

> getrandom is a new behaviour between /dev/urandom and /dev/random,

No, it _isn't_. I just _tried_ it and the blocking lasted over 2 minutes
in a trivially reproducible test environment.

You're either going to timeout and fall back to something else, or
you're going to initialize from an external source of entropy. Blocking
for an unknown amount of time is the same whether it's a read from
/dev/random  or a call to this function.

> so we can't transparently introduce it whitout breaking stuff.
> busybox target the embeded where /dev/urandom take a long time to be
> properly seeded

So far your new approach provides no obvious benefit.

>>>> BTW, add to that /proc/net/*. I mean, literally every file (unlike /proc/*,
>>>> where you don't want to read everything, "kcore" for one may be "a bit" 
>>>> big).
>>>> There are ~50 files in /proc/net/.
>>>> Just one example. /proc/net/unix has Inode column for unix sockets.
>>>> Those depend on the order how processes have started at boot.
>>>> If there is at least some concurrency, inodes will contain some randomness.
>>>
>>> Network and radio can add some entropy,
>>
>> Externally visible entropy, if you're up against our out-of-control spy
>> agencies it doesn't necessarily help. (Although if they can park a block
>> away and point a yagi antenna at your router there's all sorts of fun
info they can get out of it if you haven't put the sucker in a Faraday
cage.)
> 
> You say it will contain some randomness, but how much?

More than that system previously had.

> Why try to reinvent the wheel and risk being unsafe
> when we have an API that does exactly what we need

What do you think your API does? Have you actually tried it?

>> Sigh, my grandfather worked for the NSA for 40 years (not _entirely_
>> voluntarily, he did crypto during world war II and then they threatened
>> to draft him and put him on the ground in Korea if he didn't voluntarily
>> re-enlist for this new agency they were starting) and although he still
>> won't talk about most of what he did the one takeaway I did get is
>> nobody is really properly paranoid and I will never take any job that
>> requires any sort of security clearance because I refuse to get it on me.
>>
>> But sure, tell me about crypto. Go for it.
> 
> Ok you know about crypto.

No, just more than you seem to. Enough to have a little perspective
about when to defer to the experts and when not to.

> I agree that cat /proc/* > /dev/urandom (or only a few /proc files)
> will add some entropy, but how much? am i crypto safe after?
> getrandom() API ensure that, do you?

Who are you defending against? What is your threat model? If it's the
$NSA (or whatever TLA they've outsourced their functions to now that
group's no longer No Such Agency but some sort of public outreach
program) considering you a person of interest, they probably already
backdoored your hardware before delivery and there is NOTHING you can do
in software to fix that.

http://static1.1.sqspcdn.com/static/f/543048/26931843/1464016046717/A2_SP_2016.pdf
http://sharps.org/wp-content/uploads/BECKER-CHES.pdf

If you're running your crypto on a system with "system management mode"
or using
http://boingboing.net/2016/06/15/intel-x86-processors-ship-with.html
then you're already kinda questionable:

http://arstechnica.com/security/2013/10/meet-badbios-the-mysterious-mac-and-pc-malware-that-jumps-airgaps/
http://www.wired.com/2015/02/nsa-acknowledges-feared-iran-learns-us-cyberattacks/
https://www.wired.com/2014/07/usb-security/
http://hackaday.com/2013/08/02/sprite_tm-ohm2013-talk-hacking-hard-drive-controller-chips/

And so on and so forth...

If your threat model is "my hardware budget is too cheap to give me any
hardware entropy sources in a mass-produced commercial device, but I
don't want every box to have the same darn key", there are various
mitigation options which all suck to different degrees.

If you're trying to increase the cost of mass surveillance so our
out-of-control "security" agencies have to focus on actual targets
rather than
grabbing everything from everyone and keeping it forever, then
incremental steps can help and mostly "not being a plaintext
monoculture" is your most effective thing.

You seem to consider this a black and white "if I do this thing then I
am safe" which is not how any of this has ever worked.

Your new command is not magic. What does, or doesn't, it actually
accomplish?

>>> everything else is pretty non random.
>>> You are reading the same flash, executing the same code, with the same
>>> 1 core cpu, and without a high precision timer.
>>
>> CPU cycle counter is a HPET. It's not a _good_ hpet, but the skew
>> between different timers in the system can provide jitter that can be
>> visible in the cycle counter. (It's not necessarily protected from other
>> users on the box, but if you've got other users on the box executing
>> arbitrary code on a _router_, you're doing it wrong.)
> 
> And it's already what is used for the entropy collections

Which is why the above example didn't hang forever and merely hung for
two and a half minutes.

>>> Crypto 101 is "don't roll your own crypto", the code you are
>>> suggesting will make you feel safe when in reality it will change
>>> almost nothing
>>
>> Yes, and that attitude is why heartbleed happened. Because
>> non-cryptographers never reviewed that code for over a decade and it
>> filled up with badly implemented crap.
>>
>> We are not proposing rolling our own crypto algorithms, we are proposing
>> staying with the tried and tested APIs to the existing kernel crypto
>> algorithms until such a time as someone convincingly argues that a new
>> way of doing things is a significant improvement. Which you have not yet
>> done.
> 
> Using an unseeded CSPRNG is bad, and can leed to duplicate ssh key
> getrandom allow to wait until it's properly seeded without loosing
> time waiting for /dev/random

Hence seeding it from the nanoseconds of the system clock, the MAC
address, the hard drive's load cycle counter (or whatever flash's
version of "SMART" data is, I know they track buckets of that for wear
leveling), and so on to at least make it not produce outright collisions.
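A rough boot-script sketch of that mixing (paths are illustrative, and
note that writing to /dev/urandom mixes the data into the pool without
crediting the entropy estimate, so this avoids collisions but doesn't
make the pool count as "initialized"):

```shell
# Mix machine-local data into the pool. This does NOT make the pool
# "initialized"; it only ensures otherwise-identical boxes diverge
# so they don't generate identical keys.
{
  date +%s%N                                # clock nanoseconds
  cat /sys/class/net/*/address 2>/dev/null  # MAC addresses, if present
} > /dev/urandom
echo "seed material mixed"
```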

> getrandom is just standard /dev/urandom + waiting for it to be properly seeded

Which you can do by reading a byte from /dev/random.

> You are proposing with Denys to cat /proc/<some files> > /dev/urandom
> that is what i call "rolling your own crypto", you are not sure it's
> safe but let's do it

No, that was an example.

You're proposing a solution. The problem your solution supposedly fixes
keeps changing. This is not encouraging.

You have a hammer and everything looks like a nail, but the API was
added for a specific package ported from another operating system.
You're not working within a shared library trying to mitigate potential
resource exhaustion attacks and determining when to fall back to a
software random number generator.

>>> Please also reread Bastian Bittorf, in his experience on 100 identical
>>> routers you have at least 2 identical ssh-key !
>>
>> Which is why you mix in the mac address even though it's externally
>> visible. (The point of the hashing step is that mixing in known data
>> doesn't give you a known result as long as there's _some_ unknown data
>> in there. That at least makes the result unique and avoids collisions.)
> 
> between 2 identical routers the mac might only have 1 bits of difference,

Which, run through any hashing function worth its salt, will change the
entire output.

$ echo -ne '\00' | sha1sum
5ba93c9db0cff93f52b521d7420e43f6eda2784f  -
$ echo -ne '\01' | sha1sum
bf8b4530d8d246dd74ac53a13471bba17941dff7  -

You were complaining about collisions. Unpredictability by bad guys with
trillion dollar budgets is a separate issue.

> i want 128, non externally visible, i want to be crypto safe, without
> loosing 30sec waiting for /dev/random

You're aware that "crypto safe" is a l33t script kiddie term, right? It
is not a term of art among cryptographers that I am aware of (except for
determination of which semi-primes to use).

If you're going to dig into how crypto actually works (ignoring timing
and side channel attacks for the moment) you get into stuff like:

  https://lwn.net/Articles/687494/

Where the important thing is to understand what the algorithm _does_.
Speaking of which, the best explanation of public key crypto I've found
so far is:

  http://www.muppetlabs.com/~breadbox/txt/rsa.html

>> Although on _top_ of that you really want to see a unique random block
>> on each machine (which is a manufacturing issue), and if you can't do
>> that then you can't give the machine a unique ssh host key either (yes
>> I've dealt with this manufacturing issue at more than one company).
> 
> I can now patch the bin generating the ssh keys to use getrandom, or just call
> getrandom 1 > /dev/null
> before I call it and i'm safe

If by "safe" you mean "waiting over 2 minutes".

>>
>> AND if you have _any_ persistent storage you can save previous entropy
>> and feed it back in on next boot (keeping in mind that routers are
>> _never_ shut down properly and you don't want them to be brickable, but
>> you can reserve a small SD card partition or similar to update once the
>> entropy pool reports full and then daily after that. _IF_ this issue is
>> considered important enough, which depends on how much input engineers
>> have into the design vs management and marketing and legal.)
> 
> I just pushed that into LEDE (OpenWrt)

I first heard about that idea back in the 90's, which was almost 20
years ago. You're saying it wasn't doing that before now?

> save a seed on startup once /dev/urandom is properly initialized
> (using getrandom())
> not on shutdown because it's unreliable, not every day because it's useless

Depends on the attack profile you think you're defending against. :)

>>
>> That said, the fundamental problem of routers having limited entropy
>> sources is not going to be solved by ANY software. Your function call
>> api and /dev/random have the exact same problem there so it's a red
>> herring. Initializing the pool from locally unique sources and
>> preserving accumulated entropy across reboots are mitigation strategies,
>> not solutions to the fundamental problem.
> 
> Your point is that /dev/random is broken? please report your findings 
> upstream.
> 
> Entropy collection is slower on embeded devices, so you have to wait for it, 
> and
> getrandom() claim to do exactly that

By adding 3 minutes to the boot time?

Look, I'm tired of this. It's not my call if this goes into busybox,
it's Denys's. I was just reading along to see if it was a thing I should
do in toybox. But I don't see what the point is of adding it to either
package. If you want this command it's 5 lines of code and you can
compile it yourself. I do not see what purpose it serves.

I'm going to start skimming now.

>> What is your use case? Why are you bothering to do this?
>>
>> It's entirely possible your new approach is superior, but you have not
>> successfully articulated _why_ yet. Would you like to try again?
> 
> Of course :)
> After heartbleed openbsd people forked openssl in libressl, and
> started to clean things up,
> they saw that there was no good way to get entropy in the linux world
> (file descriptor exhaution)

That's like saying there's no good way to get _input_ in the linux
world, due to file descriptor exhaustion. No good way to get data from
disk! No good way to get data from the network! Woe is us! Calamity!
It's impossible to implement ps because it needs /proc to be mounted!

Seriously?

If your function allocates memory, it should be able to return failure
if it couldn't. If your function opens a file, it should be able to
return failure if it couldn't. Why is this a hard concept?
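In C that's just checking a return value. A minimal sketch of the shape
the LWN rationale is arguing about (names are mine, this is not
LibreSSL's actual fallback code):

```c
#include <fcntl.h>
#include <unistd.h>

/* Read n random bytes from /dev/urandom, reporting failure to the
 * caller (e.g. open() failing with EMFILE under fd exhaustion)
 * instead of silently falling back to weaker userspace code. */
int get_random_bytes(void *buf, size_t n)
{
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0) return -1;              /* open failed: tell the caller */
    ssize_t got = read(fd, buf, n);
    close(fd);
    return got == (ssize_t)n ? 0 : -1;  /* short read is also a failure */
}
```

The caller then decides whether to retry, abort, or refuse to generate a
key at all, rather than quietly using untested fallback code.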

>, so they asked to something similar than
> the getentropy() bsd syscall, and getrandom() was born.

BSD had a thing. This package was written and maintained by OpenBSD.
Therefore they wanted Linux to have that thing to avoid having to change
their code.

> Theodore Ts'o while at it added the possibility to block until
> /dev/urandom as been seeded with 128bits of entropy, specifically
> targetting embeded systems
> https://lwn.net/Articles/605828/

Yes, I read it. I am not convinced. I tried it and got almost 3 minutes
of blocking. (Reading 1 byte from /dev/random with dd gave me 6 minutes
of blocking, but I suspect the wget initialized the random state a bit.
Busybox dd was already in the image so I didn't need to wget it...)

Maybe Denys is convinced, go ask him. I'm moving on.

Rob
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox
