Re: rnd entropy estimate running low?
> Date: Tue, 31 Jan 2017 12:00:29 -0500
> From: Thor Lancelot Simon>
>
> Maybe we should alert in a more sophisticated way -- monitor the failure
> rate and alert only if it is significantly above the expected rate.
>
> I might even remember enough statistics to do that.

For production, if you believe rngtest is correctly implemented to have
false rejection rate 3/10000 on the null hypothesis of uniform random
source data (I assume this means an average of three failures in every
ten thousand trials, where each trial tests a 20000-bit block of data as
specified in src/sys/sys/rngtest.h?), you might as well just tweak it so
it has a much lower false rejection rate.

However, if you are not sure rngtest as implemented has the intended
false rejection rate alpha = 3/10000, and you want to test it
empirically, you could run it on what you know to be a uniform random
source and stochastically test the probabilistic program that rngtest
is, along the lines of

Alexey Radul, `On Testing Probabilistic Programs', Alexey Radul's blog,
April 29, 2016.
http://alexey.radul.name/ideas/2016/on-testing-probabilistic-programs/

A test of the null hypothesis alpha <= 3/10000 that has false rejection
rate 5% and true rejection rate 80% under the alternative hypothesis
that alpha >= 4/10000 (typical choices for statistical significance and
statistical power in undergrad statistics and run-of-the-mill scientific
publications) requires 213 294 trials, according to Alexey's script
there.

Depending on how fast rngtest runs, that may not be terrible for
automatic tests -- my laptop's CPU can generate enough source data with
software AES-CTR in 3 seconds -- except that if we put it into the atf
tests, on average one in twenty runs would spuriously fail, which means
we'd see spurious failures every day or two. So we really want the
false rejection rate to be closer to .001% -- one in a hundred thousand,
or about three in a century at our current rate of releng atf runs --
which requires 885 968 trials.

If we want true rejection rate 80% under the stronger alternative
hypothesis that alpha >= 3.1/10000, it requires 72 282 164 trials, which
is quite excessive on my laptop's CPU -- 10-15 minutes just to generate
the source data -- and is doubtless totally unacceptable on slower
hardware.
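For anyone who wants to reproduce the flavour of these trial counts without
digging up the script: a rough normal-approximation version of the same
calculation might look like the sketch below. (Alexey's script does the
exact computation, so his numbers come out somewhat higher; the function
name and the approximation here are mine, not rngtest's.)

```python
# Normal-approximation sample size for a one-sided test of a Bernoulli
# failure rate: H0: alpha <= p0 vs H1: alpha >= p1.  A sketch only --
# the exact computation gives slightly larger numbers.
from math import ceil, sqrt
from statistics import NormalDist

def trials_needed(p0, p1, significance, power):
    z_a = NormalDist().inv_cdf(1 - significance)  # critical value under H0
    z_b = NormalDist().inv_cdf(power)             # power requirement under H1
    num = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1))
    return ceil((num / (p1 - p0)) ** 2)

print(trials_needed(3e-4, 4e-4, 0.05, 0.80))    # ~2.1e5 (exact: 213 294)
print(trials_needed(3e-4, 4e-4, 1e-5, 0.80))    # ~8.2e5 (exact: 885 968)
print(trials_needed(3e-4, 3.1e-4, 1e-5, 0.80))  # ~7.9e7 (exact: 72 282 164)
```

The shape of the result is the point: tightening either the false
rejection rate or the alternative hypothesis blows up the required
number of trials quadratically in 1/(p1 - p0).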
Re: rnd entropy estimate running low?
On Tue, Jan 31, 2017 at 05:54:37PM +0100, Martin Husemann wrote:
> On Tue, Jan 31, 2017 at 11:45:55AM -0500, Thor Lancelot Simon wrote:
> > The only time we've ever really dug into it, I believe, the user decided
> > the failures were right around the expected failure rate. Can you help
> > gather more data?
>
> Good point, and I am not sure this might cover my case as well - will have
> a look and start gathering better data.

Hmm, for a simple start:

Jan 31 16:00:16 night-owl /netbsd: Kernel RNG "7517 14 4" monobit test FAILURE: 9709 ones
Jan 31 16:00:16 night-owl /netbsd: cprng 7517 14 4: failed statistical RNG test
Jan 31 20:20:12 night-owl /netbsd: Kernel RNG "11702 2 8" runs test FAILURE: too many runs of 4 0s (395 >= 384)
Jan 31 20:20:12 night-owl /netbsd: cprng 11702 2 8: failed statistical RNG test
Jan 31 20:20:25 night-owl /netbsd: Kernel RNG "16549 4 9" runs test FAILURE: too few runs of 2 0s (1107 <= 1114)
Jan 31 20:20:25 night-owl /netbsd: cprng 16549 4 9: failed statistical RNG test
Jan 31 21:21:05 night-owl /netbsd: Kernel RNG "7429 2 9" poker test failure: parameter X = 2.8640
Jan 31 21:21:05 night-owl /netbsd: cprng 7429 2 9: failed statistical RNG test
Jan 31 21:21:47 night-owl /netbsd: Kernel RNG "17166 149 10" long run test FAILURE: Run of 26 0s found
Jan 31 21:21:47 night-owl /netbsd: cprng 17166 149 10: failed statistical RNG test

This is my amd64 notebook while doing a few ssh sessions and a pkg_chk
rebuild. Nothing involved that really (AFAIU) needs serious amounts of
entropy.

Is there a simple way to get usage statistics?

Martin
Re: rnd entropy estimate running low?
On Tue, Jan 31, 2017 at 05:54:37PM +0100, Martin Husemann wrote:
> On Tue, Jan 31, 2017 at 11:45:55AM -0500, Thor Lancelot Simon wrote:
> > The only time we've ever really dug into it, I believe, the user decided
> > the failures were right around the expected failure rate. Can you help
> > gather more data?
>
> Good point, and I am not sure this might cover my case as well - will have
> a look and start gathering better data.

Maybe we should alert in a more sophisticated way -- monitor the failure
rate and alert only if it is significantly above the expected rate.

I might even remember enough statistics to do that.

-- 
Thor Lancelot Simon                                       t...@panix.com
Ring the bells that still can ring.
Re: rnd entropy estimate running low?
> Date: Tue, 31 Jan 2017 16:55:38 +0000
> From: Taylor R Campbell>
>
> This is roughly to be expected from any stochastic test that has
> nonzero false positive rate. I have not computed exactly what the
> false rejection rate is under the null hypothesis of uniform random
> bits for these tests. Someone^TM should do that!
>
> (These are all classical frequentist hypothesis tests, mostly of
> elementary chi^2, Binomial, etc., statistics on streams of ones and
> zeros. If anyone wants a little probability theory and statistics
> exercise, I'd be happy to point you in the right direction for how to
> do this.)

Well, apparently Thor already did this and came up with 3 false
rejections for every 10000...bits? 32-bit words? or something, so never
mind!
Re: rnd entropy estimate running low?
> Date: Tue, 31 Jan 2017 17:16:33 +0100 (CET)
> From: Havard Eidnes>
>
> rnd: WARNING! initial entropy low (0).
> rnd: entropy estimate 0 bits
> rnd: asking source callout for 512 bytes
> rnd: system-power attached as an entropy source (collecting)
> mainbus0 (root)
> cpu0 at mainbus0 core 0: 1536 MHz Cortex-A5 r0p1 (Cortex V7A core)
> ...
>
> I'm assuming this is because this happens too early, the rng
> device hasn't been detected so early in the boot process, and
> there's no file system accessible either to re-initialize the
> kernel rng from either at this stage, and the boot loader doesn't
> have a way to work around this.

The boot loader on x86, at least, can read a seed from the file system
(/var/db/entropy-file) before the kernel even starts, and that should be
fed in quite early.

> (This is more a platform-specific problem, I think, and
> tangential to what I discussed initially.)

Right, but it's an important one! It is only in the system engineering
that you can get the entropy initialization correct -- no amount of
software can locally massage the inputs it has into a high-entropy
state, if the inputs are no good. So for this platform we should try to
see when the HWRNG device attaches and whether anything needs to use
entropy before then.

> OK, I'll buy the crypto argument at face value. However, our code
> still behaves differently depending on whether the entropy estimate
> is able to "satisfy" the request being processed or not. So under
> this description that is also a holdover from older versions of this
> code?

Yes. There are two useful functions for blocking on reads from
/dev/random:

1. Waiting for initial entropy after the system is booted, which may
   mean, e.g., waiting until the on-board HWRNG device has provided
   enough data.

2. Exercising the blocking code paths in applications, which would
   otherwise occur only sometimes at system boot.

My draft rewrite of the entropy pool that I think you saw at the
devsummit at EuroBSDcon in Stockholm changes the decision of when to
block: if not enough entropy is available, block until it is; if enough
entropy is available, and a coin toss comes up heads, block for up to a
second.

> It may be coincidental, but this box when it sits otherwise
> mostly idle and only does ntp for a long while sometimes logs
>
> Kernel RNG "231 0 1" monobit test FAILURE: 10300 ones
> cprng 231 0 1: failed statistical RNG test
> ...
>
> Admittedly, these are spread over a larger time period, and a
> couple of them were the result of provocation by dumping data
> from /dev/random with dd.

This is roughly to be expected from any stochastic test that has nonzero
false positive rate. I have not computed exactly what the false
rejection rate is under the null hypothesis of uniform random bits for
these tests. Someone^TM should do that!

(These are all classical frequentist hypothesis tests, mostly of
elementary chi^2, Binomial, etc., statistics on streams of ones and
zeros. If anyone wants a little probability theory and statistics
exercise, I'd be happy to point you in the right direction for how to do
this.)

However, if it happens repeatedly over a short period of time, you
should be concerned that something is hosed in your kernel or HWRNG.
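As a worked instance of that exercise, here is a sketch of the monobit
test, assuming it follows the published FIPS 140-2 form (count the ones
in a 20000-bit block, fail outside 9725 < ones < 10275 -- bounds
consistent with the logged failures at 9709 and 10300 ones, though
NetBSD's exact constants live in src/sys/sys/rngtest.h), together with a
normal-approximation estimate of its false rejection rate under the null
hypothesis:

```python
# Sketch of a FIPS 140-2-style monobit test on a 20000-bit (2500-byte)
# block, plus a back-of-the-envelope false rejection rate under the
# null hypothesis of uniform random bits.  Bounds are the published
# FIPS 140-2 ones, assumed here; NetBSD's constants are in rngtest.h.
from math import sqrt
from statistics import NormalDist
import os

def monobit_ok(block):
    assert len(block) == 2500          # 20000 bits
    ones = sum(bin(b).count("1") for b in block)
    return 9725 < ones < 10275, ones

# Under H0, ones ~ Binomial(20000, 1/2) ~ Normal(10000, sqrt(5000)).
# False rejection = P(ones <= 9725) + P(ones >= 10275), with a
# continuity correction for the normal approximation.
approx = NormalDist(10000, sqrt(5000))
p_false = approx.cdf(9725.5) + (1 - approx.cdf(10274.5))
print(p_false)   # on the order of 1e-4, i.e. the same ballpark as 3/10000

ok, ones = monobit_ok(os.urandom(2500))
```

So the monobit test alone contributes a false rejection rate of roughly
one in ten thousand blocks; summing such estimates over all the tests in
the battery is exactly the computation Someone^TM should do.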
Re: rnd entropy estimate running low?
On Tue, Jan 31, 2017 at 11:45:55AM -0500, Thor Lancelot Simon wrote:
> The only time we've ever really dug into it, I believe, the user decided
> the failures were right around the expected failure rate. Can you help
> gather more data?

Good point, and I am not sure this might cover my case as well - will
have a look and start gathering better data.

Martin
Re: rnd entropy estimate running low?
On Tue, Jan 31, 2017 at 05:40:01PM +0100, Martin Husemann wrote:
> On Tue, Jan 31, 2017 at 11:38:02AM -0500, Thor Lancelot Simon wrote:
> > The statistical failures later in system run might indicate a memory
> > integrity issue, a race condition of some kind, or just be the expected
> > roughly 3/10000 random occurrences. Hard to say without more information.
>
> I see statistical test failures on various hardware during later system run
> a lot. Others have stated the same, IIRC.

The only time we've ever really dug into it, I believe, the user decided
the failures were right around the expected failure rate. Can you help
gather more data?

-- 
Thor Lancelot Simon                                       t...@panix.com
Ring the bells that still can ring.
Re: rnd entropy estimate running low?
On Tue, Jan 31, 2017 at 11:38:02AM -0500, Thor Lancelot Simon wrote:
> The statistical failures later in system run might indicate a memory
> integrity issue, a race condition of some kind, or just be the expected
> roughly 3/10000 random occurrences. Hard to say without more information.

I see statistical test failures on various hardware during later system
run a lot. Others have stated the same, IIRC.

Martin
Re: rnd entropy estimate running low?
On Tue, Jan 31, 2017 at 05:16:33PM +0100, Havard Eidnes wrote:
> >> Meanwhile the hardware random generator sits there unused.
> >
> > Does it sit there completely unused, or did it get used a little at
> > boot time?
>
> It generated some bits at boot time, but apparently not early
> enough, because on each reboot the kernel log looks like this:

It looks like nothing's actually calling for bits except the start-up
statistical test (which itself creates demand) before the hardware RNG
attaches, so there shouldn't be a practical problem. The question is,
could the hardware RNG attach earlier or the statistical test happen
later -- or doesn't it matter?

The statistical failures later in system run might indicate a memory
integrity issue, a race condition of some kind, or just be the expected
roughly 3/10000 random occurrences. Hard to say without more
information.

Thor
Re: rnd entropy estimate running low?
>> Meanwhile the hardware random generator sits there unused.
>
> Does it sit there completely unused, or did it get used a little at
> boot time?

It generated some bits at boot time, but apparently not early enough,
because on each reboot the kernel log looks like this:

...
total memory = 1024 MB
avail memory = 1007 MB
sysctl_createv: sysctl_create(machine_arch) returned 17
rnd: callout attached as an entropy source (collecting)
rnd: initialised (4096) with counter
rnd: printf attached as an entropy source (collecting without estimation)
rnd: autoconf attached as an entropy source (collecting)
rnd: WARNING! initial entropy low (5).
rnd: starting statistical RNG test, entropy = 6.
rnd: statistical RNG test done, entropy = 6.
rnd: entropy estimate 0 bits
rnd: asking source callout for 512 bytes
rnd: WARNING! initial entropy low (0).
rnd: entropy estimate 0 bits
rnd: asking source callout for 512 bytes
rnd: system-power attached as an entropy source (collecting)
mainbus0 (root)
cpu0 at mainbus0 core 0: 1536 MHz Cortex-A5 r0p1 (Cortex V7A core)
...

I'm assuming this is because this happens too early: the rng device
hasn't been detected so early in the boot process, there's no file
system accessible to re-initialize the kernel rng from at this stage,
and the boot loader doesn't have a way to work around this.

(This is more a platform-specific problem, I think, and tangential to
what I discussed initially.)

>> I would have thought it would make more sense to keep the "bits
>> currently stored in pool" more "topped up", and that a re-fill
>> could with benefit be done before the estimate crept down towards
>> zero? Especially if you have a half-way decent hardware random
>> generator at hand?
>
> Actually, no. One basic conceit of modern symmetric cryptography is
> that from a single small uniform random 256-bit secret, you can derive
> an arbitrarily large uniform random secret. `Entropy depletion' does
> not really exist as a meaningful concept in modern cryptography.
>
> The entropy accounting that we currently do is a holdover from days of
> yore when the folklore supported it, but the natural information-
> theoretic interpretation of the folklore actually leads to worse
> attacks in practice -- see the rnd(4) man page for details. So while
> we haven't gotten rid of the kooky accounting, it doesn't really mean
> anything to see the numbers go down.
>
> There is a limit to the output produced by, e.g., AES-CTR, arising
> from the PRP approximation to a PRF and the birthday paradox, and
> there are some US federal government standards (NIST SP800-90A, in
> particular) about PRNG constructions that Thor wanted to make it easy
> to follow, which is why we rekey cprng(9) after a relatively small
> amount of output -- but that happens much slower than the entropy
> accounting you're looking at, and is not reported to userland.

OK, I'll buy the crypto argument at face value. However, our code still
behaves differently depending on whether the entropy estimate is able to
"satisfy" the request being processed or not. So under this description
that is also a holdover from older versions of this code?

It may be coincidental, but this box when it sits otherwise mostly idle
and only does ntp for a long while sometimes logs

Kernel RNG "231 0 1" monobit test FAILURE: 10300 ones
cprng 231 0 1: failed statistical RNG test
...
Kernel RNG "15965 0 4" runs test FAILURE: too many runs of 4 1s (386 >= 384)
cprng 15965 0 4: failed statistical RNG test
...
Kernel RNG "27778 0 3" poker test failure: parameter X = 2.9280
cprng 27778 0 3: failed statistical RNG test
...
Kernel RNG "6647 0 3" poker test failure: parameter X = 47.2720
cprng 6647 0 3: failed statistical RNG test
...
Kernel RNG "24153 0 3" long run test FAILURE: Run of 29 0s found
cprng 24153 0 3: failed statistical RNG test
...
Kernel RNG "2551 0 4" poker test failure: parameter X = 47.60320
cprng 2551 0 4: failed statistical RNG test
...

Admittedly, these are spread over a larger time period, and a couple of
them were the result of provocation by dumping data from /dev/random
with dd.

Regards,

- Håvard
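For reference, the poker `parameter X' in these log lines is, in the
published FIPS 140 form, a chi^2-style statistic over the 5000 4-bit
nibbles of a 20000-bit block. A sketch of just the statistic follows
(the pass bounds differ between FIPS 140-1 and 140-2, and NetBSD's exact
constants are in rngtest.h, so the comparison against bounds is left
out):

```python
# Sketch of the FIPS 140 poker test statistic: split a 20000-bit block
# into 5000 4-bit nibbles, tally the 16 nibble frequencies f_i, and
# compute X = (16/5000) * sum(f_i^2) - 5000.  Pass bounds vary between
# FIPS 140-1 and 140-2, so only the statistic itself is computed here.
import os

def poker_x(block):
    assert len(block) == 2500          # 5000 nibbles
    freq = [0] * 16
    for b in block:
        freq[b >> 4] += 1
        freq[b & 0x0F] += 1
    return (16 / 5000) * sum(f * f for f in freq) - 5000

print(poker_x(bytes(2500)))      # degenerate all-zero block: X = 75000.0
print(poker_x(os.urandom(2500))) # uniform block: X is small, mean ~15
```

Under the null hypothesis X is approximately chi^2 with 15 degrees of
freedom, which is why the logged failing values sit just outside a
narrow band around 15.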
Re: rnd entropy estimate running low?
> Date: Thu, 12 Jan 2017 21:13:03 +0100 (CET)
> From: Havard Eidnes>
>
> Meanwhile the hardware random generator sits there unused.

Does it sit there completely unused, or did it get used a little at boot
time? That's the most important time to use it; otherwise it doesn't
really matter, unless you somehow know an attacker has witnessed the
state of the kernel entropy pool, but otherwise expect the system to be
uncompromised.

> I would have thought it would make more sense to keep the "bits
> currently stored in pool" more "topped up", and that a re-fill
> could with benefit be done before the estimate crept down towards
> zero? Especially if you have a half-way decent hardware random
> generator at hand?

Actually, no. One basic conceit of modern symmetric cryptography is
that from a single small uniform random 256-bit secret, you can derive
an arbitrarily large uniform random secret. `Entropy depletion' does
not really exist as a meaningful concept in modern cryptography.

The entropy accounting that we currently do is a holdover from days of
yore when the folklore supported it, but the natural information-
theoretic interpretation of the folklore actually leads to worse attacks
in practice -- see the rnd(4) man page for details. So while we haven't
gotten rid of the kooky accounting, it doesn't really mean anything to
see the numbers go down.

There is a limit to the output produced by, e.g., AES-CTR, arising from
the PRP approximation to a PRF and the birthday paradox, and there are
some US federal government standards (NIST SP800-90A, in particular)
about PRNG constructions that Thor wanted to make it easy to follow,
which is why we rekey cprng(9) after a relatively small amount of output
-- but that happens much slower than the entropy accounting you're
looking at, and is not reported to userland.
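The `arbitrarily large secret from one 256-bit seed' point is easy to
demonstrate outside the kernel. A toy illustration using the SHAKE-256
XOF as the expander -- emphatically not the kernel's cprng(9)
construction, just the general idea:

```python
# Toy illustration of deriving an arbitrarily large pseudorandom secret
# from a single 256-bit seed, using the SHAKE-256 XOF as the expander.
# This is NOT the cprng(9) construction -- just the general principle
# that output length is not limited by seed length.
import hashlib
import os

seed = os.urandom(32)                        # one 256-bit secret
a = hashlib.shake_256(seed).digest(1 << 20)  # 1 MiB of derived output
b = hashlib.shake_256(seed).digest(1 << 20)

assert a == b             # deterministic given the seed
assert len(a) == 1 << 20  # and as long as you like
```

To an attacker who does not know the seed, the derived stream is
computationally indistinguishable from uniform, which is exactly why
counting down the pool estimate as output is drawn buys nothing.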
rnd entropy estimate running low?
Hi,

on a couple of arm boxes I have I've been observing the development of
the entropy estimate, what "rndctl -s" calls "bits currently stored in
pool", over time. I've also tried to read some of the code to
understand the behaviour.

If I understand correctly, randomness sources come in two basic
flavours: those which offer up randomness samples based on (possibly
external) events, and those which only provide samples when "asked" to
do so. The hardware randomness generator on my amlogic arm boards
appears to fall into the last category.

On a system with little other active randomness sources (e.g. FS
activity or keyboard / mouse activity), it appears that the "bits
currently stored in pool" will be allowed to decrease quite close to
zero (or even *to* zero) before the polled sources are queried, via
e.g. rnd_extract() only triggering a rnd_getmore() if it could not
initially fulfill the request. The same also appears to hold for
rnd_tryextract(). Meanwhile the hardware random generator sits there
unused.

I would have thought it would make more sense to keep the "bits
currently stored in pool" more "topped up", and that a re-fill could
with benefit be done before the estimate crept down towards zero?
Especially if you have a half-way decent hardware random generator at
hand?

(This has been observed with both 7.99.47 and 7.99.58, fwiw.)

Regards,

- Håvard
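To make the suggestion concrete, what I have in mind is a low-watermark
policy: query the polled sources as soon as the estimate drops below
some threshold, rather than only when an extraction cannot be satisfied.
A toy model of that policy follows; all names and constants are mine for
illustration, not rnd(9)'s:

```python
# Toy model of a "topped-up" entropy pool with a low watermark: polled
# sources are asked for more bits as soon as the estimate falls below
# the watermark, instead of only after a failed extraction.  Names and
# constants are illustrative only, not taken from rnd(9).
import os

class Pool:
    def __init__(self, capacity_bits=4096, low_watermark_bits=1024):
        self.capacity = capacity_bits
        self.low = low_watermark_bits
        self.bits = 0

    def add(self, nbits):
        self.bits = min(self.capacity, self.bits + nbits)

    def poll_hwrng(self):
        # Stand-in for querying a polled source such as the amlogic
        # HWRNG; each query yields 64 bytes = 512 bits here.
        self.add(len(os.urandom(64)) * 8)

    def extract(self, nbits):
        if self.bits < self.low:       # refill *before* we run dry
            self.poll_hwrng()
        taken = min(nbits, self.bits)
        self.bits -= taken
        return taken

pool = Pool()
pool.poll_hwrng()
pool.extract(256)
print(pool.bits)
```

The current code corresponds to dropping the watermark check and polling
only when `taken < nbits`; the difference is just where in extract() the
rnd_getmore()-equivalent is triggered.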