> Date: Tue, 31 Jan 2017 12:00:29 -0500
> From: Thor Lancelot Simon <t...@panix.com>
>
> Maybe we should alert in a more sophisticated way -- monitor the failure
> rate and alert only if it is significantly above the expected rate.
>
> I might even remember enough statistics to do that.
For production, if you believe rngtest is correctly implemented to have false rejection rate 3/10000 on the null hypothesis of uniform random source data (I assume this means an average of three failures in every ten thousand trials, where each trial tests a 20000-bit block of data as specified in src/sys/sys/rngtest.h?), you might as well just tweak it to have a much lower false rejection rate.

However, if you are not sure rngtest as implemented has the intended false rejection rate alpha = 3/10000, and you want to test it empirically, you could run it on what you know to be a uniform random source and stochastically test the probabilistic program that rngtest is, along the lines of:

Alexey Radul, `On Testing Probabilistic Programs', Alexey Radul's blog, April 29, 2016.
http://alexey.radul.name/ideas/2016/on-testing-probabilistic-programs/

A test of the null hypothesis alpha <= 3/10000 that has false rejection rate 5% and true rejection rate 80% under the alternative hypothesis alpha >= 4/10000 (typical choices for statistical significance and statistical power in undergrad statistics and run-of-the-mill scientific publications) requires 213 294 trials, according to Alexey's script there.

Depending on how fast rngtest runs, that may not be terrible for automatic tests -- my laptop's CPU can generate enough source data with software AES-CTR in 3 seconds -- except that if we put it into the atf tests, on average one in twenty runs would spuriously fail, which means we'd see spurious failures every day or two. So we really want the false rejection rate to be closer to .001% -- one in a hundred thousand, or about three in a century at our current rate of releng atf runs -- which requires 885 968 trials.
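The trial counts quoted above can be ballparked with the standard normal-approximation sample-size formula for a one-sided test of a binomial proportion. This is only a sketch under that approximation -- Alexey's script does the exact computation, so its counts differ somewhat from what this prints:

```python
import math
from statistics import NormalDist  # Python 3.8+

def trials_needed(p0, p1, alpha=0.05, power=0.80):
    """Approximate number of trials to distinguish failure rate p0 from p1 > p0.

    One-sided binomial test, normal approximation: reject H0 (rate <= p0)
    with false rejection probability alpha, while detecting true rate p1
    with probability `power`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # e.g. ~1.645 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)       # e.g. ~0.842 for power = 0.80
    num = (z_alpha * math.sqrt(p0 * (1 - p0))
           + z_beta * math.sqrt(p1 * (1 - p1)))
    return math.ceil((num / (p1 - p0)) ** 2)

# alpha = 5%, power = 80%: same order as the 213 294 figure above
print(trials_needed(3e-4, 4e-4))
# false rejection rate 1e-5: same order as the 885 968 figure above
print(trials_needed(3e-4, 4e-4, alpha=1e-5))
```

The formula gives roughly 205 000 and 820 000 trials for the two cases, in the right ballpark of the exact figures.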
If we want true rejection rate 80% under the nearer alternative hypothesis alpha >= 3.1/10000 (still with false rejection rate .001%), it requires 72 282 164 trials, which is quite excessive on my laptop's CPU -- 10-15 minutes just to generate the source data -- and is doubtless totally unacceptable on slower hardware.
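For concreteness, the empirical procedure itself can be sketched as below. The trial here is a hypothetical stand-in for feeding rngtest a 20000-bit block of known-uniform data, simulated as a coin flip that fails with probability p, and the rejection threshold uses the same normal approximation as above rather than Alexey's exact procedure:

```python
import math
import random
from statistics import NormalDist  # Python 3.8+

def stochastic_test(run_trial, n, p0=3e-4, alpha=1e-5):
    """Run `run_trial` n times; reject H0 (failure rate <= p0) if the
    observed failure count exceeds a one-sided normal-approximation
    threshold with false rejection probability alpha."""
    failures = sum(run_trial() for _ in range(n))
    z = NormalDist().inv_cdf(1 - alpha)
    threshold = n * p0 + z * math.sqrt(n * p0 * (1 - p0))
    return failures > threshold, failures

# Hypothetical stand-in for one rngtest run on a 20000-bit block of
# uniform data: fails (rejects the block) with probability p.
def fake_trial(p):
    return random.random() < p

random.seed(0)
# A source whose per-trial failure rate is far above 3/10000 gets flagged:
# ~2100 expected failures against a threshold near 100.
reject, count = stochastic_test(lambda: fake_trial(1e-2), 213_294)
print(reject)  # True
```

Plugging the real rngtest in for fake_trial (and the trial counts from above for n) would turn this into the empirical check of the implemented false rejection rate.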