On Mon, Jun 18, 2012 at 7:12 PM, Marsh Ray <ma...@extendedsubset.com> wrote:
> On 06/18/2012 12:20 PM, Jon Callas wrote:
>>
>> A company makes a cryptographic widget that is inherently hard to
>> test or validate. They hire a respected outside firm to do a review.
>> What's wrong with that? I recommend that everyone do that.
>> Un-reviewed crypto is a bane.
>
> Let's accept that the review was competent, thorough, and independent.
>
> Here's what I'm left wondering:
>
> How do I know that this circuit that was reviewed is actually the thing
> producing the random numbers on my chip? Why should I assume it doesn't
> have any backdoors, bugdoors, or "engineering revisions" that make it
> different from what was reviewed?
>
> Is RDRAND driven by reprogrammable microcode? If not, how are they going
> to address bugs in it? If so, what algorithms are used by my CPU to
> authenticate the microcode updates that can be loaded? What kind of
> processes are used to manage the signing keys for it?
>
> Let's take a look at the actual report:
> http://www.cryptography.com/public/pdf/Intel_TRNG_Report_20120312.pdf
>
>> Page 12: At an 800 MHz clock rate, the RNG can deliver
>> post-processed random data at a sustained rate of 800 MBytes/sec. In
>> particular, it should not be possible for a malicious process to
>> starve another process.
>
> Wait a minute... that second statement doesn't follow from the first.
>
> We're talking about chips with a 25 GB/s *external* memory bus bandwidth,
> why can't a 4-core 2 GHz processor request on the order of 64 bits per
> core*clock for 4*64*2e9 = 512e9 b/s = 60 GiB/s ?
>
> So 800 MiB/s @ 800 MHz
> = 7.62 clocks per 64 bit RDRAND result (if they mean 1 core)
> = 30.5 clocks per 64 bit RDRAND (if they mean 4 cores).
> = 61.0 clocks per 64 bit RDRAND (if they mean 8 hyperthread cores).
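The clock counts quoted above can be rechecked with a few lines of Python. This is only a sketch: it assumes, as the quoted message does, that the report's "800 MBytes/sec" means 800 MiB/s and that each RDRAND result is 64 bits. The function name and constants are made up for illustration.

```python
# Recheck the clocks-per-RDRAND arithmetic from the quoted message.
# Assumption: "800 MBytes/sec" is read as 800 MiB/s, 64 bits per result.

MIB = 2**20
GIB = 2**30


def clocks_per_rdrand(bytes_per_sec, clock_hz, cores=1, result_bytes=8):
    """Core*clock cycles that elapse per RDRAND result at a given throughput."""
    results_per_sec = bytes_per_sec / result_bytes
    return cores * clock_hz / results_per_sec


# Hypothetical ceiling from the message: 4 cores at 2 GHz, 64 bits per clock.
peak_bits_per_sec = 4 * 64 * 2e9
peak_gib_per_sec = peak_bits_per_sec / 8 / GIB

for cores in (1, 4, 8):
    clocks = clocks_per_rdrand(800 * MIB, 800e6, cores)
    print(f"{cores} core(s): {clocks:.2f} clocks per 64-bit RDRAND")

print(f"hypothetical peak request rate: {peak_gib_per_sec:.1f} GiB/s")
```

Run as written, this gives about 7.6, 30.5, and 61 clocks for 1, 4, and 8 cores, and a peak of just under 60 GiB/s, which matches the figures quoted above to within rounding.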
>
> More info:
>>
>> http://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide/
>> Data taken from an early engineering sample board with a 3rd generation
>> Intel Core family processor,
>> code-named Ivy Bridge, quad core, 4 GB memory, hyper-threading enabled.
>> Software: LINUX* Fedora 14,
>> GCC version 4.6.0 (experimental) with RDRAND support, test uses p-threads
>> kernel API.
>
> Why does an Intel "Software Implementation Guide" have more information
> about the actual device under test than the formal report?
>
>> Measured Throughput:
>> Up to 70 million RDRAND invocations per second
>
> Implies:
> 4 cores @ 2 GHz -> 114 clocks per RDRAND
> 8 cores @ 2 GHz -> 229 clocks per RDRAND
>
>> 500+ million bytes of random data per second
>
> Implies:
> rate >= 62.5e6 RDRAND/s
> t(RDRAND) <= 16 ns
> <= 32 clocks, 1 core @ 2 GHz
> <= 256 core*clocks, 8 cores @ 2 GHz
>
>> RDRAND Response Time and Reseeding Frequency
>> ~150 clocks per invocation
>> Note: Varies with CPU clock frequency since constraint is shared data
>> path from DRNG to cores.
>> Little contention until 8 threads – or 4 threads on 2 core chip
>> Simple linear increase as additional threads are added
>
> So when the statement is given "it should not be possible for a malicious
> process to starve another process" without justification, it leads us to ask
> "why should it not be possible to starve another process?"
>
> Maybe this is our answer: Because this hardware instruction is slow!
>
> 150 clocks (Intel's figure) implies 18.75 clocks per byte.

Then perhaps this is the proper graph to examine:
http://bench.cr.yp.to/graph-sha3/8-thumb.png
Maybe this is the case RDRAND is intended for: producing random 8-byte seeds?

>
> It would appear that the instruction is actually a blocking operation that
> does not return until the request has been satisfied. Or does it?
>
> Can we then expect it will never result in an out-of-entropy condition?
>
> What happens when 16 cores are put on a chip? Will some future chip begin
> occasionally returning zeroes from RDRAND when an attacker fires off 31
> simultaneous threads requesting entropy? Or will RDRAND take 300 clocks to
> execute?
>
> Note that Skein 512 in pure software costs only about 6.25 clocks per byte.
> Three times faster! If RDRAND were entered in the SHA-3 contest, it would
> rank in the bottom third of the remaining contestants.
> http://bench.cr.yp.to/results-sha3.html
>
> So perhaps we should not throw away our software-stirred entropy pools just
> yet and if RDRAND is present it should be used to contribute 128 bits or so
> at a time as just one of several sources of entropy. It could certainly help
> to kickstart the software RNG in those critical first seconds after cold
> boot (you know, when the SSH keys are being generated).
>
> - Marsh
>
> _______________________________________________
> cryptography mailing list
> cryptography@randombit.net
> http://lists.randombit.net/mailman/listinfo/cryptography

-- 
Kyle Creyts

Information Assurance Professional
BSidesDetroit Organizer
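
The closing suggestion in the quoted message (feed RDRAND output into a software pool, about 128 bits at a time, as one source among several) could be sketched roughly as below. This is a toy illustration under loud assumptions, not a vetted CSPRNG design: the `EntropyPool` class and the `hardware_rng_bytes` helper are hypothetical names, and `os.urandom` merely stands in for RDRAND, since the Python standard library does not expose that instruction.

```python
import hashlib
import os
import time


class EntropyPool:
    """Toy hash-based pool; illustrative only, not a reviewed design."""

    def __init__(self):
        self._state = bytes(64)

    def mix(self, label: bytes, data: bytes) -> None:
        # Hash the old state together with the new input, so a bad source
        # should not be able to cancel out earlier input (assuming SHA-512
        # behaves as expected). Length prefixes keep fields unambiguous.
        h = hashlib.sha512()
        h.update(self._state)
        h.update(len(label).to_bytes(2, "big") + label)
        h.update(len(data).to_bytes(4, "big") + data)
        self._state = h.digest()

    def read(self, n: int) -> bytes:
        # Derive output blocks from the state and a counter, then ratchet
        # the state forward so the same output is not produced twice.
        out = b""
        counter = 0
        while len(out) < n:
            block = self._state + b"out" + counter.to_bytes(8, "big")
            out += hashlib.sha512(block).digest()
            counter += 1
        self._state = hashlib.sha512(self._state + b"next").digest()
        return out[:n]


def hardware_rng_bytes(n: int) -> bytes:
    """Hypothetical stand-in for RDRAND; os.urandom is used as a placeholder."""
    return os.urandom(n)


pool = EntropyPool()
pool.mix(b"hwrng", hardware_rng_bytes(16))  # 128 bits from the hardware source
pool.mix(b"timer", time.perf_counter_ns().to_bytes(8, "little"))  # another source
seed = pool.read(32)
print(len(seed), "bytes of seed material")
```

The point of the structure is that the hardware source is never used directly: its output only ever passes through the pool, so a weak or compromised source adds nothing but should not make the combined output worse than the remaining sources.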