Miscellaneous notes -- I'm doing the benchmarking we discussed and pondering whether it is right to switch from hc-128 to salsa20, separately.
On Thu, Apr 17, 2014 at 09:33:28PM +0000, Taylor R Campbell wrote: > > > +void > > +hc128_init(hc128_state_t *state, const uint8_t *key, const uint8_t *iv) > > +{ > > + unsigned int i; > > + uint32_t w[1280], *p = state->p, *q = state->q; > > 5 KB on the stack is a lot! Granted, this is a leaf routine which in > our case will be called only in a softint handler, but still. Fixed. > For the purposes of a PRNG, conversion to little-endian is not > necessary. Would be nice for hc128_extract to just return uint32_t, > too. I'll benchmark this on a hardware, BE system (somehow; I don't have one that's easy to get running again). Whatever cipher we use for this I would prefer to leave its core transform alone, for several reasons. > Forward declaration of cprng_fast_randrekey should go among the other > forward declarations, not mixed up among the global state. Fixed. > Can we put the creation of kern_cprng and the initialization of > cprng_fast into cprng_init? If not, there should be a comment > explaining why not. No -- it needs the kernel_cprng ready. I added a comment. > Why generate the IV randomly? Why not a counter? Does the IV have > any requirements other than that it never be reused with the same key, > i.e. is it anything other than a nonce? Why not generate it randomly? We hardly ever do it and extraction from the kernel_cprng is cheap compared to the other keying operations. > Are there actually any callers of cprng_fast at IPL_HIGH? Are there > actually any legitimate random decisions to be made at IPL_HIGH? I'm > sceptical. What do you get if you cross an elephant and a rhinocerous? Given that what we do within the spl* takes very little time I am inclined to say what spl we go to hardly matters, and be very conservative. The real question here, I think, is whether we should spl*() at all, or forbid use of cprng_fast() from interrupt context entirely. > > +size_t > > +_cprng_fast_exact(void *p, size_t len) > > +{ > > This should KASSERT(len <= CPRNG_MAX_LEN) or something so nobody is > tempted to generate obscene amounts of data at once (at IPL_HIGH or > IPL_VM or whatever it turns out to be). Fixed. > > + ctx->numbytes += len; > > Check for arithmetic overflow and saturate. Can't get even close -- the max bytes on key is constrained to 2^29 on entry to the function. I fixed _cprng_fast_inexact per your comments. Thor