Stephan Mueller <smueller <at> chronox.de> writes:

> And here the RNG theory breaks: a whitening function (crypto function) 
> like the used SHA1 does not add entropy. Thus, the SHA1 just spreads out 
> the entropy evenly over the output buffer. As entropy can be considered 

That’s why you also use a (faster, less powerful) mixing function
on input.

I was wondering: why not use Jenkins’ one-at-a-time hash there?
I’ve worked with it a bit, and it has good avalanche behaviour,
though I haven’t measured its cycle count.

Basically, let h be a uint32_t register (usually loaded from a
memory location and stored back to one afterwards); then, for
each input byte b, you do:

    (h) += (uint8_t)(b);
#ifdef with_mirabilos_changes
    ++(h);
#endif
    (h) += (h) << 10;
    (h) ^= (h) >> 6;

I’m adding 1 after the first step (adding the byte) so that
even NUL bytes make a difference.
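
For illustration, here is that per-byte step written out as a
small self-contained C function; the function name is mine,
not from any existing code:

#include <stdint.h>

/*
 * One OAAT input round: mix the byte b into the 32-bit state h.
 * The "+ 1" is my addition, so that even NUL bytes change the state.
 */
static inline uint32_t
oaat_mix_byte(uint32_t h, uint8_t b)
{
	h += b;
	h += 1;			/* with_mirabilos_changes */
	h += h << 10;
	h ^= h >> 6;
	return (h);
}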

I’m toying with code that has 32 such uint32_t values, making
for a total of 128 bytes of state; when asked to add some
memory content into that pool, it just distributes the bytes
over the values array, incrementing a counter as it goes.

(One could probably do something with cmpxchg to make that
lockless, but I’m not doing any parallel programming myself.)
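
A rough sketch of that pool and its input path, with made-up
names and no locking, just to make the idea concrete (it relies
on the oaat_mix_byte() sketch above):

#include <stddef.h>
#include <stdint.h>

/* 32 parallel OAAT states = 128 bytes of pool state */
static uint32_t pool[32];
static unsigned int pool_idx;		/* running byte counter */

/* Distribute the input bytes round-robin over the 32 states. */
static void
pool_add(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	unsigned int i;

	while (len--) {
		i = pool_idx++ % 32;
		pool[i] = oaat_mix_byte(pool[i], *p++);
	}
}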

Then, every once in a while, you’d run the finish function
on every value, which is:

#ifdef with_mirabilos_changes
    (h) += (h) << 10;
    (h) ^= (h) >> 6;
#endif
    (h) += (h) << 3;
    (h) ^= (h) >> 11;
    (h) += (h) << 15;

My change here is because Jenkins’ OAAT has good avalanche
behaviour except for the last input byte, so I fake adding a
NUL byte (which itself doesn’t get avalanched so well, but
that doesn’t matter) so that the last actual data byte does
get avalanched.
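
Spelled out as a C function, again purely for illustration:

/*
 * OAAT finish: avalanche the state. The extra round at the top is
 * my faked NUL-byte input, so the last real input byte gets
 * avalanched as well.
 */
static inline uint32_t
oaat_finish(uint32_t h)
{
	h += h << 10;		/* with_mirabilos_changes */
	h ^= h >> 6;		/* (the faked NUL round) */
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return (h);
}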

Then I write those finalised hash values back to the
128-byte buffer and add that into the main pool using a
cryptographic function (a hash, or a stream cipher rekey).
(In my case, arc4random, if anyone wonders.)
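
So the periodic step, sketched with the hypothetical names from
above; crypto_mix_into_main_pool() stands in for whatever
cryptographic hash or rekeying function the main pool uses
(arc4random in my case):

/* provided by the main pool; a stand-in name, not a real API */
extern void crypto_mix_into_main_pool(const void *buf, size_t len);

static void
pool_flush(void)
{
	unsigned int i;

	/* finalise every state word in place */
	for (i = 0; i < 32; i++)
		pool[i] = oaat_finish(pool[i]);

	/* feed all 128 bytes into the main (cryptographic) pool */
	crypto_mix_into_main_pool(pool, sizeof(pool));
}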

> >It doesn't matter if you collect predictable data - it neither helps
> 
> Oh yes, it hurts, if you update the entropy estimator on those 
> predictable bits. Because then you get a deterministic RNG like 

I’ve seen Theodore apply exponential backoff to any estimation,
which is probably a good thing. But yes, you will probably want
to guess the entropy of the bytes added and not credit anything
you’re not reasonably sure about. (In the interrupt case, we have
jiffies, cycles and num, so we’d probably best estimate how
often interrupts are called, base a number of “timing” bits
on that, and account for num; this may very well be less than
8 bits “credited” for a long and two u_int added to the pool,
but as long as you get that right and don’t blindly credit
a byte as 8 bits, you’re safe. To quote the author of RANDOM.SYS
for DOS: “Every bit counts.” Even uncredited input adds
uncertainty to the pool, at least by stirring around the
already-existing entropy without adding any new (creditable)
entropy.)
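
To make the “don’t blindly credit 8 bits per byte” point
concrete, a toy sketch of such conservative accounting; the
function name and the 1-bit figure are invented for
illustration, not what the kernel actually does:

/* running total of credited entropy, in bits */
static unsigned int entropy_credit_bits;

static void
add_interrupt_sample(unsigned long jiffies_now, unsigned int cycles,
    unsigned int num)
{
	/* always mix everything in ... */
	pool_add(&jiffies_now, sizeof(jiffies_now));
	pool_add(&cycles, sizeof(cycles));
	pool_add(&num, sizeof(num));

	/* ... but credit only a small, capped estimate per event */
	entropy_credit_bits += 1;
}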

bye,
//mirabilos
