Re: isUniformRNG

Joseph Rushton Wakeling via Digitalmars-d Sun, 11 May 2014 05:19:25 -0700

On 11/05/14 05:58, Nick Sabalausky via Digitalmars-d wrote:

The seed doesn't need to be compromised for synchronized RNGs to fubar security.

Before we go onto the detail of discussion, thanks very much for the extensiveexplanation. I was slightly worried that my previous email might have comeacross as dismissive of your (completely understandable) concerns. I'm actuallyquite keen that we can find a mutually agreeable solution :-)

It's bad enough just to accidentally generate the same byte sequence for
multiple things:

struct CryptoRng { ulong internalState; ...etc... }
CryptoRng myRand;

// Oops! Duplicates internal state!
ubyte[] getRandomJunk(CryptoRng rand) {
     return rand.take(64).array();
}

// Oops! These are all identical! Passwords may as well be unsalted!
auto saltForStevensPassword =  getRandomJunk(myRand);
auto saltForSallysPassword =  getRandomJunk(myRand);
auto saltForGregsPassword =  getRandomJunk(myRand);

// Oops! Also identical!
auto oneTimeURL1 =  "https://verify-email/a@a.a/"~getRandomJunk(myRand);
auto oneTimeURL2 =  "https://verify-email/b@b.b/"~getRandomJunk(myRand);
auto oneTimeURL3 =  "https://verify-email/c@c.c/"~getRandomJunk(myRand);

May as well have just defined getRandomJunk as a constant ;)

Sure, but this is a consequence of two things (i) CryptoRng is a value type and(ii) it gets passed by value, not by reference.

In your example, obviously one can fix the problem by having the functiondeclared instead as


    ubyte[] getRandomJunk(ref CryptoRng rand) { ... }

but I suspect you'd say (and I would agree) that this is inadequate: it'srelying on programmer virtue to ensure the correct behaviour, and sooner orlater someone will forget that "ref". (It will also not handle other cases,such as an entity that needs to internally store the RNG, as we've discussedmany times on the list with reference to e.g. RandomSample or RandomCover.)

Obviously one _can_ solve the problem by the internal state variables of the RNGbeing static, but I'd suggest to you that RNG-as-reference-type (which doesn'tnecessarily need to mean "class") solves that particular problem without theconstraints that static internal variables have.

Maybe not *too* terrible for randomized alien invader attack patterns (unless it
makes the game boring and doesn't sell), but a fairly major failure for security
purposes.

There's another issue, too:

First a little background: With Hash_DRBG (and I would assume other crypto RNGs
as well), the RNG is periodically reseeded, both automatically and on-demand.
This reseeding is *not* like normal non-crypto reseeding: With non-crypto
reseeding, you're fully resetting the internal state to a certain specific value
and thus resetting the existing entropy, replacing it with the fresh new
entropy. But with Hash_DRBG, reseeding only *adds* additional fresh entropy -
nothing gets reset. Any existing entropy still remains. So reseeding two
different Hash_DRBGs with the *same* reseed data *still* results in two
completely different internal states. This is deliberately defined by the NIST
spec.

There are reasons for that, but one noteworthy side-consequence is this: If all
your Hash_DRBG's share the same internal state, then anytime one is reseeded
with fresh entropy, they *all* benefit from the fresh entropy, for free.

And as a (maybe smaller) additional benefit, if different parts of your program
are using an RNG independently of each other, but the RNGs internally share the
same state, then *each* component contributes additional (semi-)non-determinism
to the pool. That part's true of other RNGs too, but the effect is compounded
further by Hash_DRBG which updates the internal state differently depending on
how much data is requested at each, umm...at each request.

This is a very interesting point. My understanding is that in many (most?)programs, each individual thread does typically operate using a single RNGinstance anyway. The question for me is whether that's a design choice of thedeveloper or something that is an unavoidable outcome of the RNG design.

For example, it's readily possible to just instantiate a single RNG (crypto orotherwise, as required) at the start of the program and pass that around: ifit's a reference type (as I would argue it should be) then the effects youdesire will still take place.

Alternatively, one can also implement a thread-global RNG instance whichfunctions will use. This is already done with rndGen, which is effectively asingleton pattern. There's no reason why one couldn't also define a cryptoGenwhich works in a similar way, just using an instance of a cryto RNG where rndGenuses a Mersenne Twister:

https://github.com/D-Programming-Language/phobos/blob/d2f2d34d52f89dc9854069b13f52523965a7107e/std/random.d#L1161-L1174
https://github.com/WebDrake/std.random2/blob/a071d8d182eb28516198bb292a0b45f86f4425fe/std/random2/generator.d#L60-L81

The benefit of doing it this way, as opposed to static internal variables, isthat you aren't then constrained to have only one single, effectivelythread-global instance of _every_ RNG you create.

However...the mentions of Minecraft did raise one point that I'll concede: If
you have a multiplayer environment (or more generally, a large chunk of shared
multi-computer data) that's randomly-generated, then a very careful usage of a
shared seed and RNG algorithm could drastically reduce the network bandwidth
requirements.

You may also recall earlier games like Elite which were able to "store" wholeuniverses on a single 5 1/4" floppy by virtue of auto-generating that universefrom a repeatable random sequence. It was amazing what could be done on the oldBBC Micro :-)

I do agree strongly that there's a major difference between "relying on
deterministic sequence from an RNG" and "principle of design".

However, I also maintain that the "principle of design" case demands you should
*NOT* get the same results from RNGs unless you *specifically intend* to be
"relying on deterministic sequence from an RNG". After all, they *are* supposed
to be "random number generators". That's their primary function.

I do believe strongly in the same principles of encapsulation that you do. But
the thing is, the usual rules of proper encapsulated design are deliberately and
fundamentally predicated on the assumption that you *want* reliably predictable
deterministic behavior. The vast majority of the time, that's a correct
assumption - but the whole *point* of an RNG is to *not* be deterministic. Or at
least to get as close to non-determinism as reasonably possible within a
fundamentally deterministic system.

Since the basic goals of RNGs are in fundamental disagreement with the
underlying assumptions of usual "well encapsulated" design, that calls into
question the applicability of usual encapsulation rules to RNGs.

At the risk of being presumptuous, I think you do agree on some level. Isn't
that the whole reason you've proposed that RNGs change from structs to classes?
Because you don't want identical sequences from RNGs unless *specifically*
intended, right? The only parts of that I've intended to question are:

I completely agree that _unintended_ identical RNG sequences are a very badthing that must be avoided. I think that we're in disagreement only about howbest to achieve that in a sane/safe/uncomplicated way. With luck we will findsomething that works for both of us :-)

What I feel is that static internal variables go too far in the oppositedirection: they make it impossible to have even _intentionally_ identical RNGsequences. There's no way, for example, to .save or .dup an RNG instance withthis design.

However, the effect of a common and reliable source of randomness, used by allparts of the program, can be implemented instead by a thread-global singletoninstance of an RNG as described above. This seems to me to be a better solutionas it gives you what you achieve using static variables, but without preventingyou from instantiating as many other independent RNG instances as you like.

A. Does that really necessitate moving to classes?

No, the important distinction is value vs. reference types, not structs v.classes. One could implement the RNGs as struct-based reference types.

I wrote a class-based implementation because I wanted to see how that came outto various struct-based approaches monarch_dodra and I had tried. I do thinkthat it offers much value in terms of simplicity and elegance of design, butthere may be other costs that make it untenable. I am not sure the correctchoice will be apparent until the current work on memory allocation settles down.

B. Are there any legitimate use-cases for "specifically relying on deterministic
sequence from an RNG"?

Well, when I was writing large-scale scientific simulations, it was certainlyuseful to be able to re-run them with the same random sequence. Given thenumber of random variates I was generating in the process, it was not reallypractical to store the sequence on memory or disk. The same is true of manycircumstances where one is writing a program that relies on randomness, butwants to test it reliably.

However, fundamentally this comes down to a design point. Pseudo-random RNGsare _always_ deterministic in their operation once seeded. The difference we'rereally talking about here is whether the source of randomness used by differentparts of a program is thread-global or not.

And so, the question becomes rather, "Is it legitimate to enforce athread-global source of randomness in all circumstances?" I'd say the answer tothat is a definite no.

Bear in mind that one nasty consequence of using static internal variables toforce thread-global behaviour is that it means that the presence or absence ofunittests might affect the actual running program. Consider:


    struct SomeRNG { private static InternalState _internals; ... }

    unittest
    {
        SomeRNG rng;
        rng.seed(KnownSeed);
        // verify that SomeRNG produces expected sequence of variates
    }

Now when you start your _actual_ program, the internal state of SomeRNG willalready have been set by the unittests. Obviously if you are doing thingsproperly you will scrub this by re-seeding the RNG at the start of your programproper, but the risk is clearly still there.

For B, the above example of "reducing bandwidth/storage requirements for a large
set of shared random-generated data" is making me question my former stance of
"No, there's no legitimate use-case". If B falls (and I suppose it maybe it
does), then A does too, and classes would indeed be necessary.

Yes, I think that is one very good example of a use-case, which is probably oneshared between a bunch of different applications in practice (scientificsimulation, games, probably other stuff too).

As I said before, I don't think it mandates _classes_, but it does mandatereference-type RNGs that are properly encapsulated.

But that's all for non-crypto RNGs. For crypto-RNGs, determinism is *never*
anything more than a concession forced by the deterministic nature of the
machines themselves. FWIW, this does mean  crypto-RNGs can go either struct or
class, since the internal states really should be tied together anyway (in order
to prevent their *outputs* from being tied together).

What do you think about the global-singleton-instance pattern as a way ofachieving what you want?

I think I will stop my reply here for now, because I think that your response tothis question (and the explanations leading up to it) probably determines how Iwill respond to your later remarks. What are your thoughts?


Best wishes,

    -- Joe

Re: isUniformRNG

Reply via email to