Re: RandomSample with specified random number generator

Joseph Rushton Wakeling Sun, 17 Jun 2012 09:52:38 -0700

On 17/06/12 17:08, Artur Skawina wrote:

The bug description and cause makes sense, thanks for the explanation.


But the problem is that this kind of bug inside a module which is supposed
to generate pseudo-random data makes it very hard to trust _any_ result
given back by the code...

Sure. When I was working on this I did spend a fair amount of time scratching
my head over how you could create _really_ effective unittests for random-number
functionality, without stretching out the time required too greatly. I don't
think sufficient tests are in place, though probably really rigorous tests of
pseudo-random number generation would take longer than unittests are supposed to.
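
One cheap kind of check is a frequency count: draw many samples and confirm that every element is selected about equally often. What follows is only a minimal sketch of that idea, not the test used in Phobos. `selectionCounts` is a hypothetical helper, and reseeding an `Mt19937` with a distinct known value for each trial is an assumption made here to keep the run repeatable.

```d
import std.algorithm : sum;
import std.random : Mt19937, randomSample;
import std.range : iota;
import std.stdio : writeln;

// Hypothetical helper: over `trials` samples of `pick` values drawn from
// 0 .. total, count how often each value is selected.
size_t[] selectionCounts(size_t total, size_t pick, uint trials)
{
    auto counts = new size_t[total];
    foreach (trial; 0 .. trials)
    {
        // Assumption: a distinct, known seed per trial keeps the whole
        // run repeatable while still giving a different sample each time.
        auto rng = Mt19937(trial + 1);
        foreach (x; randomSample(iota(total), pick, rng))
            ++counts[x];
    }
    return counts;
}

void main()
{
    // 25_000 trials picking 3 of 10 values: an unbiased sampler should
    // select each value roughly 7500 times.
    auto counts = selectionCounts(10, 3, 25_000);

    // Every sample contributes exactly `pick` selections.
    assert(sum(counts) == 3 * 25_000);

    writeln(counts);
}
```

A bin that stays at 0, or sits far from the others, points to a biased sampler. How tight a tolerance a unittest can afford depends on how many trials it is allowed to run.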

So let's fix the already discovered bug:

I'm feeling a bit braindead today so I may have misunderstood your code, but I'm
not sure your fix actually does fix the problem identified. You could check out
Jerro's pull request for an alternative:

https://github.com/D-Programming-Language/phobos/pull/542

Now the result is:

    [0, 7568, 7476, 0, 7494, 7500, 7461, 7504, 7527, 7470]

ie still not quite what you'd expect...


If you've got time, you might like to pull from my master branch:
https://github.com/WebDrake/phobos

... and check if the same bug arises.  I made exactly this kind of test.

This is *not* a RNG, it's a PRNG - the results must always be completely
repeatable, just like you say in your first message. Of course having
a mode that improves the randomness is ok and should probably even be the
default. But if a PRNG is seeded with a known value then it must behave
completely predictable.

I think you've slightly misunderstood what I meant. Let's say we create a
random sample range:


    auto sample = randomSample(/* whatever input */);

... then there are two perfectly logical and acceptable ways to handle its lazy
evaluation.


The first is that each time you evaluate it produces the exact same result, i.e.

    writeln(sample);
    writeln(sample);
    writeln(sample);

... will produce identical output 3 times. The alternative is that each time a
new random sample is generated, i.e. each time we


    writeln(sample);

... we get a different sample. This is still predictable, because the samples
will derive from the same sequence of pseudo-random numbers, each new sample
picking up the pseudo-random sequence where the last one left off. Assuming the
sequence's approximation of randomness to be good enough, you'll get properly
independent samples each time.

To me either of these possibilities is acceptable -- they're both logical and
predictable -- but the behaviour should be the same whether or not randomSample
is called with a specific RNG.
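
The two behaviours can be illustrated directly with a seeded generator, independent of how randomSample itself is implemented. This is a minimal sketch under stated assumptions: `draw` and `drawFromCopy` are hypothetical helpers, and a plain array of `uniform` values stands in for a sample. Because `Mt19937` is a struct, passing it by value replays the same numbers, while passing it by `ref` continues the sequence.

```d
import std.random : Mt19937, uniform;
import std.stdio : writeln;

// Hypothetical helper: draw n values in [0, 100) from a generator passed
// by reference, so the caller's generator state advances.
int[] draw(ref Mt19937 rng, size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= uniform(0, 100, rng);
    return result;
}

// Hypothetical helper: the generator is passed by value, so the draws
// come from a private copy and the caller's state is left untouched.
int[] drawFromCopy(Mt19937 rng, size_t n)
{
    return draw(rng, n);
}

void main()
{
    auto rng = Mt19937(42);

    // Behaviour 1: every evaluation restarts from the same state,
    // so the output is identical each time.
    auto a = drawFromCopy(rng, 3);
    auto b = drawFromCopy(rng, 3);
    assert(a == b);

    // Behaviour 2: every evaluation picks up where the last one stopped.
    auto c = draw(rng, 3);
    auto d = draw(rng, 3);
    assert(c == a); // rng had not been advanced before this call

    // Still repeatable: reseeding with the same value replays both draws.
    auto replay = Mt19937(42);
    assert(draw(replay, 3) == c);
    assert(draw(replay, 3) == d);

    writeln(a, " ", b, " ", c, " ", d);
}
```

Both behaviours are fully determined by the seed. They differ only in whether consecutive evaluations share generator state.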
