On 17/06/12 17:08, Artur Skawina wrote:
The bug description and cause make sense, thanks for the explanation.

But the problem is that this kind of bug inside a module which is supposed
to generate pseudo-random data makes it very hard to trust _any_ result
given back by the code...

Sure. When I was working on this I spent a fair amount of time scratching my head over how to create _really_ effective unittests for random-number functionality without stretching out the running time too much. I don't think sufficient tests are in place, though really rigorous tests of pseudo-random number generation would probably take longer than unittests are supposed to.

So let's fix the already discovered bug:

I'm feeling a bit braindead today so I may have misunderstood your code, but I'm not sure your fix actually fixes the problem identified. You could check out Jerro's pull request for an alternative:
https://github.com/D-Programming-Language/phobos/pull/542

Now the result is:

    [0, 7568, 7476, 0, 7494, 7500, 7461, 7504, 7527, 7470]

i.e. still not quite what you'd expect...

If you've got time, you might like to pull from my master branch:
https://github.com/WebDrake/phobos

... and check if the same bug arises.  I made exactly this kind of test.

This is *not* an RNG, it's a PRNG - the results must always be completely
repeatable, just like you say in your first message. Of course having
a mode that improves the randomness is ok and should probably even be the
default. But if a PRNG is seeded with a known value then it must behave
completely predictably.

I think you've slightly misunderstood what I meant. Let's say we create a random sample range:

    auto sample = randomSample(/* whatever input */);

... then there are two perfectly logical and acceptable ways to handle its lazy evaluation.

The first is that each time you evaluate it produces the exact same result, i.e.

    writeln(sample);
    writeln(sample);
    writeln(sample);

... will produce identical output 3 times. The alternative is that each time a new random sample is generated, i.e. each time we

    writeln(sample);

... we get a different sample. This is still predictable, because the samples will derive from the same sequence of pseudo-random numbers, each new sample picking up the pseudo-random sequence where the last one left off. Assuming the sequence's approximation of randomness to be good enough, you'll get properly independent samples each time.

To me either of these possibilities is acceptable -- they're both logical and predictable -- but the behaviour should be the same whether or not randomSample is called with a specific RNG.
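
To make the two behaviours concrete, here is a minimal sketch using a plain seeded generator rather than randomSample itself. The helper `draw` is hypothetical (it is not part of Phobos) and only stands in for "evaluate the sample once"; the sketch assumes Mt19937 is a struct that is duplicated on assignment.

```d
import std.random : Mt19937, uniform;
import std.stdio : writeln;

// Hypothetical helper, not part of Phobos: draws n numbers in [0, 100)
// from the generator it is handed by reference, advancing that generator.
int[] draw(ref Mt19937 rng, size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= uniform(0, 100, rng);
    return result;
}

void main()
{
    // Behaviour 1: each evaluation starts from its own copy of the same
    // generator state, so the output repeats exactly.
    auto seeded = Mt19937(42);
    auto copyA = seeded;   // struct assignment duplicates the state
    auto copyB = seeded;
    auto a = draw(copyA, 5);
    auto b = draw(copyB, 5);
    assert(a == b);

    // Behaviour 2: each evaluation advances one common generator, so every
    // result continues the sequence where the previous one stopped.
    auto common = Mt19937(42);
    auto c = draw(common, 5);
    auto d = draw(common, 5);

    // Still fully predictable: together they are simply the first ten
    // draws of a freshly seeded generator.
    auto fresh = Mt19937(42);
    assert(c ~ d == draw(fresh, 10));

    writeln(a);
    writeln(b);
    writeln(c);
    writeln(d);
}
```

Both variants are repeatable from run to run for a given seed; they differ only in whether the generator state is copied or shared between evaluations.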
