Re: Random sampling in Phobos

Lars T. Kyllingstad Tue, 17 Apr 2012 22:50:19 -0700

On Wednesday, 18 April 2012 at 01:17:18 UTC, Joseph Rushton Wakeling wrote:

On 17/04/12 18:25, Joseph Rushton Wakeling wrote:
On 17/04/12 17:31, Andrei Alexandrescu wrote:
Actually that's not correct. RandomSample works fine with an input range and
does not keep it in memory.
Ahh, OK. I should have anticipated this as the output also works as a range.
A query on this point, as it looks like this can have some unfortunate side-effects.
If I write e.g.

    auto s = randomSample(iota(0,100), 5);

    foreach(uint i; s)
        writeln(i);

    writeln();

    foreach(uint i; s)
        writeln(i);
... then it's noticeable that the _first_ element of the two sample lists is identical, while the others change. If I put in place a specified source of randomness, e.g.
    auto s = randomSample(iota(0,100), 5, rndGen);

... then the numbers stay the same for both lists.
I'm presuming that the first element remains identical because it's defined in the constructor rather than elsewhere, but it's not clear why passing a defined RNG produces identical output on both occasions.
The effect can be worse with Vitter's Algorithm D because often, having defined the first element, the second may be derived deterministically -- meaning the first 2 elements of the sample are identical.
I'm sure that the above use of the RandomSample struct is not the advised use, but it's still disconcerting to see this.

I've run into this trap more than once. :) You have to pass the random number generator by ref; otherwise each pass works on its own copy of the generator, and you are just generating two identical sequences of random numbers. Just change the randomSample signature to:

auto randomSampleVitter(R, Random)(R r, size_t n, ref Random gen)
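
To show the copy-versus-ref point in isolation, here is a minimal sketch. The helpers drawByValue and drawByRef are hypothetical names made up for this illustration, not part of std.random, and the seed 42 is arbitrary. It only assumes that Mt19937 is a plain struct that gets copied when passed by value.

```d
import std.random : Mt19937, uniform;
import std.stdio : writeln;

// Hypothetical helper: takes the generator by value, so it advances
// a private copy and leaves the caller's generator untouched.
int drawByValue(Random)(Random gen)
{
    return uniform(0, 100, gen);
}

// Hypothetical helper: takes the generator by ref, so the caller's
// generator state advances with every call.
int drawByRef(Random)(ref Random gen)
{
    return uniform(0, 100, gen);
}

void main()
{
    auto gen = Mt19937(42); // fixed seed, chosen arbitrarily

    // Both calls start from the same copied state: identical results.
    immutable a = drawByValue(gen);
    immutable b = drawByValue(gen);
    assert(a == b);

    // gen is still in its initial state, so the first by-ref draw
    // repeats that value; the second draw comes from the advanced state.
    immutable c = drawByRef(gen);
    immutable d = drawByRef(gen);
    assert(c == a);

    writeln(a, " ", b, " ", c, " ", d);
}
```

The same thing happens with any struct that holds a generator by value: foreach iterates over a copy of a struct range, so each loop starts again from the stored generator state.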


-Lars
