Hi!
Thank you for the reply.
Hum... randomShuffle and randomSample actually have nothing to
do with each other.
<snip>
I'd like to note that my post is about randomCover, not
randomSample. I do see the difference between the purpose of
randomSample and randomShuffle. But randomCover looks, at
first glance, like just a slower version of randomShuffle wrapped
as a lazy generator.
I also want to comment on your "randomSample" vs "randomShuffle"
implementation suggestion. Keep in mind that:
a) randomSample doesn't allocate, whereas your suggestion
does.
b) randomSample gives direct access to the elements, whereas
your suggestion doesn't.
If you don't care about a) and b), then by all means, dup away,
and get better performance!
But think about the fact that you wouldn't be able to do
something like this...
<snip>
auto sample = randomSample(arr[], 5);
foreach (ref a; sample)
    ++a;
That holds for randomCover, too. Well, thank you, perhaps
that's the difference I was seeking.
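To make sure I understand the point, here is a minimal sketch of the same pattern with randomCover. The helper name bumpAll is mine, purely for illustration. It assumes randomCover's front returns the underlying element by reference (which is what makes foreach with ref work) and that the overload without an explicit generator is available; older releases may require passing one.

```d
import std.random : randomCover;

// Hypothetical helper: adds one to each element of arr exactly once,
// visiting the elements in random order and without copying arr.
void bumpAll(int[] arr)
{
    // Assumes front yields a reference into arr, so ++a modifies arr itself.
    foreach (ref a; randomCover(arr))
        ++a;
}

void main()
{
    auto arr = [10, 20, 30, 40, 50];
    bumpAll(arr);
    // The visiting order is random, but every element is touched once.
    assert(arr == [11, 21, 31, 41, 51]);
}
```

With the dup-and-randomShuffle approach, the loop would modify the copy and leave arr untouched.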
If this is the intended difference, then my proposal to
improve randomCover's performance and usefulness becomes:
1. Document the usage of randomCover with an example such as
above, and refer to randomShuffle as a faster version for simpler
use cases.
2. Optimize the performance by putting Fenwick trees to good use.
Currently, randomCover'ing 10,000 elements takes time on the
order of one second, and for 100,000 or more elements, it is
hardly usable.
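To illustrate what I mean in point 2, here is a rough sketch, not a patch against Phobos. The names UnvisitedSet, removeKth and randomOrder are hypothetical. The idea: keep a Fenwick tree with a 1 for every unvisited index, draw k uniformly among the remaining ones, and locate and clear the k-th remaining index in O(log n).

```d
import std.random : uniform;

// Hypothetical helper, not part of Phobos: a Fenwick tree over n slots,
// each holding 1 (not yet visited) or 0 (already visited).
struct UnvisitedSet
{
    private size_t[] tree;   // 1-based partial sums
    private size_t remaining;
    private size_t topBit;   // largest power of two <= n (1 if n == 0)

    this(size_t n)
    {
        tree = new size_t[n + 1];
        // O(n) build: each slot contributes 1, then feeds its parent.
        foreach (i; 1 .. n + 1)
        {
            tree[i] += 1;
            immutable parent = i + (i & (~i + 1));
            if (parent <= n)
                tree[parent] += tree[i];
        }
        remaining = n;
        topBit = 1;
        while (topBit * 2 <= n)
            topBit *= 2;
    }

    // Removes the k-th (0-based) unvisited slot and returns its 0-based index.
    size_t removeKth(size_t k)
    {
        assert(k < remaining);
        immutable n = tree.length - 1;
        size_t pos = 0;        // largest index whose prefix sum is < need
        size_t need = k + 1;
        for (size_t step = topBit; step > 0; step /= 2)
        {
            immutable next = pos + step;
            if (next <= n && tree[next] < need)
            {
                pos = next;
                need -= tree[next];
            }
        }
        // Slot pos + 1 (1-based) is the target; mark it as visited.
        for (size_t i = pos + 1; i <= n; i += i & (~i + 1))
            tree[i] -= 1;
        --remaining;
        return pos;
    }
}

// Returns the indices 0 .. n in a uniformly random order, O(n log n) total.
size_t[] randomOrder(size_t n)
{
    auto set = UnvisitedSet(n);
    auto order = new size_t[n];
    foreach (i; 0 .. n)
        order[i] = set.removeKth(uniform(cast(size_t) 0, n - i));
    return order;
}

void main()
{
    auto order = randomOrder(100_000);
    auto seen = new bool[order.length];
    foreach (idx; order)
    {
        assert(!seen[idx]); // every index appears exactly once
        seen[idx] = true;
    }
}
```

A lazy range would call removeKth once per popFront instead of building the whole order up front. The cost is one size_t per element for the tree, so it is a trade of memory for speed.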
Last but not least, be warned there is a long-standing bug with
anything in Phobos that takes a PRNG by value. Basically, the
PRNG will be duplicated, and generate the same sequence over
and over. Ergo, do NOT pass a specific random generator to
things like randomSample or randomShuffle.
This problem is one of the major reasons we are currently (and
slowly) re-designing random into random2.
So, there is general agreement that in random2, RNGs should be
passed by reference by default everywhere? That's nice to hear.
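For anyone else reading along, the by-value problem can be shown in a few lines. This is only a sketch with my own helper names (drawByValue, drawByRef); it does not reproduce any particular Phobos function, it just shows what happens when a generator struct is copied.

```d
import std.random : Mt19937;

// Hypothetical helper: takes the generator BY VALUE, so it advances a copy.
uint drawByValue(Mt19937 rng)
{
    auto x = rng.front;
    rng.popFront(); // only the local copy moves forward
    return x;
}

// Hypothetical helper: takes the generator BY REFERENCE, so the caller's
// generator is advanced.
uint drawByRef(ref Mt19937 rng)
{
    auto x = rng.front;
    rng.popFront();
    return x;
}

void main()
{
    auto rng = Mt19937(42);

    // The caller's state never changes, so the "random" number repeats.
    auto a = drawByValue(rng);
    auto b = drawByValue(rng);
    assert(a == b);

    // By reference, successive draws walk through the sequence.
    auto c = drawByRef(rng);
    auto d = drawByRef(rng);
    assert(c == a); // first value of the sequence, as seen above
    assert(c != d); // the generator has actually moved on
}
```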
-----
Ivan Kazmenko.