On Monday, 3 June 2013 at 17:35:22 UTC, Joseph Rushton Wakeling wrote:
On 06/03/2013 07:00 PM, Diggory wrote:
Thanks for testing before dismissing completely :P The way it returns results can be improved a lot by pre-allocating a range of the necessary size/using a
range passed in.

Surely. :-)

The complexity is O(N²) where N is the number of samples out of M. The algorithm is something I just thought of although I've possibly used it before and I'm sure someone else has thought of it too. It's quite simple, it effectively says "pick a value from the range except for this value" recursively but instead of dividing the range in two, it shifts the top part down first to make a contiguous range and then shifts the results which are past the split up one
afterward.

It's an interesting little piece of invention. I'll have to look around and see if there's anything similar documented. I've not seen anything in the literature which has this particular property of O(N) random variates plus
O(N^2) calculation.

The fact that it's not sequential might have been a bit of an issue historically, which maybe explains why I haven't come across something like it.

I expect the phobos implementation to be something like O(M) in which case my method would be faster whenever N < sqrt(M) (with the optimisations)

No, the Phobos implementation is O(N) -- I wrote it :-)  See:
http://braingam.es/2012/07/sampling-d/

The API is Andrei's -- the original implementation used what Knuth refers to as Algorithm S [which is really simple and easy to implement and check, but is indeed O(M)], I updated it to use Algorithm D (which is complicated:-).

So, since randomSample takes kN time where k is constant, your algorithm will
probably win where N < k.

I found that the sampling time was about equal with N = 50 and your algorithm as
it's currently written.

I'd guess that the heavy use of floating point arithmetic to calculate the step sizes means that algorithm has a fairly large constant overhead even though the complexity is smaller.

Reply via email to