Re: [Numpy-discussion] Generating random samples without repeats

Paul Moore Fri, 19 Sep 2008 08:09:12 -0700

Rick White <rlw <at> stsci.edu> writes:

> It seems like numpy.random.permutation is pretty suboptimal in its  
> speed.  Here's a Python 1-liner that does the same thing (I think)  
> but is a lot faster:
> 
> a = 1+numpy.random.rand(M).argsort()[0:N-1]
> 
> This still has the problem that it generates a size N array to
> start with.  But at least it is fast compared with permutation:


Interesting. For my generation of a million samples, this takes about 46 sec 
vs the original 75. That's roughly a 39% reduction in run time (about 1.6x 
faster). As you mention, it doesn't help memory, which still peaks at around 
450M.
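
For reference, here is a minimal sketch of the argsort trick for a single 
draw. The function name and the use of numpy.random.default_rng are my own 
choices for illustration (the one-liner above uses numpy.random.rand). Note 
that the slice here is [:n], since [0:N-1] would return only N-1 values.

```python
import numpy as np

def sample_without_repeats(m, n, rng=None):
    """Return n distinct integers drawn uniformly from 1..m (inclusive)."""
    if rng is None:
        rng = np.random.default_rng()
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    # One random key per candidate; sorting the keys gives a random ordering.
    keys = rng.random(m)
    # The indices of the n smallest keys form the sample; add 1 for 1-based values.
    return 1 + keys.argsort()[:n]

print(sample_without_repeats(52, 5))
```

This still allocates an array of size m for every draw, which is where the 
memory goes.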

Interestingly, I was reminded of J (http://www.jsoftware.com/), an APL 
derivative, which does this in a blistering 1.3 seconds, with no detectable 
memory overhead. Of course, being descended from APL, the code to do this is 
pretty obscure:

    5 ? (1000000 $ 52)

(Here, ? is the "deal" operator, and $ reshapes an array - so it's "deal 5 
from each item in a 1000000-long array of 52's". Everything is a primitive 
here, so it's not hard to see why it's fast).
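
A rough NumPy analogue of that J expression, as a sketch only: it deals 5 
from each of many 52-item decks by argsorting random keys row by row, and 
works in chunks so the temporary key array stays small. The name deal_many 
and the chunk parameter are hypothetical, not part of NumPy. Values are 
0-based, as with J's deal.

```python
import numpy as np

def deal_many(num_deals, hand_size, deck_size, chunk=100_000, rng=None):
    """Return a (num_deals, hand_size) array; each row holds distinct values in 0..deck_size-1."""
    if rng is None:
        rng = np.random.default_rng()
    if not 0 <= hand_size <= deck_size:
        raise ValueError("need 0 <= hand_size <= deck_size")
    out = np.empty((num_deals, hand_size), dtype=np.int64)
    for start in range(0, num_deals, chunk):
        stop = min(start + chunk, num_deals)
        # One row of random keys per deal; only `chunk` rows exist at a time.
        keys = rng.random((stop - start, deck_size))
        # Sorting each row's keys shuffles that deck; keep the first hand_size indices.
        out[start:stop] = keys.argsort(axis=1)[:, :hand_size]
    return out

hands = deal_many(1_000_000, 5, 52)
print(hands.shape)
```

Peak temporary memory is governed by chunk * deck_size rather than by the 
total number of deals. I have not timed this against the J version.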

A Python/Numpy <-> J bridge might be a fun exercise...

Paul.

