Robert Kern <robert.kern <at> gmail.com> writes: > On Thu, Sep 18, 2008 at 16:55, Paul Moore <pf_moore <at> yahoo.co.uk> wrote: > > I want to generate a series of random samples, to do simulations based > > on them. Essentially, I want to be able to produce a SAMPLESIZE * N > > matrix, where each row of N values consists of either > > > > 1. Integers between 1 and M (simulating M rolls of an N-sided die), or > > 2. A sample of N numbers between 1 and M without repeats (simulating > > deals of N cards from an M-card deck). > > > > Example (1) is easy, numpy.random.random_integers(1, M, (SAMPLESIZE, N)) > > > > But I can't find an obvious equivalent for (2). Am I missing something > > glaringly obvious? I'm using numpy - is there maybe something in scipy I > > should be looking at? > > numpy.array([(numpy.random.permutation(M) + 1)[:N] > for i in range(SAMPLESIZE)]) >
Thanks. And yet, this takes over 70s and peaks at around 400M memory use, whereas the equivalent for (1) numpy.random.random_integers(1,M,(SAMPLESIZE,N)) takes less than half a second, and negligible working memory (both end up allocating an array of the same size, but your suggestion consumes temporary working memory - I suspect, but can't prove, that the time taken comes from memory allocations rather than computation. As a one-off cost initialising my data, it's not a disaster, but I anticipate using idioms like this later in my calculations as well, where the costs could hurt more. If I'm going to need to write C code, are there any good examples of this? (I guess the source for numpy.random is a good place to start). Paul _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion