Hi Chris and All, On 12 August 2011 16:53, Christopher Jordan-Squire wrote: > Hi Andrea--An easy way to get something like this would be > > import numpy as np > import scipy.stats as stats > > sigma = #some reasonable standard deviation for your application > x = stats.norm.rvs(size=1000, loc=125, scale=sigma) > x = x[x>50] > x = x[x<200] > > That will give a roughly normal distribution to your velocities, as long as, > say, sigma<25. (I'm using the rule of thumb for the normal distribution that > normal random samples lie 3 standard deviations away from the mean about 1 > out of 350 times.) Though you won't be able to get exactly normal errors > about your mean since normal random samples can theoretically be of any > size. > > You can use this same process for any other distribution, as long as you've > chosen a scale variable so that the probability of samples being outside > your desired interval is really small. Of course, once again your random > errors won't be exactly from the distribution you get your original samples > from.
Thank you for your suggestion. There are a couple of things I am not clear with, however. The first one (the easy one), is: let's suppose I need 200 values, and the accept/discard procedure removes 5 of them from the list. Is there any way to draw these 200 values from a bigger sample so that the accept/reject procedure will not interfere too much? And how do I get 200 values out of the bigger sample so that these values are still representative? Another thing, possibly completely unrelated. I am trying to design a toy Latin Hypercube script (just for my own understanding). I found this piece of code on the web (and I modified it slightly): def lhs(dist, size=100): ''' Latin Hypercube sampling of any distrbution. dist is is a scipy.stats random number generator such as stats.norm, stats.beta, etc parms is a tuple with the parameters needed for the specified distribution. :Parameters: - `dist`: random number generator from scipy.stats module. - `size` :size for the output sample ''' n = size perc = numpy.arange(0.0, 1.0, 1.0/n) numpy.random.shuffle(perc) smp = [stats.uniform(i,1.0/n).rvs() for i in perc] v = dist.ppf(smp) return v Now, I am not 100% clear of what the percent point function is (I have read around the web, but please keep in mind that my statistical skills are close to minus infinity). From this page: http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm I gather that, if you plot the results of the ppf, with the horizontal axis as probability, the vertical axis goes from the smallest to the largest value of the cumulative distribution function. If i do this: numpy.random.seed(123456) distribution = stats.norm(loc=125, scale=25) my_lhs = lhs(distribution, 50) Will my_lhs always contain valid values (i.e., included between 50 and 200)? I assume the answer is no... but even if this was the case, is this my_lhs array ready to be used to setup a LHS experiment when I have multi-dimensional problems (in which all the variables are completely independent from each other - no correlation)? My apologies for the idiocy of the questions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ >>> import PyQt4.QtGui Traceback (most recent call last): File "<interactive input>", line 1, in <module> ImportError: No module named PyQt4.QtGui >>> >>> import pygtk Traceback (most recent call last): File "<interactive input>", line 1, in <module> ImportError: No module named pygtk >>> >>> import wx >>> >>> _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion