Madison May added the comment:

> What do R, SciPy, Fortran, Matlab or other statistical packages already do? 

NumPy avoids recalculating the cumulative distribution by providing a 'size' 
argument to numpy.random.choice().  The cumulative distribution is calculated 
once, then 'size' random choices are generated and returned.

NumPy's overall implementation is quite similar to the method suggested in the 
Python docs:

>>> import bisect, itertools, random
>>> # weighted_choices is a list of (choice, weight) pairs
>>> choices, weights = zip(*weighted_choices)
>>> cumdist = list(itertools.accumulate(weights))
>>> x = random.random() * cumdist[-1]
>>> choices[bisect.bisect(cumdist, x)]
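To illustrate the "calculate once, draw 'size' times" idea in pure Python, here 
is a rough sketch.  The function name weighted_sample and its 'rng' parameter 
are hypothetical, made up for this example; this is not NumPy's code or a 
proposed API, just the docs recipe with the cumulative distribution hoisted 
out of the loop.

```python
import bisect
import itertools
import random


def weighted_sample(choices, weights, size, rng=random):
    """Draw `size` weighted choices, building the cumulative distribution once.

    Hypothetical helper for illustration only; `rng` is any object with a
    random() method returning a float in [0.0, 1.0).
    """
    if len(choices) != len(weights):
        raise ValueError("choices and weights must have the same length")
    # Computed a single time, however many draws are requested.
    cumdist = list(itertools.accumulate(weights))
    total = cumdist[-1]
    # Each draw is then one random number plus one binary search.
    return [choices[bisect.bisect(cumdist, rng.random() * total)]
            for _ in range(size)]
```

For example, weighted_sample(['Red', 'Blue'], [3, 2], 5) returns a list of 
five items, with 'Red' expected about 60% of the time.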

The addition of a 'size' argument to random.choice() has already been discussed 
(and rejected) in Issue18414, but this was on the grounds that the standard 
idiom for generating a list of random choices ([random.choice(seq) for i in 
range(k)]) is obvious and efficient.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18844>
_______________________________________