Hello, Just checking to see if anyone has tackled this problem before for cases where the population size is infeasibly large, i.e. the number of categories is manageable, but the sum of the frequencies, N, rules out simple solutions such as creating a list, shuffling it, and using the first n items to populate the sample (a frequency distribution / histogram).
I note that numpy.random.hypergeometric will let me generate a sample when I have only two categories, and that I could probably implement some kind of iterative / partitioning approach that calls it repeatedly. But before I do, I thought I'd ask whether anyone has tackled this before. I can't find much on the web. Cheers. Duncan
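For reference, the iterative approach mentioned above might look roughly like the sketch below. It is only an illustration, not a tested solution: the names sample_counts, freqs and rng are made up for this example. It walks through the categories in order, and for each one draws a two-category hypergeometric count ("this category" versus "everything after it"), then reduces the remaining sample size accordingly. It uses the newer numpy.random.Generator API rather than the legacy numpy.random.hypergeometric function. Two caveats: as far as I recall, Generator.hypergeometric requires ngood and nbad to each be below 10**9, so a genuinely huge N may need a different sampler; and NumPy 1.18 and later also provide Generator.multivariate_hypergeometric, which may cover this case directly.

```python
import numpy as np


def sample_counts(freqs, n, rng=None):
    """Return per-category counts for n items drawn without replacement.

    freqs: iterable of non-negative integer category frequencies.
    n:     sample size, 0 <= n <= sum(freqs).
    rng:   optional numpy.random.Generator (hypothetical parameter name).
    """
    rng = np.random.default_rng() if rng is None else rng
    freqs = [int(f) for f in freqs]
    remaining_pop = sum(freqs)  # items in this and all later categories
    if not 0 <= n <= remaining_pop:
        raise ValueError("n must be between 0 and sum(freqs)")

    counts = []
    remaining_n = n  # sample slots still to be filled
    for f in freqs:
        remaining_pop -= f  # now counts only the later categories
        if remaining_n == 0 or f == 0:
            k = 0
        elif remaining_pop == 0:
            # Last non-empty category: it must absorb whatever is left.
            k = remaining_n
        else:
            # Two-category draw: "good" = this category, "bad" = the rest.
            k = int(rng.hypergeometric(f, remaining_pop, remaining_n))
        counts.append(k)
        remaining_n -= k
    return counts


if __name__ == "__main__":
    # Example: four categories, draw 1000 items without replacement.
    print(sample_counts([500_000, 1_200_000, 30_000, 7_000_000], 1000))
```

The cost is one hypergeometric draw per category, so it depends on the number of categories rather than on N.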