On Fri, Sep 21, 2007 at 06:40:52PM -0600, Charles R Harris wrote:
> def triplet(n) :
>     out = []
>     for i in xrange(2,n) :
>         for j in xrange(1,i) :
>             for k in xrange(0,j) :
>                 out.append((i,j,k))
>     return out
I need quadruplets, with the numbers going up first to 120, then to 200. I tried this, but I can't even go to 120: I simply flood my system (1GB of RAM, AMD64) by using all my RAM.

> Which isn't too bad for 161700 combinations. However, I still prefer the
> generator form if you want to save memory for large n.
>
> In [2]: def triplet(n) :
>    ...:     for i in xrange(2,n) :
>    ...:         for j in xrange(1,i) :
>    ...:             for k in xrange(1,j) :
>    ...:                 yield i,j,k

That's the way we were doing it, but you really want to build arrays when doing this, because the operations afterwards take ages. Maybe I could build arrays in chunks of, say, 10^6, and keep iterating until I run out.

> In [29]: def combination(n,t) :
>    ....:     c = arange(t + 2)
>    ....:     c[-1] = 0
>    ....:     c[-2] = n
>    ....:     while 1 :
>    ....:         yield c[:t]
>    ....:         j = 0
>    ....:         while c[j] + 1 == c[j+1] :
>    ....:             c[j] = j
>    ....:             j += 1
>    ....:         if j >= t :
>    ....:             return
>    ....:         c[j] += 1

I hadn't tried that one... Wow, it is pretty good! 35s for quadruplets going up to 120, 270s going up to 200. I can use something like itertools.groupby to build arrays by grouping the results.

I have implementations in C with weave inline that work pretty well:

* For numbers up to 120, this brute-force approach is just fine:

##############################################################################
def unique_triplets(size):
    """ Return an array of all the possible unique combinations of three
        integers below size.
    """
    C_code = """
    int index = 0;
    for (int j=0; j<size; j++) {
        for (int k=0; k<j+1; k++) {
            for (int l=0; l<k+1; l++) {
                nplets(index, 0) = j;
                nplets(index, 1) = k;
                nplets(index, 2) = l;
                index++;
            }
        }
    }
    """
    # Number of multisets of size 3 drawn from `size` values:
    multiset_coef = round(factorial(size+3-1)/(factorial(size-1)*factorial(3)))
    nplets = empty((multiset_coef, 3), dtype=uint32)
    inline(C_code, ['nplets', 'size'], type_converters=converters.blitz)
    return nplets

368ms for numbers up to 120.

* For higher numbers the resulting array does not fit in memory, so I have to break it up:

##############################################################################
def generate_fourplets(size):
    """ Return an iterator over arrays listing all the possible unique
        combinations of four integers below size.
    """
    C_code = """
    int index = 0;
    for (int j=0; j<i+1; j++) {
        for (int k=0; k<j+1; k++) {
            for (int l=0; l<k+1; l++) {
                nplets(index, 0) = i;
                nplets(index, 1) = j;
                nplets(index, 2) = k;
                nplets(index, 3) = l;
                index++;
            }
        }
    }
    """
    for i in xrange(size):
        # C(i+3, 3) quadruplets start with the value i:
        multiset_coef = round(factorial(i+3)/(factorial(i)*factorial(3)))
        nplets = empty((multiset_coef, 4), dtype=uint32)
        inline(C_code, ['nplets', 'i'], type_converters=converters.blitz)
        yield nplets

I can't test this up to n=200, because round(factorial(i+3)/(factorial(i)*factorial(3))) overflows. Is there a way to calculate the binomial coefficient with ints only, so as to avoid the overflows? For 200 the result is 1 373 701, by the way, so it is not unreasonable to create an array of that size.

I guess I could simply generate an array that is too large and cut it down using the "index" calculated in C, but that looks too ugly to me. I could also implement algorithm L in C to go quicker, calling it coroutine-style: run the while loop 10^6 times per call, group the results in an array, and save the c[n] for the next time the C code is called. Where it gets ugly is at the end, because I have to keep track of the size of the array I return, as I will have to crop it. This means having a test in the Python code, I fear.
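Actually, the final crop might not be that ugly if it lives in one small generic wrapper. Something along these lines (a rough, untested sketch; `array_chunks` and its signature are just names I made up) would group the rows coming out of any of the generators above into fixed-size arrays:

    from numpy import empty, uint32

    def array_chunks(gen, width, chunk_size=10**6):
        """ Collect the length-`width` rows yielded by `gen` into uint32
            arrays of at most chunk_size rows; the last array is cropped.
        """
        buf = empty((chunk_size, width), dtype=uint32)
        filled = 0
        for row in gen:
            buf[filled] = row       # copies the values out of the view
            filled += 1
            if filled == chunk_size:
                yield buf
                buf = empty((chunk_size, width), dtype=uint32)
                filled = 0
        if filled:
            yield buf[:filled]      # crop the final, partial chunk

e.g. "for block in array_chunks(combination(200, 4), 4): ..." -- the cropping test then sits in a single place instead of leaking into the calling code.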
I would go for the "generate_fourplets" solution if I had a way to calculate the binomial coefficient without overflows.

Cheers,

Gaël
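PS: Come to think of it, Python's long integers never overflow, so a pure-integer binomial coefficient should do the trick. A rough, untested sketch (the name `binomial` is just mine):

    def binomial(n, k):
        """ Binomial coefficient C(n, k) using integer arithmetic only.

            Every intermediate value stays an exact Python long, so
            nothing overflows however large n gets.
        """
        if k < 0 or k > n:
            return 0
        k = min(k, n - k)           # C(n, k) == C(n, n - k)
        result = 1
        for i in xrange(k):
            # result * (n - i) equals C(n, i) * (n - i), which is always
            # divisible by (i + 1), so the integer division is exact.
            result = result * (n - i) // (i + 1)
        return result

With that, the count in "generate_fourplets" becomes multiset_coef = binomial(i + 3, 3), exact for any i; binomial(203, 3) does give 1373701, matching the figure above.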