Re: [Numpy-discussion] combinations anomaly
On 9/22/07, Charles R Harris <[EMAIL PROTECTED]> wrote:
> [...]
>
> In [13]: time a = list(combination1(100,3))
> CPU times: user 0.26 s, sys: 0.00 s, total: 0.27 s
> Wall time: 0.27
>
> In [14]: time a = [i for i in combination1(100,3)]
> CPU times: user 0.35 s, sys: 0.01 s, total: 0.36 s
> Wall time: 0.36

It's even faster in C++:

In [1]: from _combination import combination

In [2]: time a = combination(100,3)
CPU times: user 0.09 s, sys: 0.01 s, total: 0.09 s
Wall time: 0.09

That's for the whole nested list. Arrays would probably be faster in this
case because I am calling the Python functions to add objects to the lists.

Chuck
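For reference, neither message spells out the two generators being timed. A
plausible sketch, adapted from Alan's original code later in this digest (the
exact bodies are an editorial assumption, not Chuck's code): combination
yields array copies, while combination1 runs the same algorithm on a plain
list, whose slices are already copies.

    from numpy import arange

    def combination(n, t):
        # Knuth Algorithm L over an array; yield a copy so collected
        # results do not alias the working buffer.
        c = arange(t + 2)
        c[-2] = n
        c[-1] = 0
        while 1:
            yield c[:t].copy()
            j = 0
            while c[j] + 1 == c[j + 1]:
                c[j] = j
                j += 1
            if j >= t:
                return
            c[j] += 1

    def combination1(n, t):
        # Same algorithm on a plain list; a list slice is a copy.
        c = list(range(t + 2))
        c[-2] = n
        c[-1] = 0
        while 1:
            yield c[:t]
            j = 0
            while c[j] + 1 == c[j + 1]:
                c[j] = j
                j += 1
            if j >= t:
                return
            c[j] += 1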
Re: [Numpy-discussion] combinations anomaly
On Sat, 22 Sep 2007, Charles R Harris apparently wrote:
> an array view is returned, and the data is updated in the
> loop ... I think your range fix is the way to go.

Got it. Thanks!

Alan
Re: [Numpy-discussion] combinations anomaly
On 9/22/07, Alan G Isaac <[EMAIL PROTECTED]> wrote:
>
> Charles Harris posted a generator function for generating
> combinations (based on Knuth's fascicle). I get the
> expected results by iterating over the resulting generator,
> but not if I let ``list`` do that for me. What is more,
> changing ``arange`` to ``range`` in the code eliminates
> this anomaly.
>
> What am I failing to understand here?
>
> Thank you,
> Alan Isaac

There are a couple of potential problems if you want to make a list.
Because an array view is returned, and the data is updated in the loop,
all the views will end up with the same content. I used arrays and views
for speed. To fix that, you need to return a copy, i.e., yield
c[:t].copy(). That way you will end up with a list of arrays. If you do
yield list(c[:t]), you will get a list of lists. Or, you can do as you
did and just use range instead of arange, because a slice of a list
returns a copy. I admit I'm curious about the speed of the two
approaches; lists may be faster than arrays. Let's see: combination
returns array copies, combination1 uses range.

In [7]: time for i in combination(100,3) : pass
CPU times: user 0.89 s, sys: 0.07 s, total: 0.96 s
Wall time: 0.96

In [8]: time for i in combination1(100,3) : pass
CPU times: user 0.17 s, sys: 0.01 s, total: 0.18 s
Wall time: 0.18

Wow, massive speed-up using lists, almost as fast as nested loops. I
expect lists benefit from faster indexing and faster creation. I think
your range fix is the way to go. Things slow down a bit for the full
list treatment, but not too much:

In [13]: time a = list(combination1(100,3))
CPU times: user 0.26 s, sys: 0.00 s, total: 0.27 s
Wall time: 0.27

In [14]: time a = [i for i in combination1(100,3)]
CPU times: user 0.35 s, sys: 0.01 s, total: 0.36 s
Wall time: 0.36

Chuck
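A minimal sketch of the aliasing Chuck describes (an editorial illustration,
not from the thread): a view shares the producer's buffer, so every collected
view shows the buffer's final state, while a copy freezes the value at yield
time.

    import numpy as np

    buf = np.arange(5)
    views, copies = [], []
    for k in range(3):
        views.append(buf[:3])          # view: shares buf's memory
        copies.append(buf[:3].copy())  # snapshot: independent data
        buf += 10                      # in-place update, as in the generator

    print(views)   # three identical arrays, all showing buf's final state
    print(copies)  # three distinct arrays, one per iteration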
[Numpy-discussion] combinations anomaly
Charles Harris posted a generator function for generating combinations
(based on Knuth's fascicle). I get the expected results by iterating
over the resulting generator, but not if I let ``list`` do that for me.
What is more, changing ``arange`` to ``range`` in the code eliminates
this anomaly.

What am I failing to understand here?

Thank you,
Alan Isaac

### example ###
from numpy import arange

def combinations(n,t):
    c = arange(t + 2)
    c[-2] = n
    c[-1] = 0
    while 1:
        yield c[:t]
        j = 0
        while c[j] + 1 == c[j+1] :
            c[j] = j
            j += 1
        if j >= t :
            return
        c[j] += 1

print "I get the expected results by iterating:"
for tpl in combinations(5,3):
    print tpl

print "However list conversion fails:"
print list(combinations(5,3))
# end example ###
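For concreteness (output reconstructed by tracing the algorithm, not quoted
from the original post), the loop prints the ten 3-subsets of range(5):

    [0 1 2]  [0 1 3]  [0 2 3]  [1 2 3]  [0 1 4]
    [0 2 4]  [1 2 4]  [0 3 4]  [1 3 4]  [2 3 4]

while list(combinations(5,3)) returns ten views of the generator's single
working array, every one displaying its final state, array([0, 1, 2]).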
Re: [Numpy-discussion] Extracting all the possible combinations of a grid
On Sat, Sep 22, 2007 at 09:58:33AM -0600, Charles R Harris wrote:
> Umm... that doesn't look quite right. Shouldn't it be something like

Puzzling. My implementation works, as far as I could test it, which I did
as much as I could. Maybe the two are equivalent.

> Algorithm L can be chunked pretty easily also:

Yes, I think this would scale even better for really big sets, as the size
of the chunk can be controlled more easily. The good thing is that with
chunks like this the calculation is fully "local", so if the result does
not depend on the number of points (e.g. it is a sum over the calculation
for each point), then the calculation can go on forever; there is no limit
to the number of points other than the patience of the operator. So if the
algorithm is right, they can leave it running for two days and get great
results for a paper.

Thanks for your input,

Gaël
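A small sketch of the "local" accumulation Gaël has in mind (editorial;
combination(n, t, chunk) refers to the chunked generator in Chuck's message
below): only one chunk is in memory at a time, so the run can continue
indefinitely.

    def accumulate(chunks, f):
        # Sum f over every combination, one chunk at a time; memory use
        # is bounded by the chunk size, not the number of combinations.
        total = 0.0
        for block in chunks:            # block is an (m, t) index array
            total += f(block).sum()
        return total

    # e.g. total = accumulate(combination(200, 4, 10**6),
    #                         lambda a: a.sum(axis=1))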
Re: [Numpy-discussion] Extracting all the possible combinations of a grid
On 9/22/07, Gael Varoquaux <[EMAIL PROTECTED]> wrote:
>
> On Sat, Sep 22, 2007 at 10:35:16AM +0200, Gael Varoquaux wrote:
> > I would go for the "generate_fourplets" solution if I had a way to
> > calculate the binomial coefficient without overflows.
>
> Sorry, premature optimisation is the root of all evil, but turning
> one's brain on early is good.
>
> """
> ##
> # Some routines for calculation of binomial coefficients
> def gcd(m,n):
>     while n:
>         m,n=n,m%n
>     return m
>
> def binom_(n,k):
>     if k==0:
>         return 1
>     else:
>         g = gcd(n,k)
>         return binomial(n-1, k-1)/(k/g)*(n/g)
>
> def binomial(n,k):
>     if k > n/2: # Limit recursion depth
>         k=n-k
>     return binom_(n,k)
> """
>
> This is surprisingly fast (surprising for me, at least).
>
> Using this and the C code I have, I can generate the quadruplets of 200
> integers quite quickly:
>
> In [5]: %timeit b = [1 for i in generate_quadruplets(200)]
> 10 loops, best of 3: 1.61 s per loop
>
> With generate_quadruplets given by:
>
> """
> ##
> def generate_quadruplets(size):
>     """ Returns an iterator on tables listing all the possible unique
>     combinations of four integers below size. """
>
>     C_code = """
>     int index = 0;
>     for (int j=0; j<i+1; j++) {
>         for (int k=0; k<j+1; k++) {
>             for (int l=0; l<k+1; l++) {
>                 quadruplets(index, 0) = i;
>                 quadruplets(index, 1) = j;
>                 quadruplets(index, 2) = k;
>                 quadruplets(index, 3) = l;
>                 index++ ;
>             }
>         }
>     }
>     """
>
>     for i in xrange(size):
>         multiset_coef = binomial(i+3, 3)
>         quadruplets = empty((multiset_coef, 4), dtype=uint32)
>         inline(C_code, ['quadruplets', 'i'],
>                type_converters=converters.blitz)
>
>         yield quadruplets
> """

Umm... that doesn't look quite right. Shouldn't it be something like

def generate_quadruplets(size):
    """ Returns an iterator on tables listing all the possible unique
    combinations of four integers below size. """

    C_code = """
    int index = 0;
    for (int j=2; j<i; j++) {
        for (int k=1; k<j; k++) {
            for (int l=0; l<k; l++) {
                quadruplets(index, 0) = i;
                quadruplets(index, 1) = j;
                quadruplets(index, 2) = k;
                quadruplets(index, 3) = l;
                index++ ;
            }
        }
    }
    """

    for i in xrange(size):
        n_quad = binomial(i, 3)
        quadruplets = empty((n_quad, 4), dtype=uint32)
        inline(C_code, ['quadruplets', 'i'],
               type_converters=converters.blitz)

        yield quadruplets

Algorithm L can be chunked pretty easily also:

def combination(n, t, chunk):
    c = arange(t + 2)
    c[-1] = 0
    c[-2] = n
    while 1:
        out = empty((chunk, t), dtype=int)
        for i in xrange(chunk):
            out[i] = c[:t]
            j = 0
            while c[j] + 1 == c[j+1]:
                c[j] = j
                j += 1
            if j >= t:
                break
            c[j] += 1
        yield out[:i+1]
        if j >= t:
            return

I think this would go well as a C++ function object. The Python wrapper
would look something like

def combination(n, t, chunk):
    next = cpp_combination(n, t, chunk)
    while 1:
        out = next()
        if len(out) > 0:
            yield out
        else:
            return

Chuck
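A usage sketch for the chunked generator (hedged: the generator body above is
partly reconstructed from the archive, so treat the details as assumptions).
Counting the yielded rows checks it against the closed-form total:

    total = 0
    for block in combination(100, 3, 4096):
        # block is an (m, 3) array of index triples, m <= 4096
        total += len(block)
    print(total)   # should equal binomial(100, 3) == 161700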
Re: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ?
On 9/17/07, Matthew Brett <[EMAIL PROTECTED]> wrote:
> Hi,
>
> > Starting thinking over the whole distutils thing, I was thinking
> > what people would think about using scons inside distutils to build
> > extension.
>
> In general this seems like an excellent idea. If we can contribute
> what we need to scons, that would greatly ease the burden of
> maintenance, and benefit both projects. The key problem will be
> support. At the moment Pearu maintains and owns numpy.distutils.
> Will we have the same level of commitment and support for this
> alternative do you think?
>
> How easy would it be to throw up a prototype for the rest of us to
> look at and get a feel for what the benefits would be?

Here we are, for those interested: all the work is in the branch
numpy.scons. Basically, I added a scons command to numpy.distutils, and
an add_sconscript hook to the config class. An example can be found in
numpy/scons_fake. To build it, you are expected to do:

    python setup.py scons

(I have not found a way yet to tell the install command to call the
scons command; but the current implementation puts the built code where
distutils expects it, so if you call scons and then install, it should
be ok.)

This is only a proof of concept, but as an example, I started to
implement a small support library for sconscripts to be used inside
numpy, and got ctypes extension building working on both windows and
linux. The SConscript file:

# vim:syntax=python
from numpy.distutils.scons import GetNumpyEnvironment
env = GetNumpyEnvironment(ARGUMENTS)

config = env.Configure()   # presumably created here; line lost in the archive
config.CheckHeader('stdio.h')
config.CheckLib('c', 'printf')
config.Finish()

source = ['foo.c']

import sys
if sys.platform == 'win32':
    env.AppendUnique(CPPDEFINES = 'WIN32')

env.NumpyCTypes('foo', source)

And in the setup.py, you just do:

config.add_sconscript('SConstruct')

Once numpy with scons support is installed, partial build is possible
(by partial I mean using python setup.py scons only in a subpackage).
This should give a feel for how it would behave for users. If people
think this is the right direction, then I can add more complete support
(mostly making scons behave as distutils with respect to how to find
extensions using setup.cfg, implementing the checks from
numpy/distutils/system_info in scons, passing the compilers/linkers to
scons, etc...).

cheers,

David
Re: [Numpy-discussion] Extracting all the possible combinations of a grid
On Sat, Sep 22, 2007 at 10:35:16AM +0200, Gael Varoquaux wrote:
> I would go for the "generate_fourplets" solution if I had a way to
> calculate the binomial coefficient without overflows.

Sorry, premature optimisation is the root of all evil, but turning one's
brain on early is good.

"""
##
# Some routines for calculation of binomial coefficients
def gcd(m,n):
    while n:
        m,n=n,m%n
    return m

def binom_(n,k):
    if k==0:
        return 1
    else:
        g = gcd(n,k)
        return binomial(n-1, k-1)/(k/g)*(n/g)

def binomial(n,k):
    if k > n/2: # Limit recursion depth
        k=n-k
    return binom_(n,k)
"""

This is surprisingly fast (surprising for me, at least).

Using this and the C code I have, I can generate the quadruplets of 200
integers quite quickly:

In [5]: %timeit b = [1 for i in generate_quadruplets(200)]
10 loops, best of 3: 1.61 s per loop

With generate_quadruplets given by:

"""
##
def generate_quadruplets(size):
    """ Returns an iterator on tables listing all the possible unique
    combinations of four integers below size. """

    C_code = """
    int index = 0;
    for (int j=0; j<i+1; j++) {
        for (int k=0; k<j+1; k++) {
            for (int l=0; l<k+1; l++) {
                quadruplets(index, 0) = i;
                quadruplets(index, 1) = j;
                quadruplets(index, 2) = k;
                quadruplets(index, 3) = l;
                index++ ;
            }
        }
    }
    """

    for i in xrange(size):
        multiset_coef = binomial(i+3, 3)
        quadruplets = empty((multiset_coef, 4), dtype=uint32)
        inline(C_code, ['quadruplets', 'i'],
               type_converters=converters.blitz)

        yield quadruplets
"""

Gaël
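A quick sanity check of the gcd-reduced recursion (calls and values added
editorially; the arithmetic is easy to verify by hand): each step divides
before it multiplies, so intermediates stay no larger than the result.

    print(binomial(120, 4))   # 8214570
    print(binomial(200, 4))   # 64684950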
Re: [Numpy-discussion] Extracting all the possible combinations of a grid
On Fri, Sep 21, 2007 at 06:40:52PM -0600, Charles R Harris wrote:
>    def triplet(n) :
>        out = []
>        for i in xrange(2,n) :
>            for j in xrange(1,i) :
>                for k in xrange(0,j) :
>                    out.append((i,j,k))
>        return out

I need quadruplets, with numbers going up first to 120, then to 200. I
tried this, but I can't even get to 120: I simply flood my system (1GB
of RAM, AMD64) by using up all my RAM.

>    Which isn't too bad for 161700 combinations. However, I still
>    prefer the generator form if you want to save memory for large n.

>    In [2]: def triplet(n) :
>       ...:     for i in xrange(2,n) :
>       ...:         for j in xrange(1,i) :
>       ...:             for k in xrange(1,j) :
>       ...:                 yield i,j,k

That's the way we were doing it, but you really want to build arrays
when doing this, because the operations afterwards take ages. Maybe I
could build arrays by chunks of, say, 10^6, and keep iterating until I
run out.

>    In [29]: def combination(n,t) :
>       ...:     c = arange(t + 2)
>       ...:     c[-1] = 0
>       ...:     c[-2] = n
>       ...:     while 1 :
>       ...:         yield c[:t]
>       ...:         j = 0
>       ...:         while c[j] + 1 == c[j+1] :
>       ...:             c[j] = j
>       ...:             j += 1
>       ...:         if j >= t :
>       ...:             return
>       ...:         c[j] += 1

I hadn't tried that one... Wow, that one is pretty good! 35s for
quadruplets going up to 120, 270s going up to 200. I can use something
like itertools.groupby to build arrays by grouping the results.

I have implementations in C with weave inline that work pretty well:

* For numbers up to 120, this brute force approach is just fine:

"""
##
def unique_triplets(size):
    """ Returns the arrays of all the possible unique combinations of
    three integers below size. """

    C_code = """
    int index = 0;
    for (int j=0; j<
[... the rest of this message is lost in the archive]
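The weave listing above is cut off in the archive. As an editorial stand-in
(an assumption, not Gaël's code), a pure-NumPy version of the same
brute-force idea builds, for each i, the block of all strictly decreasing
triplets led by i:

    import numpy as np

    def unique_triplets_np(size):
        # Stand-in for the truncated weave version: yields, for each i,
        # an (m, 3) array of all (i, j, k) with i > j > k >= 0.
        for i in range(2, size):
            j, k = np.tril_indices(i, -1)   # all pairs with j > k below i
            out = np.empty((len(j), 3), dtype=np.uint32)
            out[:, 0] = i
            out[:, 1] = j
            out[:, 2] = k
            yield out

    # Sanity check: the total count equals C(120, 3) == 280840.
    assert sum(len(a) for a in unique_triplets_np(120)) == 280840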
[Numpy-discussion] New scipy 0.6.0 uploads
There is a new scipy-0.6.0.tar.gz on the sourceforge page, which contains
the missing scipy/linalg/src/fblaswrap_veclib_c.c. There are also now
scipy-0.6.0-py2.4-win32.egg and scipy-0.6.0-py2.5-win32.egg.

Enjoy,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/
Re: [Numpy-discussion] Compilation failure on python2.4 win32
On Fri, September 21, 2007 10:23 pm, Jörgen Stenarson wrote:
> Travis E. Oliphant wrote:
>> Jörgen Stenarson wrote:
>>> Hi,
>>>
>>> I cannot compile numpy (rev 2042) for python2.4 on win32; it works
>>> on python2.5. It looks like the call to the function
>>> get_build_architecture in distutils.misc_util.py is python2.5
>>> specific.
>>>
>> Yes. This needs to be fixed. I'll do it.
>>
>> Can you try the current trunk?
>>
> I still see problems
>
>    File "numpy\core\setup.py", line 109, in generate_config_h
>      from distutils.msvccompiler import get_build_architecture
>    ImportError: cannot import name get_build_architecture

Try again with the latest trunk.

Pearu