On Fri, Aug 21, 2009 at 2:51 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
> I can imagine Numpy being useful for scripting in this
> C-and-assembler-centric world, making it easier to write automated
> testers, or even generate C code.
>
> Is anyone out there working on this kind of stuff? I ask only because
> there seems to be considerable interest here on the Berkeley campus.
>
> Best,
>
> Matthew

Frederic Bastien and I are working on this sort of thing. We use a project called Theano to build symbolic expression graphs. Theano optimizes those graphs much as an optimizing compiler would, and then generates C code for them. We haven't put a lot of effort into optimizing the C implementations of most expressions (except for non-separable convolution), but we call fast BLAS and FFTW functions, and our naive implementations are typically faster than the equivalent numpy expressions just because they are in C. (Although congrats to those working at optimizing numpy... it has gotten a lot faster over the last few years!)

We are now writing another backend that generates CUDA runtime C++. It is just like you say: even for simple tasks like adding two vectors together or summing the elements of a matrix, there are several possible kernels that can be optimal in different circumstances. The penalty for choosing a sub-optimal kernel can be pretty high. So what ends up happening is that even for simple ufunc-type expressions, we have

- a version for when the arguments are small and everything is c-contiguous
- a general version that is typically orders of magnitude slower than the optimal choice
- versions for when the arguments are small and 1D, 2D, 3D, 4D, 5D
- versions for when some of the arguments are broadcast in different ways
- versions for when there is at least one large contiguous dimension

And the list goes on. We are still in the process of understanding the architecture and the most effective strategies for optimization. I think our design is a good one from the users' perspective, though, because it supports a completely opaque front-end: you just program the symbolic graph in Python using normal expressions, compile it as a function, and call it. The detail of whether it is evaluated on the CPU or the GPU (or both) is hidden.

If anyone is interested in what we're doing, please feel free to send me an email.

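To illustrate the kind of kernel selection described in the list above, here is a minimal sketch in plain Python. It is not Theano's actual dispatch code: the ArgInfo descriptor, the choose_kernel function, the kernel names, both thresholds, and the order in which the cases are checked are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple

SMALL_LIMIT = 4096  # hypothetical: max total elements for a "small" argument
LARGE_DIM = 256     # hypothetical: min length of a "large" dimension


@dataclass(frozen=True)
class ArgInfo:
    """Hypothetical descriptor: shape plus strides counted in elements.

    A stride of 0 on a dimension longer than 1 marks a broadcast dimension.
    """
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]

    @property
    def size(self) -> int:
        return prod(self.shape)

    def is_c_contiguous(self) -> bool:
        # Walk from the last dimension outward, checking row-major strides.
        expected = 1
        for dim, stride in zip(reversed(self.shape), reversed(self.strides)):
            if dim != 1 and stride != expected:
                return False
            expected *= dim
        return True

    def is_broadcast(self) -> bool:
        return any(d > 1 and s == 0 for d, s in zip(self.shape, self.strides))

    def has_large_contiguous_dim(self) -> bool:
        return any(d >= LARGE_DIM and s == 1
                   for d, s in zip(self.shape, self.strides))


def choose_kernel(*args: ArgInfo) -> str:
    """Return the name of the specialised kernel to use for an elementwise op.

    The order of the checks is an assumption; a real backend would tune it.
    """
    small = all(a.size <= SMALL_LIMIT for a in args)
    ndim = max(len(a.shape) for a in args)
    if small and all(a.is_c_contiguous() for a in args):
        return "small_contiguous"
    if any(a.is_broadcast() for a in args):
        return "broadcast"
    if small and 1 <= ndim <= 5:
        return f"small_{ndim}d"
    if any(a.has_large_contiguous_dim() for a in args):
        return "large_contiguous_dim"
    return "general"  # always correct, but the slow fallback


vec = ArgInfo((100,), (1,))                  # small contiguous vector
sliced = ArgInfo((10, 10), (20, 1))          # small 2D view, rows skipped
full = ArgInfo((10, 10), (10, 1))            # small contiguous matrix
row = ArgInfo((10, 10), (0, 1))              # one row broadcast over 10 rows
wide = ArgInfo((8, 100000), (200000, 1))     # long contiguous rows
strided = ArgInfo((1000, 1000), (2000, 2))   # large, strided in both dims

print(choose_kernel(vec, vec))        # small_contiguous
print(choose_kernel(sliced, sliced))  # small_2d
print(choose_kernel(full, row))       # broadcast
print(choose_kernel(wide, wide))      # large_contiguous_dim
print(choose_kernel(strided, strided))  # general
```

The point of the sketch is only that the cheap checks on shapes and strides happen once, in Python, before any kernel is launched, so the caller never sees which specialised version was picked.
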
Links to these projects:

http://www.pylearn.org/theano
http://code.google.com/p/theano-cuda-ndarray/
http://code.google.com/p/cuda-ndarray/

James
--
http://www-etud.iro.umontreal.ca/~bergstrj