Christopher Barker wrote:
> George Dahl wrote:
>> Sturla Molden <sturla <at> molden.no> writes:
>>> Teraflops peak performance of modern GPUs is impressive. But NumPy
>>> cannot easily benefit from that.
>
>> I know that for my work, I can get around a 50-fold speedup over
>> numpy using a Python wrapper for a simple GPU matrix class.
>
> I think you're talking across each other here. Sturla is referring to
> making a numpy ndarray GPU-aware and then expecting expressions like:
>
>     z = a*x**2 + b*x + c
>
> to go faster when a, b, c, and x are ndarrays.
>
> That's not going to happen.
>
> On the other hand, George is talking about moving higher-level
> operations (like a matrix product) over to GPU code. This is analogous
> to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that
> could help those programs that use such operations.
>
> So a GPU LAPACK would be nice.
>
> This is also analogous to using SWIG, ctypes, Cython, weave, or the
> like to move a computationally expensive part of the code over to C.
>
> I think anything that makes it easier to write little bits of your code
> for the GPU would be pretty cool -- a GPU-aware Cython?
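To make the distinction concrete: evaluated naively, each operator in
z = a*x**2 + b*x + c would launch its own GPU kernel and allocate a
temporary array (on top of host/device transfers), whereas a fused
kernel does the whole expression in one pass. A minimal sketch of the
fused approach using PyCUDA's ElementwiseKernel (assuming PyCUDA and a
CUDA-capable device are available; the kernel name, coefficients, and
array size below are just illustrative):

    import numpy as np
    import pycuda.autoinit  # creates a CUDA context on import
    import pycuda.gpuarray as gpuarray
    from pycuda.elementwise import ElementwiseKernel

    # One fused kernel for z = a*x**2 + b*x + c, instead of a
    # separate kernel launch (and temporary) per operator.
    quadratic = ElementwiseKernel(
        "float a, float b, float c, float *x, float *z",
        "z[i] = a * x[i] * x[i] + b * x[i] + c",
        "quadratic")

    x = gpuarray.to_gpu(np.random.rand(10**6).astype(np.float32))
    z = gpuarray.empty_like(x)
    quadratic(np.float32(2.0), np.float32(-1.0), np.float32(0.5), x, z)

    print(z.get()[:5])  # copy the result back to the host

This is roughly the expression-to-kernel translation that a GPU-aware
Cython or a run-time expression compiler would automate.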
Cython is probably open for that if anybody is interested in implementing it or making a student project of it (way too big for GSoC, I think, unfortunately). However, I'd definitely make it a generic library turning expressions into compiled code (either GPU or CPU with SSE); that could then be used both at compile time from Cython and at run time using e.g. SymPy or SAGE expressions. Both PyCUDA and CorePy would allow both compile-time and run-time operation.

--
Dag Sverre
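For the run-time route, numexpr is an existing example of such a
generic library on the CPU side: it compiles a whole expression string
into a single blocked, vectorized loop over the operands. A quick
sketch (assuming numexpr is installed; the coefficients and array size
are arbitrary):

    import numpy as np
    import numexpr as ne

    a, b, c = 2.0, -1.0, 0.5
    x = np.random.rand(10**6)

    # The expression is compiled once and evaluated in a single
    # pass over x -- no per-operator temporary arrays.
    z = ne.evaluate("a*x**2 + b*x + c")

    np.testing.assert_allclose(z, a*x**2 + b*x + c)

A GPU back end performing the same translation is essentially what is
being proposed above.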