On Mon, May 25, 2009 at 4:59 AM, Andrew Friedley <afrie...@indiana.edu> wrote:
> For some reason the list seems to occasionally drop my messages...
>
> Francesc Alted wrote:
> > On Friday 22 May 2009 13:52:46, Andrew Friedley wrote:
> >> I'm the student doing the project. I have a blog here, which contains
> >> some initial performance numbers for a couple test ufuncs I did:
> >>
> >> http://numcorepy.blogspot.com
> >
> >> Another alternative we've talked about, and I (more and more likely) may
> >> look into is composing multiple operations together into a single ufunc.
> >> Again the main idea being that memory accesses can be reduced/eliminated.
> >
> > IMHO, composing multiple operations together is the most promising venue
> > for leveraging current multicore systems.
>
> Agreed -- our concern when considering for the project was to keep the
> scope reasonable so I can complete it in the GSoC timeframe. If I have
> time I'll definitely be looking into this over the summer; if not later.
>
> > Another interesting approach is to implement costly operations (from the
> > point of view of CPU resources), namely, transcendental functions like
> > sin, cos or tan, but also others like sqrt or pow) in a parallel way. If
> > besides, you can combine this with vectorized versions of them (by using
> > the well spread SSE2 instruction set, see [1] for an example), then you
> > would be able to achieve really good results for sure (at least Intel did
> > with its VML library ;)
> >
> > [1] http://gruntthepeon.free.fr/ssemath/
>
> I've seen that page before. Using another source [1] I came up with a
> quick/dirty cos ufunc. Performance is crazy good compared to NumPy
> (100x); see the latest post on my blog for a little more info. I'll
> look at the source myself when I get time again, but is NumPy using a
> Python-based cos function, a C implementation, or something else? As I
> wrote in my blog, the performance gain is almost too good to believe.
>

NumPy uses the C library version. If the long double and float versions
aren't available, the double version is used with number conversions, but
that shouldn't give a factor of 100x. Something else is going on.

Chuck
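
To make the conversion fallback mentioned above concrete, here is a minimal sketch in pure Python. It is not NumPy's source: `to_float32` and `cosf_fallback` are hypothetical names used only for illustration, and the standard `math.cos` stands in for the C library's double-precision routine. The point is only that the fallback adds a widening conversion and one rounding step around the same library call, which is cheap work and unlikely by itself to explain a 100x gap.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a C double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def cosf_fallback(x: float) -> float:
    """Hypothetical single-precision cos built on the double-precision routine."""
    widened = float(x)            # float -> double conversion (exact)
    result = math.cos(widened)    # double-precision cos from the C library
    return to_float32(result)     # double -> float conversion (one rounding)


# Compare the fallback against the plain double-precision result.
for v in (0.0, 0.5, 1.0, 3.0):
    print(v, cosf_fallback(v), math.cos(v))
```

The two columns should agree to roughly single precision, since the only extra error comes from the final rounding.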
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion