On Wed, Dec 22, 2010 at 10:41 AM, Francesc Alted <fal...@pytables.org> wrote:
> NumPy version 2.0.0.dev-147f817

There's your problem, it looks like the PYTHONPATH isn't seeing your new
build for some reason. That build is off of this commit in the NumPy
master branch:

https://github.com/numpy/numpy/commit/147f817eefd5efa56fa26b03953a51d533cc27ec

> > The reason I think 'luf' might help is that it's calculating the
> > expression on smaller sized arrays, which possibly just got buffered.
> > If the memory allocator for the temporaries keeps giving back the
> > same addresses, all this will be in one of the caches very close to
> > the CPU. Unless this cache is still too slow to feed the SSE
> > instructions, there should be a speed benefit. The ufunc inner loops
> > could also use the SSE prefetch instructions based on the stride to
> > give some strong hints about where the next memory bytes to use will
> > be.
>
> Ah, okay. However, Numexpr is not meant to accelerate calculations with
> small operands. I suppose that this is where your new iterator makes
> more sense: accelerating operations where some of the operands are
> small (i.e. fit in cache) and have to be broadcast to match the
> dimensionality of the others.

It's not about small operands, but small chunks of the operands at a
time, with temporary arrays for intermediate calculations. It's the
small chunks + temporaries which must fit in cache to get the benefit,
not the whole array. The numexpr front page explains this fairly well in
the section "Why It Works":

http://code.google.com/p/numexpr/#Why_It_Works

-Mark
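[Editor's note: the chunking idea Mark describes can be sketched in plain
NumPy. This is a hypothetical illustration only, not numexpr's actual
implementation (numexpr compiles the expression to a small virtual machine);
the function name `chunked_eval` and the chunk size are made up for the
example. The point is that only a cache-sized chunk plus one reused
temporary is live at any moment, instead of full-size intermediate arrays.]

```python
import numpy as np

def chunked_eval(a, b, c, chunk=4096):
    """Evaluate a*b + c in cache-sized chunks.

    Hypothetical sketch of the numexpr strategy: instead of materializing
    a full-size temporary for a*b (as plain NumPy does for ``a*b + c``),
    reuse one small temporary buffer per chunk so the working set stays
    close to the CPU caches.
    """
    out = np.empty_like(a)
    tmp = np.empty(chunk, dtype=a.dtype)  # reused small temporary
    n = a.size
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        m = stop - start
        np.multiply(a[start:stop], b[start:stop], out=tmp[:m])
        np.add(tmp[:m], c[start:stop], out=out[start:stop])
    return out
```

With plain NumPy, `a*b + c` allocates a temporary as large as the whole
operands; in the sketch above the temporary is a fixed-size buffer, which is
the "small chunks + temporaries" that must fit in cache.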
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion