Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Francesc Alted
On 11/7/12 8:41 PM, Neal Becker wrote: Would you expect numexpr without MKL to give a significant boost? Yes. Have a look at how numexpr's own multi-threaded virtual machine compares with numexpr using VML: http://code.google.com/p/numexpr/wiki/NumexprVML As can be seen, the best results

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Francesc Alted
On 11/8/12 12:35 AM, Chris Barker wrote: On Wed, Nov 7, 2012 at 11:41 AM, Neal Becker ndbeck...@gmail.com wrote: Would you expect numexpr without MKL to give a significant boost? It can, depending on the use case: -- It can remove a lot of unnecessary temporary creation. -- IIUC, it works

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nathaniel Smith
On Wed, Nov 7, 2012 at 10:41 PM, Nicolas SCHEFFER scheffer.nico...@gmail.com wrote: Hi, I've written a snippet of code that we could call scipy.dot, a drop-in replacement for numpy.dot. It's dead easy, and just answers the need of calling the right blas function depending on the type of

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Gael Varoquaux
On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote: I think everyone would be very happy to see numpy.dot modified to do this automatically. But adding a scipy.dot IMHO would be fixing things in the wrong place and just create extra confusion. I am not sure I agree: numpy is often

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Dag Sverre Seljebotn
On 11/08/2012 01:07 PM, Gael Varoquaux wrote: On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote: I think everyone would be very happy to see numpy.dot modified to do this automatically. But adding a scipy.dot IMHO would be fixing things in the wrong place and just create extra

[Numpy-discussion] numexpr question

2012-11-08 Thread Neal Becker
I'm interested in trying numexpr, but have a question (not sure where's the best forum to ask). The examples I see use ne.evaluate (some string...) When used within a loop, I would expect the compilation from the string form to add significant overhead. I would have thought a pre-compiled

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Dag Sverre Seljebotn
On 11/07/2012 08:41 PM, Neal Becker wrote: Would you expect numexpr without MKL to give a significant boost? If you need higher performance than what numexpr can give without using MKL, you could look at code such as this: https://github.com/herumi/fmath/blob/master/fmath.hpp#L480 But that

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread David Cournapeau
On Thu, Nov 8, 2012 at 12:12 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 11/08/2012 01:07 PM, Gael Varoquaux wrote: On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote: I think everyone would be very happy to see numpy.dot modified to do this automatically. But

Re: [Numpy-discussion] numexpr question

2012-11-08 Thread Francesc Alted
On 11/8/12 1:37 PM, Neal Becker wrote: I'm interested in trying numexpr, but have a question (not sure where's the best forum to ask). The examples I see use ne.evaluate (some string...) When used within a loop, I would expect the compilation from the string form to add significant

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Chris Barker
On Thu, Nov 8, 2012 at 2:22 AM, Francesc Alted franc...@continuum.io wrote: -- It can remove a lot of unnecessary temporary creation. Well, the temporaries are still created, but the thing is that, by working with small blocks at a time, these temporaries fit in CPU cache, preventing
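The blocking idea described in this message can be illustrated with a rough sketch in plain NumPy. This is not numexpr's actual implementation; the function name `blocked_eval`, the expression `2*a + 3*b`, and the block size of 4096 are all assumptions chosen for illustration. It only shows why per-block temporaries stay small.

```python
import numpy as np

def blocked_eval(a, b, block=4096):
    """Compute 2*a + 3*b for 1-D arrays, one block at a time (illustrative only)."""
    out = np.empty_like(a)
    for start in range(0, a.size, block):
        stop = start + block
        # The temporaries for 2*a[...] and 3*b[...] hold at most `block`
        # elements, so they are far smaller than full-size temporaries.
        out[start:stop] = 2 * a[start:stop] + 3 * b[start:stop]
    return out

a = np.arange(10000, dtype=np.float64)
b = np.ones(10000)
result = blocked_eval(a, b)
```

Whether a given block size actually fits in cache depends on the CPU and the element type, so the value used here should be treated as a placeholder.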

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Francesc Alted
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote: On 11/07/2012 08:41 PM, Neal Becker wrote: Would you expect numexpr without MKL to give a significant boost? If you need higher performance than what numexpr can give without using MKL, you could look at code such as this:

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Anthony Scopatz
On Thu, Nov 8, 2012 at 7:06 AM, David Cournapeau courn...@gmail.com wrote: On Thu, Nov 8, 2012 at 12:12 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 11/08/2012 01:07 PM, Gael Varoquaux wrote: On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote: I think

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Dag Sverre Seljebotn
On 11/08/2012 06:06 PM, Francesc Alted wrote: On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote: On 11/07/2012 08:41 PM, Neal Becker wrote: Would you expect numexpr without MKL to give a significant boost? If you need higher performance than what numexpr can give without using MKL, you could

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Frédéric Bastien
Hi, I also think it should go into numpy.dot and that the output order should not be changed. A new point: what about the additional overhead for small ndarrays? To remove this, I would suggest putting this code into the C function that does the actual work (at least, from memory it is a C function,

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Francesc Alted
On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote: On 11/08/2012 06:06 PM, Francesc Alted wrote: On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote: On 11/07/2012 08:41 PM, Neal Becker wrote: Would you expect numexpr without MKL to give a significant boost? If you need higher performance than what

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Dag Sverre Seljebotn
On 11/08/2012 06:59 PM, Francesc Alted wrote: On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote: On 11/08/2012 06:06 PM, Francesc Alted wrote: On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote: On 11/07/2012 08:41 PM, Neal Becker wrote: Would you expect numexpr without MKL to give a significant

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Dag Sverre Seljebotn
On 11/08/2012 07:55 PM, Dag Sverre Seljebotn wrote: On 11/08/2012 06:59 PM, Francesc Alted wrote: On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote: On 11/08/2012 06:06 PM, Francesc Alted wrote: On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote: On 11/07/2012 08:41 PM, Neal Becker wrote: Would you

Re: [Numpy-discussion] testing with amd libm/acml

2012-11-08 Thread Francesc Alted
On 11/8/12 7:55 PM, Dag Sverre Seljebotn wrote: On 11/08/2012 06:59 PM, Francesc Alted wrote: On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote: On 11/08/2012 06:06 PM, Francesc Alted wrote: On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote: On 11/07/2012 08:41 PM, Neal Becker wrote: Would you

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
Thanks for all the responses folks. This is indeed a nice problem to solve. A few points: I. Change the order from 'F' to 'C': I'll look into it. II. Integration with scipy / numpy: opinions are diverging here. Let's wait a bit to get more responses on what people think. One thing though: I'd need

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
I've made the necessary changes to get the proper order for the output array. Also, a pass of pep8 and some tests (fixmes are in failing tests) http://pastebin.com/M8TfbURi -n On Thu, Nov 8, 2012 at 11:38 AM, Nicolas SCHEFFER scheffer.nico...@gmail.com wrote: Thanks for all the responses folks.

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
Well, hinted by what Fabien said, I looked at the C level dot function. Quite verbose! But starting line 757, we can see that it shouldn't be too much work to fix that bug (well there is even a comment there that states just that)

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Sebastian Berg
Hey, On Thu, 2012-11-08 at 14:44 -0800, Nicolas SCHEFFER wrote: Well, hinted by what Fabien said, I looked at the C level dot function. Quite verbose! But starting line 757, we can see that it shouldn't be too much work to fix that bug (well there is even a comment there that states just

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Sebastian Berg
On Fri, 2012-11-09 at 00:24 +0100, Sebastian Berg wrote: Hey, On Thu, 2012-11-08 at 14:44 -0800, Nicolas SCHEFFER wrote: Well, hinted by what Fabien said, I looked at the C level dot function. Quite verbose! But starting line 757, we can see that it shouldn't be too much work to fix

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
Thanks Sebastien, didn't think of that. Well I went ahead and tried the change, and it's indeed straightforward. I've run some tests, among which: nosetests numpy/numpy/core/tests/test_blasdot.py and it looks ok. I'm assuming this is good news. I've copy-pasted the diff below, but I have that

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Frédéric Bastien
Hi, I suspect the current tests are not enough. You need to test all the combinations for the 3 inputs with these strides: c-contiguous, f-contiguous, and something else like strided. Also, try with matrices with a shape of 1 in each dimension. Not all blas libraries accept the strides that numpy uses
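A minimal sketch of the kind of layout coverage suggested here, assuming a recent NumPy. The helper names `layouts` and `check_dot_layouts` are hypothetical and are not part of the NumPy test suite. The sketch covers only the two operands, not an output array, and uses `np.einsum` as the reference result.

```python
import itertools
import numpy as np

def layouts(base):
    """Return the same 2-D values as C-contiguous, F-contiguous and strided arrays."""
    rows, cols = base.shape
    big = np.zeros((2 * rows, 2 * cols), dtype=base.dtype)
    big[::2, ::2] = base
    return {
        "C": np.ascontiguousarray(base),
        "F": np.asfortranarray(base),
        "strided": big[::2, ::2],  # non-contiguous view holding the same values
    }

def check_dot_layouts(sizes=(1, 3)):
    """Compare np.dot against a reference for every layout and shape combination."""
    rng = np.random.default_rng(0)
    checked = 0
    for m, k, n in itertools.product(sizes, repeat=3):
        a = rng.standard_normal((m, k))
        b = rng.standard_normal((k, n))
        expected = np.einsum("ik,kj->ij", a, b)  # reference result
        for av, bv in itertools.product(layouts(a).values(), layouts(b).values()):
            np.testing.assert_allclose(np.dot(av, bv), expected)
            checked += 1
    return checked

n_checked = check_dot_layouts()
```

Including size 1 in `sizes` exercises the degenerate shapes mentioned in the message, where an array can be both C- and F-contiguous at once.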

Re: [Numpy-discussion] Scipy dot

2012-11-08 Thread Nicolas SCHEFFER
Fred, Thanks for the advice. The code will only affect the part in _dotblas.c where gemm is called. There are tons of checks before that which make sure both matrices are of ndim 2. We should check though if we can do these tricks in other parts of the function. Otherwise: - I've built against ATLAS 3.10