On 11/7/12 8:41 PM, Neal Becker wrote:
Would you expect numexpr without MKL to give a significant boost?
Yes. Have a look at how numexpr's own multi-threaded virtual machine
compares with numexpr using VML:
http://code.google.com/p/numexpr/wiki/NumexprVML
As can be seen, the best results
On 11/8/12 12:35 AM, Chris Barker wrote:
On Wed, Nov 7, 2012 at 11:41 AM, Neal Becker ndbeck...@gmail.com wrote:
Would you expect numexpr without MKL to give a significant boost?
It can, depending on the use case:
-- It can remove a lot of unnecessary temporary creation.
-- IIUC, it works
On Wed, Nov 7, 2012 at 10:41 PM, Nicolas SCHEFFER
scheffer.nico...@gmail.com wrote:
Hi,
I've written a snippet of code that we could call scipy.dot, a drop-in
replacement for numpy.dot.
It's dead easy, and just answers the need of calling the right BLAS
function depending on the type of
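
A minimal sketch of the kind of dispatch described here, assuming the idea is to avoid copying Fortran-ordered operands before they reach BLAS. This is not the posted snippet: the helper name dot_no_copy_views is hypothetical, and it relies only on the identity (AB)^T = B^T A^T together with the fact that the transpose of an F-contiguous array is a C-contiguous view.

```python
import numpy as np


def dot_no_copy_views(a, b):
    """Hypothetical helper: compute a.dot(b), handing np.dot C-contiguous
    views when both operands are Fortran-ordered."""
    if a.flags.f_contiguous and b.flags.f_contiguous:
        # a.T and b.T are C-contiguous views (no data is copied), and
        # (a b)^T = b^T a^T, so the product is the transpose of dot(b.T, a.T).
        return np.dot(b.T, a.T).T
    return np.dot(a, b)


a = np.asfortranarray(np.arange(12.0).reshape(4, 3))
b = np.asfortranarray(np.arange(15.0).reshape(3, 5))
out = dot_no_copy_views(a, b)
```

Note that the result of the transposed branch is itself Fortran-ordered, which is the output-order question raised later in the thread.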
On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote:
I think everyone would be very happy to see numpy.dot modified to do
this automatically. But adding a scipy.dot IMHO would be fixing things
in the wrong place and just create extra confusion.
I am not sure I agree: numpy is often
On 11/08/2012 01:07 PM, Gael Varoquaux wrote:
On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote:
I think everyone would be very happy to see numpy.dot modified to do
this automatically. But adding a scipy.dot IMHO would be fixing things
in the wrong place and just create extra
I'm interested in trying numexpr, but have a question (not sure where's the best
forum to ask).
The examples I see use
ne.evaluate (some string...)
When used within a loop, I would expect the compilation from the string form to
add significant overhead. I would have thought a pre-compiled
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you expect numexpr without MKL to give a significant boost?
If you need higher performance than what numexpr can give without using
MKL, you could look at code such as this:
https://github.com/herumi/fmath/blob/master/fmath.hpp#L480
But that
On Thu, Nov 8, 2012 at 12:12 PM, Dag Sverre Seljebotn
d.s.seljeb...@astro.uio.no wrote:
On 11/08/2012 01:07 PM, Gael Varoquaux wrote:
On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote:
I think everyone would be very happy to see numpy.dot modified to do
this automatically. But
On 11/8/12 1:37 PM, Neal Becker wrote:
I'm interested in trying numexpr, but have a question (not sure where's the best
forum to ask).
The examples I see use
ne.evaluate (some string...)
When used within a loop, I would expect the compilation from the string form to
add significant
On Thu, Nov 8, 2012 at 2:22 AM, Francesc Alted franc...@continuum.io wrote:
-- It can remove a lot of unnecessary temporary creation.
Well, the temporaries are still created, but the thing is that, by
working with small blocks at a time, these temporaries fit in CPU cache,
preventing
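
To illustrate the point about small blocks, here is a rough sketch in plain NumPy. It is not how numexpr is implemented (its virtual machine does this in C, across threads); the function name blocked_eval and the block_size value are assumptions chosen for illustration, and the arrays are assumed to be 1-D.

```python
import numpy as np


def blocked_eval(a, b, block_size=4096):
    """Evaluate 2*a + 3*b one block at a time (illustrative only)."""
    out = np.empty_like(a)
    for start in range(0, a.size, block_size):
        stop = start + block_size
        # The temporaries 2*a[...] and 3*b[...] are at most block_size
        # elements long, so they can stay in CPU cache instead of being
        # written out to main memory as full-size arrays.
        out[start:stop] = 2 * a[start:stop] + 3 * b[start:stop]
    return out


a = np.arange(100_000, dtype=np.float64)
b = np.ones_like(a)
result = blocked_eval(a, b)
```

The temporaries still exist, as the message says; they are just short enough to be reused from cache on every iteration.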
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote:
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you expect numexpr without MKL to give a significant boost?
If you need higher performance than what numexpr can give without using
MKL, you could look at code such as this:
On Thu, Nov 8, 2012 at 7:06 AM, David Cournapeau courn...@gmail.com wrote:
On Thu, Nov 8, 2012 at 12:12 PM, Dag Sverre Seljebotn
d.s.seljeb...@astro.uio.no wrote:
On 11/08/2012 01:07 PM, Gael Varoquaux wrote:
On Thu, Nov 08, 2012 at 11:28:21AM +, Nathaniel Smith wrote:
I think
On 11/08/2012 06:06 PM, Francesc Alted wrote:
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote:
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you expect numexpr without MKL to give a significant boost?
If you need higher performance than what numexpr can give without using
MKL, you could
Hi,
I also think it should go into numpy.dot and that the output order should
not be changed.
A new point: what about the additional overhead for small ndarrays? To
remove this, I would suggest putting this code into the C function that does
the actual work (at least, from memory it is a C function,
On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote:
On 11/08/2012 06:06 PM, Francesc Alted wrote:
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote:
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you expect numexpr without MKL to give a significant boost?
If you need higher performance than what
On 11/08/2012 06:59 PM, Francesc Alted wrote:
On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote:
On 11/08/2012 06:06 PM, Francesc Alted wrote:
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote:
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you expect numexpr without MKL to give a significant
On 11/08/2012 07:55 PM, Dag Sverre Seljebotn wrote:
On 11/08/2012 06:59 PM, Francesc Alted wrote:
On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote:
On 11/08/2012 06:06 PM, Francesc Alted wrote:
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote:
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you
On 11/8/12 7:55 PM, Dag Sverre Seljebotn wrote:
On 11/08/2012 06:59 PM, Francesc Alted wrote:
On 11/8/12 6:38 PM, Dag Sverre Seljebotn wrote:
On 11/08/2012 06:06 PM, Francesc Alted wrote:
On 11/8/12 1:41 PM, Dag Sverre Seljebotn wrote:
On 11/07/2012 08:41 PM, Neal Becker wrote:
Would you
Thanks for all the responses folks. This is indeed a nice problem to solve.
A few points:
I. Change the order from 'F' to 'C': I'll look into it.
II. Integration with scipy / numpy: opinions are diverging here.
Let's wait a bit to get more responses on what people think.
One thing though: I'd need
I've made the necessary changes to get the proper order for the output array.
Also, a PEP 8 pass and some tests (FIXMEs are in the failing tests):
http://pastebin.com/M8TfbURi
-n
On Thu, Nov 8, 2012 at 11:38 AM, Nicolas SCHEFFER
scheffer.nico...@gmail.com wrote:
Thanks for all the responses folks.
Well, hinted by what Fabien said, I looked at the C level dot function.
Quite verbose!
But starting at line 757, we can see that it shouldn't be too much work
to fix that bug (well there is even a comment there that states just
that)
Hey,
On Thu, 2012-11-08 at 14:44 -0800, Nicolas SCHEFFER wrote:
Well, hinted by what Fabien said, I looked at the C level dot function.
Quite verbose!
But starting at line 757, we can see that it shouldn't be too much work
to fix that bug (well there is even a comment there that states just
On Fri, 2012-11-09 at 00:24 +0100, Sebastian Berg wrote:
Hey,
On Thu, 2012-11-08 at 14:44 -0800, Nicolas SCHEFFER wrote:
Well, hinted by what Fabien said, I looked at the C level dot function.
Quite verbose!
But starting at line 757, we can see that it shouldn't be too much work
to fix
Thanks Sebastian, didn't think of that.
Well I went ahead and tried the change, and it's indeed straightforward.
I've run some tests, among which:
nosetests numpy/numpy/core/tests/test_blasdot.py
and it looks ok. I'm assuming this is good news.
I'm copy-pasting the diff below, but I have that
Hi,
I suspect the current tests are not enough. You need to test all the
combinations of the 3 inputs with these strides:
c-contiguous
f-contiguous
something else, like strided.
Also, try matrices with a shape of 1 in each dimension. Not all BLAS
libraries accept the strides that numpy uses
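
A rough sketch of the test matrix suggested above, assuming np.einsum (without its optimize option) is an acceptable BLAS-free reference. The helper names make_layouts and check_dot are hypothetical, and the sketch only varies the two operands, not a third input.

```python
import itertools

import numpy as np


def make_layouts(shape):
    """Return the same values as C-contiguous, F-contiguous and strided arrays."""
    base = np.arange(np.prod(shape), dtype=np.float64).reshape(shape)
    # Strided: every second element of a larger buffer along both axes.
    big = np.zeros((shape[0] * 2, shape[1] * 2))
    big[::2, ::2] = base
    return {
        "C": np.ascontiguousarray(base),
        "F": np.asfortranarray(base),
        "strided": big[::2, ::2],
    }


def check_dot(m, k, n):
    """Compare np.dot with np.einsum for every layout pair; return the count."""
    checked = 0
    lhs, rhs = make_layouts((m, k)), make_layouts((k, n))
    for a, b in itertools.product(lhs.values(), rhs.values()):
        expected = np.einsum("ij,jk->ik", a, b)
        np.testing.assert_allclose(np.dot(a, b), expected)
        checked += 1
    return checked


# Include shapes with a 1 in each position, as suggested above.
shapes = [(3, 4, 5), (1, 4, 5), (3, 1, 5), (3, 4, 1), (1, 1, 1)]
total = sum(check_dot(m, k, n) for m, k, n in shapes)
```

Each shape exercises the 3 x 3 layout combinations of the two operands, so a failure points directly at the stride pattern that the BLAS path mishandles.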
Fred,
Thanks for the advice.
The code will only affect the part in _dotblas.c where gemm is called.
There are tons of checks before that which make sure both matrices are of ndim 2.
We should check though if we can do these tricks in other parts of the function.
Otherwise:
- I've built against ATLAS 3.10