Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-17 Thread Andy Ray Terrel
On Thu, Dec 17, 2015 at 5:52 AM, Sturla Molden 
wrote:

> On 17/12/15 12:06, Francesc Alted wrote:
>
> Pretty good.  I did not know that OpenBLAS was so close in performance
>> to MKL.
>>
>
> MKL, OpenBLAS and Accelerate are very close in performance, except for
> level-1 BLAS where Accelerate and MKL are better than OpenBLAS.
>
> MKL requires the number of threads to be a multiple of four to achieve
> good performance; OpenBLAS and Accelerate do not. This matters, for example,
> if you have an online data acquisition and DSP system and want to dedicate
> one processor to I/O tasks. In that case OpenBLAS and Accelerate are likely
> to perform better than MKL.
>
>
The last time I benchmarked them, MKL was much better at tall, skinny
matrices.
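
A rough sketch of the kind of workload "tall skinny" refers to here (not
necessarily the exact benchmark; absolute timings are machine- and
BLAS-dependent):

import time
import numpy as np

A = np.random.randn(1000000, 20)   # tall, skinny matrix
t0 = time.perf_counter()
G = A.T.dot(A)                     # 20 x 20 Gram matrix, a typical tall-skinny workload
print("A.T.dot(A) took %.3f s" % (time.perf_counter() - t0))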


>
> Sturla
>
>


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-17 Thread Edward Richards
Thanks everyone for helping me glimpse the secret world of FORTRAN 
compilers. I am running a Linux machine, so I will look into MKL and 
OpenBLAS. It was easy for me to get an Intel Parallel Studio XE license 
as a student, so I have options.



Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-17 Thread Daπid
On 16 December 2015 at 18:59, Francesc Alted  wrote:

> Probably MATLAB is shipping with Intel MKL enabled, which probably is the
> fastest LAPACK implementation out there.  NumPy supports linking with MKL,
> and actually Anaconda does that by default, so switching to Anaconda would
> be a good option for you.


A free alternative is OpenBLAS. I am getting 20 s on an i7 Haswell with 8
cores.


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-17 Thread Francesc Alted
2015-12-17 12:00 GMT+01:00 Daπid :

> On 16 December 2015 at 18:59, Francesc Alted  wrote:
>
>> Probably MATLAB is shipping with Intel MKL enabled, which probably is the
>> fastest LAPACK implementation out there.  NumPy supports linking with MKL,
>> and actually Anaconda does that by default, so switching to Anaconda would
>> be a good option for you.
>
>
> A free alternative is OpenBLAS. I am getting 20 s on an i7 Haswell with 8
> cores.
>

Pretty good.  I did not know that OpenBLAS was so close in performance to
MKL.

-- 
Francesc Alted


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-17 Thread Sturla Molden

On 17/12/15 12:06, Francesc Alted wrote:

> Pretty good.  I did not know that OpenBLAS was so close in performance
> to MKL.


MKL, OpenBLAS and Accelerate are very close in performance, except for 
level-1 BLAS where Accelerate and MKL are better than OpenBLAS.


MKL requires the number of threads to be a multiple of four to achieve 
good performance; OpenBLAS and Accelerate do not. This matters, for example, 
if you have an online data acquisition and DSP system and want to dedicate 
one processor to I/O tasks. In that case OpenBLAS and Accelerate are 
likely to perform better than MKL.
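
As a minimal sketch of controlling this from Python: the environment
variable names below are the conventional ones for each library and may
vary with how NumPy was built, and they must be set before NumPy first
loads its BLAS.

import os
# Set these before importing NumPy, otherwise the BLAS threadpool
# has already been created with its default size.
os.environ["MKL_NUM_THREADS"] = "4"         # MKL
os.environ["OPENBLAS_NUM_THREADS"] = "4"    # OpenBLAS
os.environ["VECLIB_MAXIMUM_THREADS"] = "4"  # Accelerate / vecLib

import numpy as np
x = np.linalg.solve(np.random.randn(2000, 2000), np.random.randn(2000))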



Sturla



Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-17 Thread Sturla Molden

On 16/12/15 20:47, Derek Homeier wrote:

> Getting around 30 s wall time here on a not so recent 4-core iMac, so that
> would seem to fit (iirc Accelerate should actually largely be using the
> same machine code as MKL).


Yes, the same kernels, but not the same threadpool. Accelerate uses GCD; 
MKL uses both Intel TBB and Intel OpenMP. GCD scales better than TBB, 
even in Intel's own benchmarks. However, GCD uses a kernel threadpool 
(accessible via kqueue) which is not fork-safe, whereas MKL's threadpool 
is fork-safe (but will leak memory on fork).
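
A minimal sketch of why this matters, assuming a BLAS whose threadpool is
not fork-safe (e.g. Accelerate): the 'spawn' start method in multiprocessing
starts fresh interpreters and avoids inheriting the parent's threadpool
state, whereas 'fork' may hang once the parent has already used BLAS.

import multiprocessing as mp
import numpy as np

def child_solve(n):
    # Each child process makes its own BLAS call.
    a = np.random.randn(n, n)
    b = np.random.randn(n)
    return float(np.linalg.solve(a, b)[0])

if __name__ == "__main__":
    # Warm up the BLAS threadpool in the parent process first.
    np.linalg.solve(np.random.randn(500, 500), np.random.randn(500))
    # 'spawn' avoids inheriting the parent's threadpool; 'fork' may
    # deadlock with a non-fork-safe BLAS.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(child_solve, [500, 500]))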


Sturla






Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Nathaniel Smith
What operating system are you on and how did you install numpy? From a
package manager, from source, by downloading from somewhere...?
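
One quick first check, regardless of how it was installed, is to print the
BLAS/LAPACK configuration NumPy was built against:

import numpy as np
# Shows the BLAS/LAPACK libraries NumPy was linked with
# (look for mkl, openblas, atlas, accelerate/veclib, or plain blas/lapack).
np.__config__.show()
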
On Dec 16, 2015 9:34 AM, "Edward Richards" 
wrote:

> I recently did a conceptual experiment to estimate the computational time
> required to solve an exact expression in contrast to an approximate
> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution
> requires a matrix inversion, and in my case the matrix would contain ~15000
> rows.
>
> On my machine MATLAB seems to perform this matrix inversion with random
> matrices about 9x faster (20 sec vs 3 mins). I thought the performance
> would be roughly the same because I presume both rely on the same LAPACK
> solvers.
>
> I will not actually need to solve this problem (even at 20 sec it is
> prohibitive for broadband simulation), but if I needed to I would
> reluctantly choose MATLAB. I am simply wondering why there is this
> performance gap, and if there is a better way to solve this problem in
> numpy?
>
> Thank you,
>
> Ned
>
> #Python version
>
> import numpy as np
>
> testA = np.random.randn(15000, 15000)
>
> testb = np.random.randn(15000)
>
> %time testx = np.linalg.solve(testA, testb)
>
> %MATLAB version
>
> testA = randn(15000);
>
> testb = randn(15000, 1);
> tic(); testx = testA \ testb; toc();
>


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Matthew Brett
Hi,

On Wed, Dec 16, 2015 at 6:34 PM, Edison Gustavo Muenz
 wrote:
> Some time ago I saw this: https://software.intel.com/sites/campaigns/nest/
>
> I don't know if the "community" license applies in your case though. It is
> worth taking a look at.
>
> On Wed, Dec 16, 2015 at 4:30 PM, Francesc Alted  wrote:
>>
>> Sorry, I have to correct myself, as per:
>> http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda is
>> not linking with MKL by default (I thought that was the case before?).
>> After installing MKL (conda install mkl), I am getting:
>>
>> In [1]: import numpy as np
>> Vendor:  Continuum Analytics, Inc.
>> Package: mkl
>> Message: trial mode expires in 30 days
>>
>> In [2]: testA = np.random.randn(15000, 15000)
>>
>> In [3]: testb = np.random.randn(15000)
>>
>> In [4]: %time testx = np.linalg.solve(testA, testb)
>> CPU times: user 1min, sys: 468 ms, total: 1min 1s
>> Wall time: 15.3 s
>>
>>
>> so, it looks like you will need to buy an MKL license separately (which
>> makes sense for a commercial product).

If you're on a recent Mac, I would guess that the default
Accelerate-linked numpy / scipy will be in the same performance range
as those linked to the MKL, but I am happy to be corrected.

Cheers,

Matthew


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Derek Homeier
On 16 Dec 2015, at 8:22 PM, Matthew Brett  wrote:
> 
>>> In [4]: %time testx = np.linalg.solve(testA, testb)
>>> CPU times: user 1min, sys: 468 ms, total: 1min 1s
>>> Wall time: 15.3 s
>>> 
>>> 
>>> so, it looks like you will need to buy an MKL license separately (which
>>> makes sense for a commercial product).
> 
> If you're on a recent Mac, I would guess that the default
> Accelerate-linked numpy / scipy will be in the same performance range
> as those linked to the MKL, but I am happy to be corrected.
> 
Getting around 30 s wall time here on a not so recent 4-core iMac, so that 
would seem to fit
(iirc Accelerate should actually largely be using the same machine code as MKL).

Cheers,
Derek





Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Michael Sarahan
Continuum provides MKL free now - you just need to have a free anaconda.org
account to get the license: http://docs.continuum.io/mkl-optimizations/index

HTH,
Michael

On Wed, Dec 16, 2015 at 12:35 PM Edison Gustavo Muenz <
edisongust...@gmail.com> wrote:

> Some time ago I saw this: https://software.intel.com/sites/campaigns/nest/
>
> I don't know if the "community" license applies in your case though. It is
> worth taking a look at.
>
> On Wed, Dec 16, 2015 at 4:30 PM, Francesc Alted  wrote:
>
>> Sorry, I have to correct myself, as per:
>> http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda
>> is not linking with MKL by default (I thought that was the case before?).
>> After installing MKL (conda install mkl), I am getting:
>>
>> In [1]: import numpy as np
>> Vendor:  Continuum Analytics, Inc.
>> Package: mkl
>> Message: trial mode expires in 30 days
>>
>> In [2]: testA = np.random.randn(15000, 15000)
>>
>> In [3]: testb = np.random.randn(15000)
>>
>> In [4]: %time testx = np.linalg.solve(testA, testb)
>> CPU times: user 1min, sys: 468 ms, total: 1min 1s
>> Wall time: 15.3 s
>>
>>
>> so, it looks like you will need to buy an MKL license separately (which
>> makes sense for a commercial product).
>>
>> Sorry for the confusion.
>> Francesc
>>
>>
>> 2015-12-16 18:59 GMT+01:00 Francesc Alted :
>>
>>> Hi,
>>>
>>> Probably MATLAB is shipping with Intel MKL enabled, which probably is
>>> the fastest LAPACK implementation out there.  NumPy supports linking with
>>> MKL, and actually Anaconda does that by default, so switching to Anaconda
>>> would be a good option for you.
>>>
>>> Here you have what I am getting with Anaconda's NumPy and a machine with
>>> 8 cores:
>>>
>>> In [1]: import numpy as np
>>>
>>> In [2]: testA = np.random.randn(15000, 15000)
>>>
>>> In [3]: testb = np.random.randn(15000)
>>>
>>> In [4]: %time testx = np.linalg.solve(testA, testb)
>>> CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s
>>> Wall time: 46.1 s
>>>
>>> This is not 20 sec, but it is not 3 min either (but of course that
>>> depends on your machine).
>>>
>>> Francesc
>>>
>>> 2015-12-16 18:34 GMT+01:00 Edward Richards :
>>>
 I recently did a conceptual experiment to estimate the computational
 time required to solve an exact expression in contrast to an approximate
 solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution
 requires a matrix inversion, and in my case the matrix would contain ~15000
 rows.

 On my machine MATLAB seems to perform this matrix inversion with random
 matrices about 9x faster (20 sec vs 3 mins). I thought the performance
 would be roughly the same because I presume both rely on the same
 LAPACK solvers.

 I will not actually need to solve this problem (even at 20 sec it is
 prohibitive for broadband simulation), but if I needed to I would
 reluctantly choose MATLAB. I am simply wondering why there is this
 performance gap, and if there is a better way to solve this problem in
 numpy?

 Thank you,

 Ned

 #Python version

 import numpy as np

 testA = np.random.randn(15000, 15000)

 testb = np.random.randn(15000)

 %time testx = np.linalg.solve(testA, testb)

 %MATLAB version

 testA = randn(15000);

 testb = randn(15000, 1);
 tic(); testx = testA \ testb; toc();



>>>
>>>
>>> --
>>> Francesc Alted
>>>
>>
>>
>>
>> --
>> Francesc Alted
>>


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Francesc Alted
Sorry, I have to correct myself, as per:
http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda is
not linking with MKL by default (I thought that was the case before?).
After installing MKL (conda install mkl), I am getting:

In [1]: import numpy as np
Vendor:  Continuum Analytics, Inc.
Package: mkl
Message: trial mode expires in 30 days

In [2]: testA = np.random.randn(15000, 15000)

In [3]: testb = np.random.randn(15000)

In [4]: %time testx = np.linalg.solve(testA, testb)
CPU times: user 1min, sys: 468 ms, total: 1min 1s
Wall time: 15.3 s


so, it looks like you will need to buy an MKL license separately (which
makes sense for a commercial product).

Sorry for the confusion.
Francesc


2015-12-16 18:59 GMT+01:00 Francesc Alted :

> Hi,
>
> Probably MATLAB is shipping with Intel MKL enabled, which probably is the
> fastest LAPACK implementation out there.  NumPy supports linking with MKL,
> and actually Anaconda does that by default, so switching to Anaconda would
> be a good option for you.
>
> Here you have what I am getting with Anaconda's NumPy and a machine with 8
> cores:
>
> In [1]: import numpy as np
>
> In [2]: testA = np.random.randn(15000, 15000)
>
> In [3]: testb = np.random.randn(15000)
>
> In [4]: %time testx = np.linalg.solve(testA, testb)
> CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s
> Wall time: 46.1 s
>
> This is not 20 sec, but it is not 3 min either (but of course that depends
> on your machine).
>
> Francesc
>
> 2015-12-16 18:34 GMT+01:00 Edward Richards :
>
>> I recently did a conceptual experiment to estimate the computational time
>> required to solve an exact expression in contrast to an approximate
>> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution
>> requires a matrix inversion, and in my case the matrix would contain ~15000
>> rows.
>>
>> On my machine MATLAB seems to perform this matrix inversion with random
>> matrices about 9x faster (20 sec vs 3 mins). I thought the performance
>> would be roughly the same because I presume both rely on the same LAPACK
>> solvers.
>>
>> I will not actually need to solve this problem (even at 20 sec it is
>> prohibitive for broadband simulation), but if I needed to I would
>> reluctantly choose MATLAB. I am simply wondering why there is this
>> performance gap, and if there is a better way to solve this problem in
>> numpy?
>>
>> Thank you,
>>
>> Ned
>>
>> #Python version
>>
>> import numpy as np
>>
>> testA = np.random.randn(15000, 15000)
>>
>> testb = np.random.randn(15000)
>>
>> %time testx = np.linalg.solve(testA, testb)
>>
>> %MATLAB version
>>
>> testA = randn(15000);
>>
>> testb = randn(15000, 1);
>> tic(); testx = testA \ testb; toc();
>>
>>
>
>
> --
> Francesc Alted
>



-- 
Francesc Alted


[Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Edward Richards
I recently did a conceptual experiment to estimate the computational 
time required to solve an exact expression in contrast to an approximate 
solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact 
solution requires a matrix inversion, and in my case the matrix would 
contain ~15000 rows.



On my machine MATLAB seems to perform this matrix inversion with random 
matrices about 9x faster (20 sec vs 3 mins). I thought the performance 
would be roughly the same because I presume both rely on the same LAPACK 
solvers.



I will not actually need to solve this problem (even at 20 sec it is 
prohibitive for broadband simulation), but if I needed to I would 
reluctantly choose MATLAB. I am simply wondering why there is this 
performance gap, and if there is a better way to solve this problem in 
numpy?



Thank you,

Ned


#Python version

import numpy as np

testA = np.random.randn(15000, 15000)

testb = np.random.randn(15000)

%time testx = np.linalg.solve(testA, testb)


%MATLAB version

testA = randn(15000);

testb = randn(15000, 1);

tic(); testx = testA \ testb; toc();
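
For reference, a plain-Python sketch of the same timing without the %time
IPython magic (closer in spirit to MATLAB's tic/toc):

import time
import numpy as np

testA = np.random.randn(15000, 15000)
testb = np.random.randn(15000)

t0 = time.perf_counter()
testx = np.linalg.solve(testA, testb)
print("solve took %.1f s" % (time.perf_counter() - t0))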


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Francesc Alted
Hi,

Probably MATLAB is shipping with Intel MKL enabled, which probably is the
fastest LAPACK implementation out there.  NumPy supports linking with MKL,
and actually Anaconda does that by default, so switching to Anaconda would
be a good option for you.

Here you have what I am getting with Anaconda's NumPy and a machine with 8
cores:

In [1]: import numpy as np

In [2]: testA = np.random.randn(15000, 15000)

In [3]: testb = np.random.randn(15000)

In [4]: %time testx = np.linalg.solve(testA, testb)
CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s
Wall time: 46.1 s

This is not 20 sec, but it is not 3 min either (but of course that depends
on your machine).

Francesc

2015-12-16 18:34 GMT+01:00 Edward Richards :

> I recently did a conceptual experiment to estimate the computational time
> required to solve an exact expression in contrast to an approximate
> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution
> requires a matrix inversion, and in my case the matrix would contain ~15000
> rows.
>
> On my machine MATLAB seems to perform this matrix inversion with random
> matrices about 9x faster (20 sec vs 3 mins). I thought the performance
> would be roughly the same because I presume both rely on the same LAPACK
> solvers.
>
> I will not actually need to solve this problem (even at 20 sec it is
> prohibitive for broadband simulation), but if I needed to I would
> reluctantly choose MATLAB. I am simply wondering why there is this
> performance gap, and if there is a better way to solve this problem in
> numpy?
>
> Thank you,
>
> Ned
>
> #Python version
>
> import numpy as np
>
> testA = np.random.randn(15000, 15000)
>
> testb = np.random.randn(15000)
>
> %time testx = np.linalg.solve(testA, testb)
>
> %MATLAB version
>
> testA = randn(15000);
>
> testb = randn(15000, 1);
> tic(); testx = testA \ testb; toc();
>
>


-- 
Francesc Alted


Re: [Numpy-discussion] performance solving system of equations in numpy and MATLAB

2015-12-16 Thread Edison Gustavo Muenz
Some time ago I saw this: https://software.intel.com/sites/campaigns/nest/

I don't know if the "community" license applies in your case though. It is
worth taking a look at.

On Wed, Dec 16, 2015 at 4:30 PM, Francesc Alted  wrote:

> Sorry, I have to correct myself, as per:
> http://docs.continuum.io/mkl-optimizations/index it seems that Anaconda
> is not linking with MKL by default (I thought that was the case before?).
> After installing MKL (conda install mkl), I am getting:
>
> In [1]: import numpy as np
> Vendor:  Continuum Analytics, Inc.
> Package: mkl
> Message: trial mode expires in 30 days
>
> In [2]: testA = np.random.randn(15000, 15000)
>
> In [3]: testb = np.random.randn(15000)
>
> In [4]: %time testx = np.linalg.solve(testA, testb)
> CPU times: user 1min, sys: 468 ms, total: 1min 1s
> Wall time: 15.3 s
>
>
> so, it looks like you will need to buy an MKL license separately (which
> makes sense for a commercial product).
>
> Sorry for the confusion.
> Francesc
>
>
> 2015-12-16 18:59 GMT+01:00 Francesc Alted :
>
>> Hi,
>>
>> Probably MATLAB is shipping with Intel MKL enabled, which probably is the
>> fastest LAPACK implementation out there.  NumPy supports linking with MKL,
>> and actually Anaconda does that by default, so switching to Anaconda would
>> be a good option for you.
>>
>> Here you have what I am getting with Anaconda's NumPy and a machine with
>> 8 cores:
>>
>> In [1]: import numpy as np
>>
>> In [2]: testA = np.random.randn(15000, 15000)
>>
>> In [3]: testb = np.random.randn(15000)
>>
>> In [4]: %time testx = np.linalg.solve(testA, testb)
>> CPU times: user 5min 36s, sys: 4.94 s, total: 5min 41s
>> Wall time: 46.1 s
>>
>> This is not 20 sec, but it is not 3 min either (but of course that
>> depends on your machine).
>>
>> Francesc
>>
>> 2015-12-16 18:34 GMT+01:00 Edward Richards :
>>
>>> I recently did a conceptual experiment to estimate the computational
>>> time required to solve an exact expression in contrast to an approximate
>>> solution (Helmholtz vs. Helmholtz-Kirchhoff integrals). The exact solution
>>> requires a matrix inversion, and in my case the matrix would contain ~15000
>>> rows.
>>>
>>> On my machine MATLAB seems to perform this matrix inversion with random
>>> matrices about 9x faster (20 sec vs 3 mins). I thought the performance
>>> would be roughly the same because I presume both rely on the same
>>> LAPACK solvers.
>>>
>>> I will not actually need to solve this problem (even at 20 sec it is
>>> prohibitive for broadband simulation), but if I needed to I would
>>> reluctantly choose MATLAB. I am simply wondering why there is this
>>> performance gap, and if there is a better way to solve this problem in
>>> numpy?
>>>
>>> Thank you,
>>>
>>> Ned
>>>
>>> #Python version
>>>
>>> import numpy as np
>>>
>>> testA = np.random.randn(15000, 15000)
>>>
>>> testb = np.random.randn(15000)
>>>
>>> %time testx = np.linalg.solve(testA, testb)
>>>
>>> %MATLAB version
>>>
>>> testA = randn(15000);
>>>
>>> testb = randn(15000, 1);
>>> tic(); testx = testA \ testb; toc();
>>>
>>>
>>
>>
>> --
>> Francesc Alted
>>
>
>
>
> --
> Francesc Alted
>
>