Re: [Numpy-discussion] OS X wheels: speed versus multiprocessing

2015-04-06 Thread Sturla Molden
On 07/04/15 02:19, Matthew Brett wrote:

> ATLAS compiled with gcc also gives us some more license complication:
>
> http://numpy-discussion.10968.n7.nabble.com/Copyright-status-of-NumPy-binaries-on-Windows-OS-X-tp38793p38824.html

Ok, then I have a question regarding OpenBLAS:

Do we use the f2c'd lapack_lite, or do we build LAPACK with gfortran and 
link it into OpenBLAS? In the latter case we might get libquadmath 
linked into the OpenBLAS binary as well.



> I agree that big slowdowns would be dangerous for numpy's reputation.
>
> Sturla - do you have a citable source for your factor of 20 figure?

I will look it up. The best thing would be to do a new benchmark though.

Another thing is that it depends on the hardware. ATLAS is not very scalable 
on multiple processors, so it will be worse on a Mac Pro than on a MacBook. 
It will also be worse with AVX than without.



Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] OS X wheels: speed versus multiprocessing

2015-04-06 Thread Sturla Molden
On 07/04/15 02:13, Sturla Molden wrote:

> Most of the test failures with OpenBLAS and Carl Kleffner's toolchain on
> Windows are due to differences between Microsoft and MinGW runtime
> libraries

... and also differences in FPU precision.


Sturla



Re: [Numpy-discussion] OS X wheels: speed versus multiprocessing

2015-04-06 Thread Sturla Molden
On 07/04/15 01:49, Nathaniel Smith wrote:

> Any opinions, objections?

Accelerate does not break multiprocessing, quite the opposite. The bug 
is in multiprocessing and has been fixed in Python 3.4.

My vote would nevertheless be for OpenBLAS if we can use it without 
producing test failures in NumPy and SciPy.

Most of the test failures with OpenBLAS and Carl Kleffner's toolchain on 
Windows are due to differences between Microsoft and MinGW runtime 
libraries and not due to OpenBLAS itself. These test failures are not 
relevant on Mac.

ATLAS can easily reduce the speed of a matrix product or a linear 
algebra call by a factor of 20 compared to Accelerate, MKL or 
OpenBLAS. It would give us bad karma.


Sturla






Re: [Numpy-discussion] IDE's for numpy development?

2015-04-06 Thread Sturla Molden
On 06/04/15 20:33, Suzen, Mehmet wrote:
> Hi Chuck,
>
> Spider is good. If you are coming from Matlab world.
>
> http://spyder-ide.blogspot.co.uk/
>
> I don't think it supports C. But Maybe you are after Eclipse.

Spyder supports C.


Sturla




Re: [Numpy-discussion] How to Force Storage Order

2015-04-01 Thread Sturla Molden
"Klemm, Michael"  wrote:

> I have found that the numpy.linalg.svd algorithm creates the resulting U,
> sigma, and V matrixes with Fortran storage.  Is there any way to force
> these kind of algorithms to not change the storage order?  That would
> make passing the matrixes to the native dgemm operation much easier.

NumPy's dot function will call cblas_dgemm in the most efficient way
regardless of storage. It knows what to do with C and Fortran arrays.
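As a quick runnable check (the array names here are made up), dot gives the same product whatever the storage order of its inputs:

```python
import numpy as np

a = np.arange(12.0).reshape(4, 3)   # C-ordered
b = np.arange(15.0).reshape(3, 5)
a_f = np.asfortranarray(a)          # same values, Fortran storage
b_f = np.asfortranarray(b)

# np.dot picks the right BLAS call for any mix of C and Fortran order,
# so no manual transpose or copy is needed before a matrix product.
print(np.allclose(np.dot(a, b), np.dot(a_f, b_f)))  # True
```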

Sturla



Re: [Numpy-discussion] IDE's for numpy development?

2015-04-01 Thread Sturla Molden
Charles R Harris  wrote:

> I'd be
> interested in information from anyone with experience in using such an IDE
> and ideas of how Numpy might make using some of the common IDEs easier.
> 
> Thoughts?

I guess we could include project files for Visual Studio (and perhaps
Eclipse?), like Python does. But then we would need to make sure the
different build systems are kept in sync, and it will be a PITA for those
who do not use Windows and Visual Studio. It is already bad enough with
Distutils and Bento. I, for one, would really prefer if there were only one
build process to care about. One should also note that a Visual Studio
project is the only supported build process for Python on Windows, so they
are not using this in addition to something else.

Eclipse is better than Visual Studio for mixed Python and C development. It
is also cross-platform.

cmake needs to be mentioned too. It is not fully integrated with Visual
Studio, but better than having multiple build processes.

But still, there is nothing that prevents the use of Visual Studio as a
glorified text editor.


Sturla



Re: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"

2015-03-11 Thread Sturla Molden
On 11/03/15 23:20, Dp Docs wrote:

> So we are supposed to integrate just one of these libraries?

As a Mac user I would be annoyed if we only supported MKL and not 
Accelerate Framework. AMD LibM should be supported too.

MKL is non-free, but we use it for BLAS and LAPACK. AMD LibM is non-free 
in a similar manner. Accelerate Framework (vecLib) is part of Apple's 
operating systems. You can abstract out the differences.

Eigen is C++. We do not use C++ in NumPy, only C, Python and some 
Cython. C++ and Fortran can be used in SciPy, but not in NumPy.


Apple's reference is here:

https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vecLib/index.html


AMD's information is here:

http://developer.amd.com/tools-and-sdks/cpu-development/libm/

(Vector math functions seem to have been moved from ACML to LibM.)


Intel's reference is here:

https://software.intel.com/sites/products/documentation/doclib/iss/2013/mkl/mklman/GUID-59EC4B87-29C8-4FB4-B57C-D269E6364954.htm



Sturla



[Numpy-discussion] SIMD programming in Cython

2015-03-11 Thread Sturla Molden

So I just learned a new trick. It is a very nice one to know about, so I 
thought I should share it:

https://groups.google.com/forum/#!msg/cython-users/nTnyI7A6sMc/a6_GnOOsLuQJ



Regards,
Sturla



Re: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"

2015-03-11 Thread Sturla Molden
There are several vector math libraries NumPy could use, e.g. MKL/VML, 
Apple Accelerate (vecLib), ACML, and probably others.

They all suffer from requiring dense arrays and specific array 
alignments, whereas NumPy arrays have very flexible strides and flexible 
alignment. NumPy also has ufuncs and gufuncs as a complicating factor.

There are at least two ways to proceed here. One is to only use vector 
math when strides and alignment allow it. The other is to build a vector 
math library specifically for NumPy arrays and (g)ufuncs. The latter you 
will most likely not be able to do in a summer.
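The first approach can be pictured as a cheap eligibility test before dispatching to a vector math routine. The helper below is purely hypothetical (not an actual NumPy API), a sketch of the kind of check such a dispatch would need:

```python
import numpy as np

def can_use_vector_math(a, alignment=16):
    # Hypothetical predicate: dense, unit-stride, aligned float64 buffer,
    # i.e. the kind of array a vendor vector math library can accept directly.
    return (a.flags['C_CONTIGUOUS']
            and a.dtype == np.float64
            and a.ctypes.data % alignment == 0)

dense = np.zeros(1024)       # freshly allocated: contiguous
strided = dense[::2]         # non-unit stride: must fall back to the ufunc loop
print(can_use_vector_math(strided))  # False
```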

You should also consider Numba and Numexpr. They have some support for 
vector math libraries too.


Sturla



On 08/03/15 21:47, Dp Docs wrote:
> Hi all,
> I am a CS 3rd Undergrad. Student from an Indian Institute (III T). I
> believe I am good in Programming languages like C/C++, Python as I
> have already done Some Projects using these language as a part of my
> academics. I really like Coding (Competitive as well as development).
> I really want to get involved in Numpy Development Project and want to
> take  "Vector math library integration" as a part of my project. I
> want to here any idea from your side for this project.
> Thanks For your time for reading this email and responding back.
>
> My IRCnickname: dp
>
> Real Name: Durgesh Pandey.
>




Re: [Numpy-discussion] Would like to patch docstring for numpy.random.normal

2015-03-10 Thread Sturla Molden
On 09/03/15 21:34, Paul Hobson wrote:
> I feel your pain. Making it worse, numpy.random.lognormal takes "mean"
> and "sigma" as input. If there's ever a backwards incompatible release,
> I hope these things will be cleared up.

The question is how...

The fix is obvious, but the consequences are unacceptable.

Sturla





Re: [Numpy-discussion] Would like to patch docstring for numpy.random.normal

2015-03-03 Thread Sturla Molden
Daniel Sank  wrote:
 
> It seems unnecessarily convoluted to name the input arguments "loc" and
> "scale", then immediately define them as the "mean" and "standard
> deviation" in the Parameters section, and then again rename them as "mu"
> and "sigma" in the written formula. I propose to simply change the argument
> names to "mean" and "sigma" to improve consistency.

Change the name of keyword arguments? 

This would be evil.



Re: [Numpy-discussion] So I found a bug...

2015-02-27 Thread Sturla Molden
On 28/02/15 00:04, Robert Kern wrote:

> When plt.imshow() is given floating point RGB images, it assumes that
> each channel is normalized to 1. You are mixing a 0..255 image with a
> 0..1 image. Divide `lenna` by 255.0 before you stack it with `_dct`. Or
> multiply `_dct` by 255 and cast it to uint8.

Right. Thanks.

Since it's past midnight this probably means I should not touch the 
computer until my brain has rebooted.






Re: [Numpy-discussion] So I found a bug...

2015-02-27 Thread Sturla Molden
On 27/02/15 23:39, Benjamin Root wrote:
> Is this another white-gold vs. blue-black dress color thing?

No. It is what you said in your next post.

I hate that dress image. The first time I looked at it, it was white and 
gold; then it became blue and black; and the third time it was grayish 
blue and bronze. I'm not going to look at it again; it might have 
exploit code to plant malware in my brain.

Sturla





>
> Ben Root
>
> On Fri, Feb 27, 2015 at 5:33 PM, Sturla Molden wrote:
>
> Somewhere... But where is it?
>
> NumPy, SciPy, Matplotlib, Cython or ipython?
>
> I am suspecting ipython, but proving it is hard...
>
> 
> http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb
>
>
> Sturla
>
>




Re: [Numpy-discussion] So I found a bug...

2015-02-27 Thread Sturla Molden
On 27/02/15 23:40, Benjamin Root wrote:

> oh...  I think I see what you are referring to. The second image should
> have the regular lenna image, not the negative?

Yeah. The ndarray references are getting messed up. It is actually quite 
serious.

Sturla


>
> Ben Root
>
> On Fri, Feb 27, 2015 at 5:39 PM, Benjamin Root wrote:
>
> It is Friday evening here... I must be really dense. What bug are we
> looking at? Is this another white-gold vs. blue-black dress color thing?
>
> Ben Root
>
>     On Fri, Feb 27, 2015 at 5:33 PM, Sturla Molden wrote:
>
> Somewhere... But where is it?
>
> NumPy, SciPy, Matplotlib, Cython or ipython?
>
> I am suspecting ipython, but proving it is hard...
>
> 
> http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb
>
>
> Sturla
>
>




[Numpy-discussion] So I found a bug...

2015-02-27 Thread Sturla Molden
Somewhere... But where is it?

NumPy, SciPy, Matplotlib, Cython or ipython?

I am suspecting ipython, but proving it is hard...

http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/12464039/lenna-bug.ipynb


Sturla



Re: [Numpy-discussion] One-byte string dtype: third time's the charm?

2015-02-22 Thread Sturla Molden
On 22/02/15 20:57, Nathaniel Smith wrote:

> This is a discussion about how strings are represented as bit-patterns
> inside ndarrays; the internal storage representation used by 'str' is
> irrelevant.

I thought it would be clever to just use the same internal 
representation as Python would choose, but obviously it is not. UTF-8 
would fail because its code points are not stored at regular intervals. And 
every string in an ndarray would need to have the same encoding, but Python 
might decide otherwise.

Sturla




Re: [Numpy-discussion] One-byte string dtype: third time's the charm?

2015-02-22 Thread Sturla Molden
On 22/02/15 21:04, Robert Kern wrote:

> Python 3's `str` type is opaque, so it can
> freely choose how to represent the data in memory. numpy dtypes
> transparently describe how the data is represented in memory.

Hm, yes, that is a good point.


Sturla



Re: [Numpy-discussion] One-byte string dtype: third time's the charm?

2015-02-22 Thread Sturla Molden
On 22/02/15 19:21, Aldcroft, Thomas wrote:

> Problems like this are now showing up in the wild [3].  Workarounds are
> also showing up, like a way to easily convert from 'S' to 'U' within
> astropy Tables [4], but this is really not a desirable way to go.
> Gigabyte-sized string data arrays are not uncommon, so converting to
> UCS-4 is a real memory and performance hit.

Why UCS-4? Python's internal "flexible string representation" will 
use ASCII for ASCII text.

By PEP 393 an application should not assume an internal string 
representation at all:

https://www.python.org/dev/peps/pep-0393/

If the problem is a PEP 393 violation in NumPy's string or unicode dtype, we 
shouldn't violate it even further by adding a latin-1 encoded string 
dtype. We should let Python represent strings as it wants, and it will 
not bloat.

I am -1 on adding latin-1 and +1 on making the unicode dtype PEP 393 
compliant if it is not. And on Python 3, 'U' and 'S' should just be synonyms.

You can also store an array of bytes with uint8. Then you can decode it 
however you like to make a Python string. If it is encoded as latin-1 
then decode it as latin-1:


In [1]: import numpy as np

In [2]: ascii_bytestr = "The quick brown fox jumps over the lazy dog".encode('latin-1')

In [3]: numpy_bytestr = np.array(memoryview(ascii_bytestr))

In [4]: numpy_bytestr.dtype, numpy_bytestr.shape
Out[4]: (dtype('uint8'), (43,))

In [5]: bytes(numpy_bytestr).decode('latin-1')
Out[5]: 'The quick brown fox jumps over the lazy dog'


Sturla



[Numpy-discussion] Nature says 'Pick up Python'

2015-02-13 Thread Sturla Molden
A recent article in Nature advises scientists to use Python, Cython and 
the SciPy stack.

http://www.nature.com/news/programming-pick-up-python-1.16833


Sturla



Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-10 Thread Sturla Molden
Chris Barker  wrote:

> Well, splitting it off is a good idea, seeing as how it hasn't gotten much
> love. But if the rest of numpy does not work well with it, then it becomes
> even less useful.

PEP 3118 takes care of that.



Sturla



Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-10 Thread Sturla Molden
Chris Barker  wrote:

>  The strongest use-case seems to be
> for teaching that involves linear algebra concepts, not real production
> code.

Not really. SymPy is a better teaching tool.

Some find A*B easier to read than dot(A,B). But with the @ operator in
Python 3.5 it does not have a use case at all.
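For reference, with PEP 465 in Python 3.5 the infix form is just another spelling of the dot product (small made-up matrices):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# A @ B calls __matmul__ and gives the same result as np.dot(A, B) here.
print(A @ B)   # [[19. 22.]
               #  [43. 50.]]
```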


Sturla



Re: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)

2015-02-09 Thread Sturla Molden
Two quick comments:
- You need MSYS or Cygwin to build OpenBLAS. MSYS has uname and perl. Carl
probably used MSYS.
- BLAS and LAPACK are Fortran libs, hence there are no header files. NumPy
and SciPy include their own cblas headers.

Sturla


Olivier Grisel  wrote:
> Hi Carl,
> 
> Could you please provide some details on how you used your
> mingw-static toolchain to build OpenBLAS, numpy & scipy? I would like
> to replicate but apparently the default Makefile in the openblas
> projects expects unix commands such as `uname` and `perl` that are not
> part of your archive. Did you compile those utilities from source or
> did you use another distribution of mingw with additional tools such
> as MSYS?
> 
> For numpy and scipy, besides applying your patches, did you configure
> anything in site.cfg? I understand that you put the libopenblas.dll in
> the numpy/core folder but where do you put the BLAS / LAPACK header
> files?
> 
> I would like to help automating that build in some CI environment
> (with either Windows or Linux + wine) but I am affraid that I am not
> familiar enough with the windows build of numpy & scipy to get it
> working all by myself.



Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-09 Thread Sturla Molden
Chris Barker  wrote:

> Do you realize that:
> 
> arr = np.ones((5,))
> 
> ar2 = arr * 5
> 
> is broadcasting, too?

Perhaps we should only warn for a subset of broadcastings? E.g. avoid the
warning on scalar times array. 

I prefer we don't warn about this though, because it might be interpreted
as if broadcasting is "undesired". A warning means something bad, right?
There are just two things that can come out of this: first, some stupid
package author will turn the warnings on and cause warning mayhem everywhere;
second, some stupid manager will decide that the NumPy code should be free
of broadcasts, and then NumPy is crippled for its developers.

Sturla



Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-09 Thread Sturla Molden
On 09/02/15 08:34, Stefan Reiterer wrote:

> So maybe the better way would be not to add warnings to braodcasting
> operations, but to overhaul the matrix class
> to make it more attractive for numerical linear algebra(?)

I think you underestimate the amount of programming this would take. 
Take an archetypal linear algebra feature of Matlab: the backslash operator.

If you have an expression like a \ b it is evaluated as solve(a,b). But 
how should you solve this? LU, QR, Cholesky, SVD? Exact? Least-squares? 
Surely that depends on the context and the array.

Matlab's backslash operators amounts to about 100,000 LOC.

Let's say we add attributes to the Matrix class to symbolically express 
things like inversion, symmetric matrix, triangular matrix, tridiagonal 
matrix, etc.

Say that you know a is symmetric and positive definite and b 
rectangular. How would you solve this most efficiently with BLAS and LAPACK?

(a**-1) * b

In SciPy notation this is cho_solve(cho_factor(A),B). But what if you 
know that B has a certain structure?

And what if the arrays are in C order instead of Fortran order? Is there 
a way to avoid copy-and-transpose? NumPy's dot operator does this, but 
LA solvers should too. Put that on top of kernels for evaluating any 
kind of (a**-1) * b expression.
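Spelled out with small made-up arrays, the SciPy route for a symmetric positive definite `a` looks like this:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
m = rng.standard_normal((4, 4))
a = m @ m.T + 4.0 * np.eye(4)     # symmetric positive definite by construction
b = rng.standard_normal((4, 2))   # rectangular right-hand side

# Cholesky factorization once, then cheap triangular solves for a @ x = b.
x = cho_solve(cho_factor(a), b)
print(np.allclose(a @ x, b))      # True
```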

But what if the orders are reversed?

b * (a**-1)

Those will also need special casings.

A LA package would need specialized code for all of these thinkable 
combinations, and pick the best one on the fly. That is what Matlab's 
backslash operator can do and NumPy cannot. But it is not difficult if you 
want to sit down and program it. It would just take you about 100,000 to 
half a million LOC. And chances are your effort will not even be appreciated.


Sturla





Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-08 Thread Sturla Molden
On 08/02/15 23:17, Stefan Reiterer wrote:

> Actually I use numpy for several years now, and I love it.
> The reason that I think silent broadcasting of sums is bad
> comes simply from the fact, that I had more trouble with it, than it
> helped me.

In Fortran 90, broadcasting allows us to write concise expressions which 
are compiled into very efficient machine code. This would be very 
difficult for the compiler if the code used explicit repeats and 
reshapes instead of broadcasting.

In NumPy and Matlab, this aspect is not equally important. But 
broadcasting makes array expressions more terse, and it saves some 
redundant array copies. NumPy code tends to be memory bound; 
broadcasting can therefore improve performance, particularly when arrays 
are large. But the effect is not as dramatic as it is in Fortran.
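A small comparison (shapes chosen arbitrarily): broadcasting matches an explicit Matlab-style repmat without materializing the repeated array:

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)
row = np.arange(4.0)

via_broadcast = a + row              # row is virtually stretched to (3, 4)
via_tile = a + np.tile(row, (3, 1))  # explicit copy, like Matlab's repmat

print(np.array_equal(via_broadcast, via_tile))  # True
```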

Readability is also important. Vectorized Matlab code quickly 
degenerates into a mess of reshapes and repmats, and can be very hard to 
read. NumPy and Fortran 90 code does not lose readability like this 
even in the most complicated array expressions.


Sturla



Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-08 Thread Sturla Molden
Simon Wood  wrote:

> Not quite the same. This is not so much about language semantics as
> mathematical definitions. You (the Numpy community) have decided to
> overload certain mathematical operators to act in a way that is not
> consistent with linear algebra teachings.

We have overloaded the operators to make them work like Fortran 90. That is
about as hardcore numerical computing semantics as you get. 


> This can be a bit confusing for
> people who develop and implement mathematical algorithms that have a strong
> foundation in linear algebra, irrespective of the language they are
> migrating from.

Those who develop such algorithms are probably going to use BLAS and
LAPACK. Then by deduction they know about Fortran, and then by deduction
they know about array broadcasting.


Sturla



Re: [Numpy-discussion] Silent Broadcasting considered harmful

2015-02-08 Thread Sturla Molden
Matthew Brett  wrote:

> I agree.  I knew about broadcasting as soon as I started using numpy,
> so I can honestly say this has never surprised me.

Fortran 90 has broadcasting too. NumPy's broadcasting was inspired by
Fortran 90, which was the lingua franca of scientific computing in the
1990s. Like NumPy, Fortran 90 is an array language, not a matrix language
like Matlab.

> There are other major incompatibilities between Matlab and numpy, such
> as 0-based indices, and array views.

Yes. NumPy is not Matlab and not intending to clone Matlab. Those who want
Matlab know where to find it. Those who need a free Matlab clone can
download Scilab.

Sturla



Re: [Numpy-discussion] Any interest in a 'heaviside' ufunc?

2015-02-04 Thread Sturla Molden
On 04/02/15 06:18, Warren Weckesser wrote:

> By "discrete form", do you mean discrete time (i.e. a function defined
> on the integers)?  Then I agree, the discrete time unit step function is
> defined as

It is the cumulative integral of the delta function, and thus it can 
never take the value 0.5: the cumulative integral of the delta function 
is either 0 or 1.

Sturla





Re: [Numpy-discussion] Any interest in a 'heaviside' ufunc?

2015-02-03 Thread Sturla Molden
Warren Weckesser  wrote:

>                 0    if x < 0
> heaviside(x) =  0.5  if x == 0
>                 1    if x > 0
> 

This is not correct. The discrete form of the Heaviside step function has
the value 1 for x == 0.

heaviside = lambda x : 1 - (x < 0).astype(int)
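As a runnable version of that one-liner, vectorized over an array (the input values are arbitrary):

```python
import numpy as np

# Discrete-form Heaviside: 1 for x >= 0, including x == 0; 0 otherwise.
heaviside = lambda x: 1 - (np.asarray(x) < 0).astype(int)

print(heaviside(np.array([-2.0, 0.0, 3.0])))  # [0 1 1]
```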



Sturla



Re: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)

2015-01-27 Thread Sturla Molden
On 27/01/15 11:32, Carl Kleffner wrote:

> OpenBLAS in the test wheels is built with DYNAMIC_ARCH, that is all
> assembler based kernels are included and are chosen at runtime.

Ok, I wasn't aware of that option. Last time I built OpenBLAS I think I 
had to specify the target CPU.

> Non-optimized parts of LAPACK have been built with -march=sse2.

Since LAPACK delegates almost all of its heavy lifting to BLAS, there is 
probably not a lot to gain from SSE3, SSE4 or AVX here.


Sturla



Re: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)

2015-01-26 Thread Sturla Molden
On 26/01/15 16:30, Carl Kleffner wrote:

> Thanks for all your ideas. The next version will contain an augmented
> libopenblas.dll in both numpy and scipy. In the long term I would
> prefer an external OpenBLAS wheel package, if there is an agreement
> about this among numpy-dev.


Thanks for all your great work on this.

An OpenBLAS wheel might be a good idea. Probably we should have some 
sort of instruction on the website how to install the binary wheel. And 
then we could include the OpenBLAS wheel in the instruction. Or we could 
have the OpenBLAS wheel as a part of the scipy stack.

But make the bloated SciPy and NumPy wheels work first, then we can 
worry about a dedicated OpenBLAS wheel later :-)


> Another idea for the future is to conditionally load a debug version of
> libopenblas instead. Together with backtrace.dll (part of
> mingwstatic, but undocumented right now) a meaningful stacktrace in
> case of segfaults inside code compiled with mingwstatic will be given.

An OpenBLAS wheel could also include multiple architectures. We can 
compile OpenBLAS for any kind of CPU and install the one that fits 
best with the computer.

Also note that an OpenBLAS wheel could be useful on Linux. It is clearly 
superior to the ATLAS libraries that most distros ship. If we make a 
binary wheel that works for Windows, we are almost there for Linux too :-)

For Apple we don't need OpenBLAS anymore. On OSX 10.9 and 10.10 
Accelerate Framework is actually faster than MKL under many 
circumstances. DGEMM is about the same, but e.g. DAXPY and DDOT are 
faster in Accelerate.


Sturla


Re: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)

2015-01-25 Thread Sturla Molden
On 25/01/15 22:15, Matthew Brett wrote:

> I agree, that shipping openblas with both numpy and scipy seems
> perfectly reasonable to me - I don't think anyone will much care about
> the 30M, and I think our job is to make something that works with the
> least complexity and likelihood of error.

Yes. Make something that works first, optimize for space later.


> It would be good to rename the dll according to the package and
> version though, to avoid a scipy binary using a pre-loaded but
> incompatible 'libopenblas.dll'.   Say something like
> openblas-scipy-0.15.1.dll - on the basis that there can only be one
> copy of scipy loaded at a time.

That is a good idea and we should do this for NumPy too I think.



Sturla




Re: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)

2015-01-25 Thread Sturla Molden
Carl Kleffner  wrote:
 
> I very much prefer dynamic linking to numpy\core\libopenblas.dll instead of
> static linking to avoid bloat. This matters, because libopenblas.dll is a
> heavy library (around 30Mb for amd64). As a consequence all packages with
> dynamic linkage to OpenBLAS depend on numpy-openblas. This is no different
> from scipy-MKL, which has a dependency on numpy-MKL - see C. Gohlke's site.

It is probably ok if we name the OpenBLAS DLL something else than
libopenblas.dll. We could e.g. add to the filename a combined hash for
NumPy version, CPU, OpenBLAS version, Python version, C compiler, platform,
build number, etc.


Sturla



Re: [Numpy-discussion] new mingw-w64 based numpy and scipy wheel (still experimental)

2015-01-22 Thread Sturla Molden
Were there any failures with the 64 bit build, or did all tests pass?

Sturla


On 22/01/15 22:29, Carl Kleffner wrote:
> I took time to create mingw-w64 based wheels of numpy-1.9.1 and
> scipy-0.15.1 source distributions and put them on
> https://bitbucket.org/carlkl/mingw-w64-for-python/downloads as well as
> on binstar.org . The test matrix is python-2.7 and
> 3.4 for both 32bit and 64bit.
>
> Feedback is welcome.
>
> The wheels can be pip installed with:
>
> pip install -i https://pypi.binstar.org/carlkl/simple numpy
> pip install -i https://pypi.binstar.org/carlkl/simple scipy
>
> Some technical details: the binaries are build upon OpenBLAS as
> accelerated BLAS/Lapack. OpenBLAS itself is build with dynamic kernels
> (similar to MKL) and automatic runtime selection depending on the CPU.
> The minimal requested feature supplied by the CPU is SSE2. SSE1 and
> non-SSE CPUs are not supported with this builds. This is the default for
> 64bit binaries anyway.
>
> OpenBLAS is deployed as part of the numpy wheel. That said, the scipy
> wheels mentioned above are dependant on the installation of the OpenBLAS
> based numpy and won't work i.e. with an installed  numpy-MKL.
>
> For the numpy 32bit builds there are 3 failures for special FP value
> tests, due to a bug in mingw-w64 that is still present. All scipy
> versions show up 7 failures with some numerical noise, that could be
> ignored (or corrected with relaxed asserts in the test code).
>
> PR's for numpy and scipy are in preparation. The mingw-w64 compiler used
> for building can be found at
> https://bitbucket.org/carlkl/mingw-w64-for-python/downloads.
>
>
>




Re: [Numpy-discussion] Aligned array allocations

2015-01-22 Thread Sturla Molden
Antoine Pitrou  wrote:

> By always using an aligned allocator there is some overhead:
> - all arrays occupy a bit more memory by a small average amount
>   (probably 16 bytes average on a 64-bit machine, for a 16 byte
>   guaranteed alignment)

NumPy arrays are Python objects. They have an overhead anyway, much more
than this, and 16 bytes are not worse than adding a couple of pointers to
the struct. In the big picture this tiny overhead does not matter.

> - array resizes can be more expensive in CPU time, when the physical
>   start changes and its alignment changes too

We are using Python. If we were worried about small inefficiencies we would
not be using it. Resizing ndarrays is rare anyway. They are not used like
Python lists or instead of lists. We use lists in the same way as anyone
else who uses Python. So an ndarray resize can afford to be more expensive
than a list append.

Also the NumPy community expects an ndarray resize to be expensive and O(n)
due to its current behavior: If an array has a view, realloc is out of the
question.

:-) 
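The view restriction is easy to demonstrate; a minimal sketch (current NumPy raises ValueError when a view is alive):

```python
import numpy as np

a = np.zeros(4)
v = a[:2]                 # a view into a pins the data buffer
try:
    a.resize(8)           # would need realloc; refused while the view exists
    msg = "resize succeeded"
except ValueError as e:
    msg = str(e)
print(msg)
```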

Sturla



Re: [Numpy-discussion] Characteristic of a Matrix.

2015-01-06 Thread Sturla Molden
On 06/01/15 02:08, cjw wrote:

> This is not a comment on any present matrix support, but deals with the
> matrix class, which existed back when Todd Miller of the Space Telescope
> Group supported numpy.
>
> Matrix is a sub-class of ndarray.

Since this Matrix class is (more or less) deprecated and its use 
discouraged, I think it should just be left as it is.

Sturla



Re: [Numpy-discussion] The future of ndarray.diagonal()

2015-01-05 Thread Sturla Molden
Konrad Hinsen  wrote:

> Scientific communication depends more and more on scripts as the only 
> precise documentation of a computational method. Our programming 
> languages are becoming a major form of scientific notation, alongside 
> traditional mathematics.

To me it seems that algorithms in scientific papers and books are described
in various forms of pseudo-code. Perhaps we need a notation which is
universal and eternal, like the language of mathematics. But I am not sure
Python could or should try to be that "scripting" language.

I also think it is reasonable to ask if journals should require code as
algorithmic documentation to be written in some ISO standard language like
C or Fortran 90. The behavior of Python and NumPy are not dictated by
standards, and as such is not better than pseudo-code.

Sturla



Re: [Numpy-discussion] The future of ndarray.diagonal()

2015-01-04 Thread Sturla Molden
On 03/01/15 20:49, Nathaniel Smith wrote:

> i.e., slow-incremental-change has actually worked well in his
> experience. (And in particular, the np.diagonal issue only comes in as
> an example to illustrate what he means by the phrase "slow continuous
> change" -- this particular change hasn't actually broken anything in his
> code.) OTOH the big problem that motivated his post was that his code is
> all written against the APIs of the ancient and long-abandoned Numeric
> project, and he finds the costs of transitioning them to the "new" numpy
> APIs to be prohibitively expensive, i.e. this big-bang transition broke
> his code.

Given that a big-bang transition broke his code everywhere, I don't 
really see why he wants more of them.

The question of reproducible research is orthogonal to this, I think.


Sturla











Re: [Numpy-discussion] The future of ndarray.diagonal()

2015-01-04 Thread Sturla Molden
On 04/01/15 17:22, Konrad Hinsen wrote:

> There are two different scenarios to consider here, and perhaps I didn't
> make that distinction clear enough. One scenario is that of a maintained
> library or application that depends on NumPy. The other scenario is a
> set of scripts written for a specific study (let's say a thesis) that is
> then published for its documentation value. Such scripts are in general
> not maintained.

> It's the second scenario where gradual changes are a real problem.

> Suppose I have a set of scripts from a thesis published in year X, and I
> need to understand them in detail in year X+5 for a related scientific
> project. If the script produces different results with NumPy 1.7 and
> NumPy 1.10, which result should I assume the author intended?



A scientific paper or thesis should be written so it is completely 
reproducible. That would include describing the computer, OS, Python 
version and NumPy version, as well as C or Fortran compiler.

I will happily fail any student who writes a thesis without providing 
such details, and if I review a research paper for a journal you can be
sure I will ask that it is corrected.


Sturla





Re: [Numpy-discussion] diag, diagonal, ravel and all that

2015-01-03 Thread Sturla Molden
On 03/01/15 03:04, Charles R Harris wrote:

> The diag, diagonal, and ravel functions have recently been changed to
> preserve subtypes. However, this causes lots of backward compatibility
> problems for matrix users, in particular, scipy.sparse. One possibility
> for fixing this is to special case matrix so that these functions
> continue to return 1-d arrays for matrix instances. This is kind of ugly
> as `a.ravel()` will still return a matrix when a is a matrix, an ugly
> inconsistency. This may be a case where practicality beats beauty.
>
> Thoughts?

What about fixing scipy.sparse?


Sturla




Re: [Numpy-discussion] Correct C string handling in the NumPy C API?

2015-01-03 Thread Sturla Molden
Sturla Molden  wrote:

> Thanks. That explains it.

20 years after learning C I still discover new things... 

On the other hand, Fortran is Fortran, and seems to be free of these
gotchas... 

Python is better as well.

I hate to say it but C++ would also be less confusing here. I would just
pass in a reference to a std::string and assign to it, which I know is
safe...

On the other hand, implicit static storage of string literals -- which may
or may not be modifiable depending on compiler? Not so obvious without
reading all the small details...

Sturla



Re: [Numpy-discussion] Correct C string handling in the NumPy C API?

2015-01-03 Thread Sturla Molden
Nathaniel Smith  wrote:

> No, this code is safe (fortunately!). C string literals have "static
> storage" (see paragraph 6.4.5.5 in C99), which means that their
> lifetime is the same as the lifetime of a 'static char[]'. They aren't
> stack allocated.

Thanks. That explains it.


Sturla



[Numpy-discussion] Correct C string handling in the NumPy C API?

2015-01-03 Thread Sturla Molden
Here is an example:

NPY_NO_EXPORT NpyIter_IterNextFunc *
NpyIter_GetIterNext(NpyIter *iter, char **errmsg)
{
    npy_uint32 itflags = NIT_ITFLAGS(iter);
    int ndim = NIT_NDIM(iter);
    int nop = NIT_NOP(iter);

    if (NIT_ITERSIZE(iter) < 0) {
        if (errmsg == NULL) {
            PyErr_SetString(PyExc_ValueError, "iterator is too large");
        }
        else {
            *errmsg = "iterator is too large";
        }
        return NULL;
    }


After NpyIter_GetIterNext returns, *errmsg points to a variable local to
the called function.

Either I am wrong about C, or this code has undefined behavior...

My gut feeling is that

*errmsg = "iterator is too large";

puts the string "iterator is too large" on the stack and points *errmsg 
to the string.

Shouldn't this really be

strcpy(*errmsg, "iterator is too large");

and then *errmsg should point to a char buffer allocated before 
NpyIter_GetIterNext is called?

Or will the statement

*errmsg = "iterator is too large";

put the string on the stack in the calling C function?

Before I open an issue I will ask if my understanding of C is correct or 
not.

I am a bit confused here...

Regards,
Sturla



















Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-02 Thread Sturla Molden
Yuxiang Wang  wrote:

> 1) @Strula Sorry about my stupid mistake! That piece of code totally
> gave away how green I am in coding C :)

Don't worry. C is a high-level assembler. It will bite you again and again;
it happens to everyone. Those who say they have never made a stupid mistake
while coding in C are lying.

Sturla



Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-02 Thread Sturla Molden
Yuxiang Wang  wrote:

> 4) I wanted to say that it seems to me, as the project gradually
> scales up, Cython is easier to deal with, especially when I am using a
> lot of numpy arrays. If it is even higher dimensional data, it would
> be verbose while it is really succinct to use Cython.

The easiest way to speed up NumPy code is to use Numba which is an LLVM
based JIT compiler. Numba will often give you performance comparable to C
for free. All you have to do is to add the @numba.jit decorator to your
Python function and possibly include some type hints. If all you want is to
speed up NumPy code, just use Numba and it will take care of it in at least
9 of 10 cases.

Numexpr is also a JIT compiler which can speed up Numpy code, but it does
not give as dramatic results as Numba.

Cython is easier to work with than ctypes, particularly when the problem
scales up. If you use typed memoryviews in Cython you can also avoid having
to work with pointer arithmetic. Cython is mainly a competitor to using
the Python C API manually for C extension modules to Python. Cython also
allows you to wrap external C and C++ code, and e.g. use Python and C++
objects together. The drawback is that you need to learn the Cython
language as well as Python and C and C++ and know how they differ. Cython
also has many of the same hidden dangers as C++, due to the possibility of
exceptions being raised between C statements. But because Cython is the
most flexible tool for writing C extensions to Python you will in the long
run do yourself a favor by learning to use it.

ctypes is good when you have a DLL, possibly form a foreign source, and you
just want to use it without any build step. CFFI is easier to work with
than ctypes and has the same usecase. It can parse C headers and does not
require you to define the C API with Python statements like ctypes do.
Generally I would say it is always better to use CFFI than ctypes. ctypes is
also currently an orphan, albeit in the Python standard library, while CFFI
is actively maintained.

Numba will also JIT compile ctypes and CFFI calls to remove the extra
overhead. This is good to know if you need to call a C function in a tight
loop. In that case Numba can JIT compile away the Python as well as the
ctypes/CFFI overhead.

Fortran 90/95 is also underrated. It is easier to work with than C, and
gives similar results performance wise. You can call Fortran with f2py,
ctypes, CFFI, or Cython (use fwrap). Generally I would say that it is
better for a student to learn C than Fortran if you have to choose, because
C is also useful for other things than numerical computing. But if you want
fast and robust numerical code, it is easier to get good results with
Fortran than C or Cython.

Sturla



Re: [Numpy-discussion] npymath on Windows

2015-01-01 Thread Sturla Molden
On 28/12/14 17:17, David Cournapeau wrote:

> This is not really supported. You should avoid mixing compilers when
> building C extensions using numpy C API. Either all mingw, or all MSVC.

That is not really good enough. Even if we build binary wheels with
MinGW (see link) the binary npymath library should be usable from MSVC.

https://github.com/numpy/numpy/pull/5328


Sturla




Re: [Numpy-discussion] npymath on Windows

2015-01-01 Thread Sturla Molden
On 28/12/14 01:59, Matthew Brett wrote:

> As far as I can see, 'acosf' is defined in the msvc runtime library.
> I guess that '_acosf' is defined in some mingw runtime library?

AFAIK it is a GCC built-in function. When the GCC compiler or linker 
sees it the binary code will be inlined.

https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Other-Builtins.html

Sturla



Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-01 Thread Sturla Molden
On 01/01/15 19:56, Nathaniel Smith wrote:

> However, I suspect that this question can't really be answered in a
> useful way without more information about why exactly the C code wants
> a **double (instead of a *double) and what it expects to do with it.

> E.g., is it going to throw away the passed in array and return a new
> one?

That is an important question.

The solution I provided only allows a 2D array to be passed in and 
possibly modified inplace. It does not allow the C function pass back a 
freshly allocated array.

The problem is of course that the meaning of double** is ambiguous. It 
could mean a pointer to an array of pointers. But it could also mean a 
double* passed by reference, in which case the function would modify the 
pointer instead of the data it points to.

Sturla



Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-01 Thread Sturla Molden
On 01/01/15 20:25, Yuxiang Wang wrote:

> #include <stdlib.h>
>
> __declspec(dllexport) void foobar(const int m, const int n, const
> double **x, double **y)
> {
>     size_t i, j;
>     y = (double **)malloc(sizeof(double *) * m);
>     for(i=0; i<m; i++)
>         y[i] = (double *)calloc(sizeof(double), n);
>     for(i=0; i<m; i++)
>         for(j=0; j<n; j++)
>             y[i][j] = x[i][j];
> }

> Was I doing something wrong here?

You are not getting the data back because of the malloc/calloc 
statements. The numpy array y after calling _foobar is still pointing to 
its original buffer, not the new memory you allocated. You just created 
a memory leak. Try this instead:

__declspec(dllexport) void foobar(const int m, const int n, const
double **x, double **y)
{
    size_t i, j;
    for(i=0; i<m; i++)
        for(j=0; j<n; j++)
            y[i][j] = x[i][j];
}


Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-01 Thread Sturla Molden
On 01/01/15 19:00, Yuxiang Wang wrote:

> Could anyone please give any thoughts to help?

Say you want to call "void foobar(int m, int n, double **x)"
from dummy.so or dummy.dll with ctypes. Here is a fully worked
out example (not tested, but it will work unless I made a typo):


import numpy as np
from numpy.ctypeslib import ndpointer
import ctypes

__all__ = ['foobar']

_doublepp = ndpointer(dtype=np.uintp, ndim=1, flags='C_CONTIGUOUS')

_dll = ctypes.CDLL('dummy.so') # or dummy.dll

_foobar = _dll.foobar
_foobar.argtypes = [ctypes.c_int, ctypes.c_int, _doublepp]
_foobar.restype = None

def foobar(x):
    assert(x.ndim == 2)
    assert(x.flags['C_CONTIGUOUS'])
    xpp = (x.__array_interface__['data'][0]
           + np.arange(x.shape[0])*x.strides[0]).astype(np.uintp)
    m = ctypes.c_int(x.shape[0])
    n = ctypes.c_int(x.shape[1])
    _foobar(m, n, xpp)



Sturla




Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-01 Thread Sturla Molden
On 01/01/15 19:30, Sturla Molden wrote:
> You can pretend double** is an array of dtype np.intp. This is because
> on all modern systems, double** has the size of void*, and np.intp is an
> integer with the size of void* (np.intp maps to Py_intptr_t).

Well, it also requires that the user space is the lower half of the 
address space, which is usually true.

But to be safe against this you should use np.uintp instead of np.intp:

xpp = (x.__array_interface__['data'][0]
 + np.arange(x.shape[0])*x.strides[0]).astype(np.uintp)

doublepp = np.ctypeslib.ndpointer(dtype=np.uintp)



Sturla



Re: [Numpy-discussion] Pass 2d ndarray into C **double using ctypes

2015-01-01 Thread Sturla Molden
You can pretend double** is an array of dtype np.intp. This is because 
on all modern systems, double** has the size of void*, and np.intp is an 
integer with the size of void* (np.intp maps to Py_intptr_t).

Now you just need to fill in the adresses. If you have a 2d ndarray in C 
order, or at least one which is contiguous along the second dimension, 
you set up the array of double* like so:

xpp = (x.__array_interface__['data'][0]
+ np.arange(x.shape[0])*x.strides[0]).astype(np.intp)

(The last cast to np.intp is probably only required on Windows 64 with 
older NumPy versions, but it never hurts.)

Next, make a dtype that corresponds to this Py_intptr_t array:

doublepp = np.ctypeslib.ndpointer(dtype=np.intp)

Declare your function with doublepp instead of ndpointer with dtype 
np.double, and pass xpp instead of the 2d array x.
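The pointer-array trick can be checked without any DLL at all, by dereferencing the row addresses through ctypes (a minimal sketch, using np.uintp for safety; the variable names are illustrative):

```python
import ctypes
import numpy as np

x = np.arange(6, dtype=np.float64).reshape(2, 3)

# array of row addresses -- this is what a C function sees through double**
xpp = (x.__array_interface__['data'][0]
       + np.arange(x.shape[0]) * x.strides[0]).astype(np.uintp)

# sanity check: dereference the second row pointer via ctypes
row1 = ctypes.cast(int(xpp[1]), ctypes.POINTER(ctypes.c_double))
print(row1[0], row1[2])  # 3.0 5.0
```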


Sturla



On 01/01/15 19:00, Yuxiang Wang wrote:
> Dear all,
>
> I am currently using a piece of C code, where one of the input
> argument of a function is **double.
>
> So, in numpy, I tried np.ctypeslib.ndpointer(ctypes.c_double), but
> obviously this wouldn't work because this is only *double, not
> **double.
>
> Then I tried np.ctypeslib.ndpointer(np.ctypeslib.ndpointer(ctypes.c_double)),
> but this didn't work either because it says "ArgumentError: argument
> 4: : array must have data type uint64
> ".
>
> np.ctypeslib.ndpointer(ctypes.c_double, ndim=2) wound't work too,
> because **double is not the same with *double[].
>
> Could anyone please give any thoughts to help?
>
> Thanks,
>
> Shawn
>




Re: [Numpy-discussion] Fast sizes for FFT

2014-12-24 Thread Sturla Molden
On 24/12/14 14:34, Sturla Molden wrote:
> I would rather have SciPy implement this with the overlap-and-add method
> rather than padding the FFT. Overlap-and-add is more memory efficient
> for large n:

(eh, the list should be)


- Overlap-and-add is more memory efficient for large n.

- It scales as O(n) instead of O(n log n).

- For short FIR filters overlap-and-add also allows us to use small
radix-2 FFTs.

- Small FFT size also means that we can use a small Winograd FFT instead 
of Cooley-Tukey FFT, which reduces the number of floating point
multiplications.

- A small look-up table is also preferable as it can be kept in cache.

- Overlap-and-add is also trivial to compute in parallel. This comes at
the expense of using more memory, but it never requires more memory than 
just taking a long FFT.
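A minimal NumPy sketch of overlap-and-add FIR filtering with small radix-2 FFTs (illustrative only; the block size and the helper name `overlap_add` are arbitrary, and this is not SciPy's implementation):

```python
import numpy as np

def overlap_add(x, h, block=256):
    """Filter x with FIR h by overlap-and-add, using small radix-2 FFTs."""
    m = len(h)
    nfft = 1
    while nfft < block + m - 1:
        nfft *= 2                       # smallest power of two without wrap-around
    H = np.fft.rfft(h, nfft)
    y = np.zeros(len(x) + m - 1)
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        conv = np.fft.irfft(np.fft.rfft(seg, nfft) * H, nfft)
        # each segment's linear convolution overlaps the next by m - 1 samples
        y[start:start + len(seg) + m - 1] += conv[:len(seg) + m - 1]
    return y

rng = np.random.default_rng(0)
x, h = rng.standard_normal(1000), rng.standard_normal(31)
print(np.allclose(overlap_add(x, h), np.convolve(x, h)))  # True
```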



Sturla



Re: [Numpy-discussion] Fast sizes for FFT

2014-12-24 Thread Sturla Molden
On 24/12/14 04:33, Robert McGibbon wrote:

> Alex Griffing pointed out on github that this feature was recently added
> to scipy in https://github.com/scipy/scipy/pull/3144. Sweet!

I would rather have SciPy implement this with the overlap-and-add method 
rather than padding the FFT. Overlap-and-add is more memory efficient 
for large n:

- It scales as O(n) instead of O(n log n).

- For short FIR filters overlap-and-add also allows us to use small 
radix-2 FFTs.

- Small FFT size also means that we can use a small Winograd FFT instead 
of Cooley-Tukey FFT, which reduces the number of floating point 
multiplications.

- A small look-up table is also preferable as it can be kept in cache.

- Overlap-and-add is also trivial to compute in parallel. This comes at 
the expense of using more memory, but it never requires more memory than 
just taking a long FFT.


This is also interesting:

https://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/



Sturla







Re: [Numpy-discussion] Fast sizes for FFT

2014-12-24 Thread Sturla Molden
On 24/12/14 13:23, Julian Taylor wrote:

> hm this is a brute force search, probably fast enough but slower than
> scipy's code (if it also were cython)

That was what I thought as well when I wrote it. But it turned out that
regular numbers are so close and abundant that it was damn fast, even in
Python :)

> I also ported it to C a while back, so that could be used for numpy if
> speed is an issue. Code attached.

Very nice :)


Sturla



Re: [Numpy-discussion] Fast sizes for FFT

2014-12-24 Thread Sturla Molden
On 24/12/14 13:07, Sturla Molden wrote:

>
> cdef intp_t checksize(intp_t n):
>   while not (n % 5): n /= 5
>   while not (n % 3): n /= 3
>   while not (n % 2): n /= 2
>   return (1 if n == 1 else 0)
>
> def _next_regular(target):
>   cdef intp_t n = target
>   while not checksize(n):
>   n += 1
>   return n


Blah, old code, with current Cython this should be:


from numpy cimport intp_t
cimport cython

@cython.cdivision(True)
cdef intp_t checksize(intp_t n):
  while not (n % 5): n //= 5
  while not (n % 3): n //= 3
  while not (n % 2): n //= 2
  return (1 if n == 1 else 0)

def _next_regular(target):
  cdef intp_t n = target
  while not checksize(n):
  n += 1
  return n
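For readers without Cython at hand, the same brute-force search can be sketched in plain Python (the helper name `next_regular` is illustrative):

```python
def next_regular(target):
    """Smallest regular (5-smooth) number >= target -- plain-Python
    version of the Cython brute-force search above."""
    n = target
    while True:
        m = n
        for p in (2, 3, 5):
            while m % p == 0:
                m //= p
        if m == 1:          # only factors 2, 3 and 5 remained
            return n
        n += 1

print(next_regular(1025))  # 1080
```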



Sturla







Re: [Numpy-discussion] Fast sizes for FFT

2014-12-24 Thread Sturla Molden
On 24/12/14 04:33, Robert McGibbon wrote:
> Alex Griffing pointed out on github that this feature was recently added
> to scipy in https://github.com/scipy/scipy/pull/3144. Sweet!


I use a different padsize search than the one in SciPy. It would be
interesting to see which is faster.


from numpy cimport intp_t

cdef intp_t checksize(intp_t n):
 while not (n % 5): n /= 5
 while not (n % 3): n /= 3
 while not (n % 2): n /= 2
 return (1 if n == 1 else 0)

def _next_regular(target):
 cdef intp_t n = target
 while not checksize(n):
 n += 1
 return n



Sturla




Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-12 Thread Sturla Molden
Chris Barker  wrote:

> Anyway, the point is that if we wanted this to be a
> used-more-than-very-rarely in only very special cases feature de-allocating
> an array's data buffer), then ndarray would need to grow a check for an
> invalid buffer on access.

One would probably need something like Java JNI where an array can be
"locked" by the C code. 

But nevertheless, what I suggested is not inherently worse than this C++
code:

{
    std::vector<double> a(n);
    // do something with a
}
// references to std::vector a might still exist

In C this is often known as the "dangling pointer" problem.


Sturla



Re: [Numpy-discussion] error: no matching function for call to 'PyArray_DATA'

2014-12-11 Thread Sturla Molden
Jack Howarth  wrote:

> What is the correct coding to eliminate this error? I have found some
> threads which seems to suggest that PyArray_DATA is still available in
> numpy 1.9 as an inline but I haven't found any examples of projects
> patching their code to convert to that usage.

In the deprecated API, PyArray_DATA is a macro. In the new API,
PyArray_DATA is an inline function. While almost identical from the
perspective of user code, the inline function has an argument with a type.
Judging from the error message, the problem is that your NumPy array is
represented by PyObject* instead of PyArrayObject*, and your C compiler
says it cannot do the conversion automatically. The old macro version does
the typecast, so it does not matter if you give it PyObject*. Solution? Use
PyArrayObject* consistently in your code or typecast the PyObject* pointer
on each function call to PyArray_DATA.

Sturla



Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-10 Thread Sturla Molden
Chris Barker  wrote:

> I haven't managed to trigger a segfault yet but it sure looks like I
> could...

You can also trigger random errors. If the array is small, Python's memory
manager might keep the memory in the heap for reuse by PyMem_Malloc. And
then you can actually modify some random Python object with bogus bits. It
can screw things up even if you do not see a segfault.

Sturla



Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-09 Thread Sturla Molden
Nathaniel Smith  wrote:

> This should be pretty trivial to implement. AFAICT you don't need any
> complicated cython

I have a bad habit of thinking in terms of too complicated C instead of
just using NumPy.


> @contextmanager
> def tmp_zeros(*args, **kwargs):
>     arr = np.zeros(*args, **kwargs)
>     try:
>         yield arr
>     finally:
>         arr.resize((0,), refcheck=False)
> 
> Given how intrinsically dangerous this is, and how easily it can be
> implemented using numpy's existing public API, I think maybe we should
> leave this for third-party daredevils instead of implementing it in numpy
> proper.

It seems so :-)


Sturla



Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-09 Thread Sturla Molden
Nathaniel Smith  wrote:

> @contextmanager
> def tmp_zeros(*args, **kwargs):
>     arr = np.zeros(*args, **kwargs)
>     try:
>         yield arr
>     finally:
>         arr.resize((0,), refcheck=False)

That one is interesting. I have actually never used ndarray.resize(). It
did not even occur to me that such an abomination existed :-)
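For what it's worth, the quoted recipe runs as-is on current NumPy, with the keyword spelled `refcheck` as in ndarray.resize (the name `tmp_zeros` is Nathaniel's, not a NumPy API):

```python
from contextlib import contextmanager
import numpy as np

@contextmanager
def tmp_zeros(*args, **kwargs):
    arr = np.zeros(*args, **kwargs)
    try:
        yield arr
    finally:
        # free the data buffer deterministically, ignoring live references
        arr.resize((0,), refcheck=False)

with tmp_zeros(1000) as x:
    total = x.sum() + len(x)
print(total)    # 1000.0
print(x.size)   # 0 -- the buffer is gone after the block
```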

Sturla



Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-09 Thread Sturla Molden
Chris Barker  wrote:

> my first thought is just that you can just do:
> 
> x = np.zeros(n)
> [... your code here ]
> del x
> 
> x's ref count will go down, and it will be deleted  if there are no other
> references to it. 

1. This depends on reference counting. PyPy supports numpy too (albeit with
its own code) and does not reference count.

2. del does not delete, it just decrements the refcount. x can still be
kept alive by other references.

3. If x is a part of a reference cycle it is reclaimed later on.


> If there Are other references to it, you really wouldn't
> want to delete the memory buffer anyway, would you?

Same thing for file descriptors. For example consider what happens if you
memory map a file, then close the file, but continue to read and write to
the mapped address. NumPy allows us to construct these circumstances if we
want to.
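The memory-map case is easy to reproduce in Python; a small sketch (file path and values are arbitrary):

```python
import mmap, os, tempfile
import numpy as np

# map a small file, then close its descriptor -- the mapping survives
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 80)
mm = mmap.mmap(fd, 80)
os.close(fd)                              # the descriptor is gone...
a = np.frombuffer(mm, dtype=np.float64)   # ...but this view still reads it
mm[:8] = np.float64(3.0).tobytes()        # write through the mapping
first = a[0]                              # the write is visible via the array
del a                                     # drop the view before closing
mm.close()
os.unlink(path)
```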

 
> I suppose you could write a generic context manger that would do the del
> for you, but I'm not sure what the point would be.

A del is very different from a deallocation that actually disposes of the
data buffer, regardless of references to the memory that might still be
alive.


> I guess this comes down to -- why would anyone want/need a numpy array
> object with no underlying data?

I don't. The PyArrayObject struct is so small that I don't care about it.
But it could reference a huge data buffer, and I might want to get rid of
that more deterministically than just waiting for the gc.


> (although I'm still confused as to why it's so important (in cPython) to
> have a file context manager..)

Because we often want to run setup and teardown code deterministically,
rather than e.g. having it happen at random from the gc thread when it runs
the finalizer. If Python raises an exception, an io file object can be kept
alive by the traceback for decades. If Python raises an exception, an
acquire/release pair for a threading.Lock can be separated, and the lock
ends up in an undefined state further down in your code.

In what I suggested the setup and teardown code would be malloc() and
free().


Sturla



Re: [Numpy-discussion] Should ndarray be a context manager?

2014-12-09 Thread Sturla Molden
On 09/12/14 18:39, Julian Taylor wrote:

> A context manager will also not help you with reference cycles.

It will, because __exit__ is always executed. Even if the PyArrayObject
struct lingers, the data buffer will be released.

Sturla




[Numpy-discussion] Should ndarray be a context manager?

2014-12-09 Thread Sturla Molden

I wonder if ndarray should be a context manager so we can write 
something like this:


with np.zeros(n) as x:
   [...]


The difference should be that __exit__ should free the memory in x (if 
owned by x) and make x a zero size array.

Unlike the current ndarray, which does not have an __exit__ method, this 
would give precise control over when the memory is freed. The timing of 
the memory release would not be dependent on the Python implementation, 
and a reference cycle or reference leak would not accidentally produce a 
memory leak. It would allow us to deterministically decide when the 
memory should be freed, which e.g. is useful when we work with large arrays.


A problem with this is that the memory in the ndarray would be volatile 
with respect to other Python threads and view arrays. However, there are 
dozens of other ways to produce segfaults or buffer overflows with NumPy 
(cf. stride_tricks or wrapping external buffers).


Below is a Cython class that does something similar, but we would need 
to e.g. write something like

    with Heapmem(n * np.double().itemsize) as hm:
        x = hm.doublearray
        [...]

instead of just

    with np.zeros(n) as x:
        [...]


Sturla


# (C) 2014 Sturla Molden

from cpython cimport PyMem_Malloc, PyMem_Free
from libc.string cimport memset
cimport numpy as cnp
cnp.import_array()


cdef class Heapmem:

    cdef:
        void *_pointer
        cnp.intp_t _size

    def __cinit__(Heapmem self, Py_ssize_t n):
        self._pointer = NULL
        self._size = n

    def __init__(Heapmem self, Py_ssize_t n):
        self.allocate()

    def allocate(Heapmem self):
        if self._pointer != NULL:
            raise RuntimeError("Memory already allocated")
        else:
            self._pointer = PyMem_Malloc(self._size)
            if (self._pointer == NULL):
                raise MemoryError()
            memset(self._pointer, 0, self._size)

    def __dealloc__(Heapmem self):
        if self._pointer != NULL:
            PyMem_Free(self._pointer)
        self._pointer = NULL

    property pointer:
        def __get__(Heapmem self):
            return <cnp.intp_t> self._pointer

    property doublearray:
        def __get__(Heapmem self):
            cdef cnp.intp_t n = self._size//sizeof(double)
            if self._pointer != NULL:
                return cnp.PyArray_SimpleNewFromData(1, &n,
                           cnp.NPY_DOUBLE, self._pointer)
            else:
                raise RuntimeError("Memory not allocated")

    property chararray:
        def __get__(Heapmem self):
            if self._pointer != NULL:
                return cnp.PyArray_SimpleNewFromData(1, &self._size,
                           cnp.NPY_CHAR, self._pointer)
            else:
                raise RuntimeError("Memory not allocated")

    def __enter__(self):
        if self._pointer == NULL:
            raise RuntimeError("Memory not allocated")
        return self

    def __exit__(Heapmem self, type, value, traceback):
        if self._pointer != NULL:
            PyMem_Free(self._pointer)
        self._pointer = NULL







Re: [Numpy-discussion] g77

2014-12-01 Thread Sturla Molden
Charles R Harris  wrote:

> Is there any reason to keep support for g77? The last release was in 2006
> and gfortran has been available since 2005. I admit that there is no reason
> to drop current support, apart from getting rid of some code, so perhaps
> the question should be: how much work should we spend maintaining it?

Until we have Carl Kleffner's gfortran toolchain ready the Windows build of
SciPy depends on it.

Accelerate Framework on MacOS X also uses g77 ABI.

Sturla



Re: [Numpy-discussion] Scipy 0.15.0 beta 1 release

2014-11-25 Thread Sturla Molden
David Cournapeau  wrote:
> Shall we consider https://github.com/scipy/scipy/issues/4168 to be a
> blocker (the issue arises on scipy master as well as 0.14.1)?
> 

It is really bad, but does anyone know what is really going on?

Which changes to NumPy set this off?

Sturla



Re: [Numpy-discussion] Initializing array from buffer

2014-11-21 Thread Sturla Molden
On 18/11/14 04:21, Robert McGibbon wrote:
> The np.ndarray constructor
>  takes
> a strides argument argument, and a buffer. Is it not sufficiently flexible?
>
> -Robert

AFAIK the buffer argument is not a memory address but an object 
exporting the old buffer protocol. We can abuse the __array_interface__ 
to do this, though I prefer the C API functions. Wrapping a C 
pointer with __array_interface__ then becomes something like this (not 
tested, but should work):


import numpy as np

cdef class wrapper_array(object):

    cdef:
        readonly object __array_interface__

    def __init__(wrapper_array self, addr, shape, dtype,
                 order, strides, offset):
        if strides is None:
            if order == 'C':
                strides = None
            else:
                strides = _get_fortran_strides(shape, dtype)
        self.__array_interface__ = dict(
            data = (addr + offset, False),
            descr = dtype.descr,
            shape = shape,
            strides = strides,
            typestr = dtype.str,
            version = 3,
        )

cdef object _get_fortran_strides(shape, dtype):
    strides = tuple(dtype.itemsize * np.cumprod((1,) + shape[:-1]))
    return strides

def wrap_pointer(size_t addr, shape, dtype, order, strides, offset):
    """Wraps a C pointer (passed as an integer address) with an ndarray"""
    return np.asarray(wrapper_array(addr, shape,
                                    dtype, order, strides, offset))


https://github.com/sturlamolden/sharedmem-numpy/blob/master/sharedmem/array.py
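
For completeness, the same __array_interface__ trick works from pure Python 
with ctypes, with no Cython involved. This is only illustrative; the class 
name `PointerWrapper` is made up:

```python
import ctypes

import numpy as np

n = 5
buf = (ctypes.c_double * n)()      # raw C buffer owned by ctypes
addr = ctypes.addressof(buf)


class PointerWrapper:
    """Expose a raw address through the __array_interface__ protocol."""
    def __init__(self, addr, shape, dtype):
        dtype = np.dtype(dtype)
        self.__array_interface__ = dict(
            data=(addr, False),    # (address, read-only flag)
            shape=shape,
            typestr=dtype.str,
            version=3,
        )


a = np.asarray(PointerWrapper(addr, (n,), np.float64))
a[:] = np.arange(n)
print(buf[3])  # 3.0 -- the ndarray aliases the ctypes buffer
```

As with the Cython version, the wrapped array does not own the memory, so 
the buffer must stay alive as long as the array is in use.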



Sturla








Re: [Numpy-discussion] Numpy 1.9.1, zeros and alignement

2014-11-19 Thread Sturla Molden
On 18/11/14 23:10, Pauli Virtanen wrote:

> The second question is whether F2py actually *needs* to check the
> dtype-size alignment, or is just something like sizeof(double) enough
> for Fortran compilers.


The Fortran standard does not specify an ABI so this is likely compiler 
dependent.

The only safe way of calling Fortran from C is through a proxy Fortran 
routine exported to C using Fortran 2003 ISO C bindings. That is what 
f2py should do behind the scenes now that the Fortran 77 limitation is 
lifted from SciPy.

With this method we can just pass an arbitrary C pointer to Fortran and 
construct a Fortran array pointer with c_f_pointer. The Fortran 
compiler will make an aligned copy of the array referenced by the 
pointer if required. No special coding will be required to ensure 
correct alignment.


Sturla



Re: [Numpy-discussion] Initializing array from buffer

2014-11-17 Thread Sturla Molden
Andrea Arteaga  wrote:

> My use case is the following: we have a some 3D arrays in our C++
> framework. The ordering of the elements in these arrays is neither C nor
> Fortran style: it might be IJK (i.e. C style, 3rd dimension contiguous in
> memory), KJI (i.e. Fortran style, first dimension contiguous) or, e.g. IKJ.
> Moreover we put some padding to optimize aligned access. This kind of
> memory structure cannot be just expressed as 'C' or 'Fortran', but it can
> be perfectly expressed using the Python buffer protocol by providing the
> shape and the strides. We would like to export this structure to a numpy
> array that should be able of accessing the same memory locations in a
> consistent way and make some operations like initializing the content or
> plotting it.
> 
> Is this currently possible?
> If not, is it planned to implement such a feature?

If you are already coding in C++, just use PyArray_New or
PyArray_NewFromDescr:

http://docs.scipy.org/doc/numpy/reference/c-api.array.html#c.PyArray_New
http://docs.scipy.org/doc/numpy/reference/c-api.array.html#c.PyArray_NewFromDescr

Apart from that, numpy.array and numpy.asarray can also accept a PEP 3118
buffer. 

Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-30 Thread Sturla Molden
Nathaniel Smith  wrote:

>> [*] Actually, we could, but the binaries would be tainted with a viral
>> license.
> 
> And binaries linked with MKL are tainted by a proprietary license... They
> have very similar effects, 

The MKL license is proprietary but not viral. 

Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-29 Thread Sturla Molden
On 29/10/14 10:48, Eelco Hoogendoorn wrote:

 > Id rather have us discuss how to facilitate the integration of
 > as many possible fft libraries with numpy behind a maximally uniform
 > interface, rather than having us debate which fft library is 'best'.

I am happy with the NumPy interface. There are minor differences between 
the SciPy and NumPy FFT interfaces (e.g. for rfft, see below). 
Personally I prefer the NumPy interface because it makes it easier to 
map Fourier coeffs to frequencies.

One thing we could do, without too much hassle, is to use FFTs from MKL 
or Accelerate (vDSP) if we link with these libraries for BLAS/LAPACK.

MKL has an API compatible with FFTW, so FFTW and MKL can be supported 
with the same C code. FFTW and MKL also have a Fortran 77 API which we 
could wrap with f2py (no Fortran compiler are needed). It is actually 
possible to use the FFTs in FFTW and MKL from Python without any C 
coding at all. We just need to add a Python interface on top of the f2py 
wrappers, which is similar to what we do for scipy.linalg.

The FFTs in Accelerate have a very simple C interface, but only support 
power-of-two array sizes, so we would need to use them with Bluestein's 
algorithm. Again, because of their simplicity, it is possible to wrap 
these FFT functions with f2py.

We cannot bundle NumPy or SciPy binaries with FFTW due to GPL [*], but 
as I understand it we already have permission from Intel to bundle 
binary wheels linked with MKL. Accelerate is a system library, so that 
does not pose a license problem.

[*] Actually, we could, but the binaries would be tainted with a viral 
license.



 >>> a = np.random.rand(8)

 >>> scipy.fftpack.rfft(a)[:,None]

array([[ 3.47756851],
[-0.45869926],
[-0.21730867],
[-0.43763425],
[-0.67338213],
[-0.28799   ],
[ 0.17321793],
[-0.31514119]])

 >>> np.fft.rfft(a)[:,None]

array([[ 3.47756851+0.j],
[-0.45869926-0.21730867j],
[-0.43763425-0.67338213j],
[-0.28799000+0.17321793j],
[-0.31514119+0.j]])
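
The easier coefficient-to-frequency mapping with the NumPy interface comes 
from np.fft.rfftfreq, which yields exactly one frequency per rfft 
coefficient:

```python
import numpy as np

a = np.random.rand(8)
coeffs = np.fft.rfft(a)            # 5 complex coefficients for n=8
freqs = np.fft.rfftfreq(len(a))    # matching frequencies, cycles/sample
print(freqs)  # [0.    0.125 0.25  0.375 0.5  ]
```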





Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Sturla Molden
Eelco Hoogendoorn  wrote:

> Perhaps the 'batteries included' philosophy made sense in the early days of
> numpy; but given that there are several fft libraries with their own pros
> and cons, and that most numpy projects will use none of them at all, why
> should numpy bundle any of them?

Because sometimes we just need to compute a DFT, just like we sometimes
need to compute a sine or an exponential. It does that job perfectly well.
It is not always about speed. Just typing np.fft.fft(x) is convenient. 

Sturla



Re: [Numpy-discussion] multi-dimensional c++ proposal

2014-10-28 Thread Sturla Molden
Neal Becker  wrote:

> That's harsh!  Do you have any specific features you dislike?  Are you 
> objecting 
> to the syntax?

I have programmed C++ for almost 15 years. But I cannot look at the
proposed code an get a mental image of what it does. It is not a specific
feature, but how the code looks in general. This is e.g. not a problem with
Eigen or Blitz, if you know C++ it is not particularly hard to read. Not as
nice as Fortran or Cython, but it is still not too bad. Boost multiarray
suffers from not being particularly readable, however, but this proposal is
even worse. I expect that scientists and engineers will not use an
unreadable array API. When we write or maintain numerical algorithms we
need to get a mental image of the code, because we actually spend most of
the time looking at or reading the code.

I agree that C++ needs multidimensional arrays in the STL, but this
proposal will do more harm than good. In particular it will prevent
adoption of a usable array API. And as consequence, it will fuel the
problem it is trying to solve: C++ programmers will still use homebrewed
multiarray classes, because there is no obvious replacement in the standard
library.

Sturla



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-28 Thread Sturla Molden
Pierre Barbier de Reuille  wrote:

> I would add one element to the discussion: for some (odd) reasons, SciPy is
> lacking the functions `rfftn` and `irfftn`, functions using half the memory
> space compared to their non-real equivalent `fftn` and `ifftn`. 

In both NumPy and SciPy the N-dimensional FFTs are implemented in Python.
It is just a Python loop over all the axes, calling fft or rfft on each
axis.

> However, I
> haven't (yet) seriously tested `scipy.fftpack.fftn` vs. `np.fft.rfftn` to
> check if there is a serious performance gain (beside memory usage).

Real-valued FFT is implemented with complex-valued FFT. You save half the
memory, but not quite half the computation. Apart from that, the FFT in
SciPy is written in Fortran and the FFT in NumPy is written in C, but they
are algorithmically similar. I don't see any good reason why the Fortran
code in SciPy should be faster than the C code in NumPy. It used to be the
case that Fortran was "faster than C", everything else being equal, but
with modern C compilers and CPUs with deep pipelines and branch prediction
this is rarely the case. So I would expect the NumPy rfftn to be slightly
faster than SciPy fftn, but keep in mind that both have a huge Python
overhead.

Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Sturla Molden
David Cournapeau  wrote:

> The real issue with fftw (besides the license) is the need for plan
> computation, which are expensive (but are not needed for each transform).

This is not a problem if you tell FFTW to guess a plan instead of making
measurements. FFTPACK needs to set up a look-up table too.

> I made some experiments with the Bluestein transform to handle prime
> transforms on fftpack, but the precision seemed to be an issue. Maybe I
> should revive this work (if I still have it somewhere).

You have it in a branch on Github.


Sturla



Re: [Numpy-discussion] FFTS for numpy's FFTs (was: Re: Choosing between NumPy and SciPy functions)

2014-10-28 Thread Sturla Molden
Jerome Kieffer  wrote:

> Because the plan creation was taking ages with FFTw, numpy's FFTPACK was
> often faster (overall)

Matlab switched from FFTPACK to FFTW because the latter was faster in
general. If FFTW guesses a plan it does not take very long. Actual
measurements can be slow, however, but those are not needed. 

Sturla



Re: [Numpy-discussion] numpy.i and std::complex

2014-10-28 Thread Sturla Molden
Robert Kern  wrote:
> The polite, welcoming
> response to someone coming along with a straightforward,
> obviously-correct contribution to our SWIG capabilities is "Thank
> you!", not "perhaps you overestimate the number of NumPy users who use
> Swig". 

That was a response to something else. As to why this issue with NumPy and
Swig has not been solved before, the OP suggested he might have
overestimated the number of NumPy users who also use std::complex in C++.
Hence my answer: he did not (it is arguably not that uncommon), but maybe
they don't use Swig.



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-27 Thread Sturla Molden
Matthew Brett  wrote:

> Is this an option for us?  Aren't we a little behind the performance
> curve on FFT after we lost FFTW?

It does not run on Windows because it uses POSIX to allocate executable
memory for tasklets, as I understand it.

By the way, why did we lose FFTW, apart from GPL? One thing to mention
here is that MKL supports the FFTW APIs. If we can use MKL for linalg and
numpy.dot I don't see why we cannot use it for FFT.

On Mac there is also vDSP in Accelerate framework which has an insanely
fast FFT (also claimed to be faster than FFTW). Since it is a system
library there should be no license problems.

There are clearly options if someone wants to work on it and maintain it.

Sturla



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-27 Thread Sturla Molden
 wrote:

> https://github.com/scipy/scipy/blob/e758c482efb8829685dcf494bdf71eeca3dd77f0/scipy/signal/signaltools.py#L13
> doesn't seem to mind mixing numpy and scipy (quick github search)

I believe it is because NumPy's FFTs (beginning with 1.9.0) are
thread-safe. But FFTs from numpy.fft and scipy.fftpack should be rather
similar in performance. (Except if you use Enthought, in which case the
former is much faster.)

It seems from the code that fftconvolve does not use overlap-add or
overlap-save. I seem to remember that it did before, but I might be wrong.
Personally I prefer to use overlap-add instead of a very long FFT. 
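
Overlap-add, the method preferred over one very long FFT above, is easy to 
sketch with the NumPy FFT. This is a minimal, unoptimized version for 
illustration only:

```python
import numpy as np


def overlap_add_convolve(x, h, block=256):
    """Linear convolution of a long signal x with a short kernel h,
    using the overlap-add method with short FFT blocks."""
    m = len(h)
    nfft = block + m - 1               # each block convolution fits here
    H = np.fft.rfft(h, nfft)           # kernel spectrum, computed once
    y = np.zeros(len(x) + m - 1)
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        # convolve this block via the FFT and add it at its offset
        Y = np.fft.irfft(np.fft.rfft(seg, nfft) * H, nfft)
        y[start:start + len(seg) + m - 1] += Y[:len(seg) + m - 1]
    return y


x = np.random.rand(1000)
h = np.random.rand(31)
y = overlap_add_convolve(x, h)
print(np.allclose(y, np.convolve(x, h)))  # True
```

Only FFTs of length block + len(h) - 1 are ever needed, regardless of how 
long the signal is.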

There is also a scipy.fftpack.convolve module. I have not used it though.


Sturla



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-27 Thread Sturla Molden
Sturla Molden  wrote:

> If we really need a
> kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or
> Apple's Accelerate Framework, 

I should perhaps also mention FFTS here, which claim to be faster than FFTW
and has a BSD licence:

http://anthonix.com/ffts/index.html



Re: [Numpy-discussion] Choosing between NumPy and SciPy functions

2014-10-27 Thread Sturla Molden
 wrote:

> For fft I use mostly scipy, IIRC.   (scipy's fft imports numpy's fft,
> partially?)

No. SciPy uses the Fortran library FFTPACK (wrapped with f2py) and NumPy
uses a smaller C library called fftpack_lite. Algorithmically they are
similar, but fftpack_lite has fewer features (e.g. no DCT). scipy.fftpack
does not import numpy.fft. Neither of these libraries are very "fast", but
usually they are "fast enough" for practical purposes. If we really need a
kick-ass fast FFT we need to go to libraries like FFTW, Intel MKL or
Apple's Accelerate Framework, or even use tools like CUDA or OpenCL to run
the FFT on the GPU. But using such tools takes more coding (and reading API
specifications) than the convinience of just using the FFTs already in
NumPy or SciPy. So if you count in your own time as well, it might not be
that FFTW or MKL are the "faster" FFTs.

Sturla



Re: [Numpy-discussion] numpy.i and std::complex

2014-10-27 Thread Sturla Molden
Robert Kern  wrote:

> Please stop haranguing the new guy for not knowing things that you
> know. 

I am not doing any of that. You are the only one haranguing here. I usually
ignore your frequent impolite comments, but I will make an exception this
time and ask you to shut up.

Sturla



Re: [Numpy-discussion] multi-dimensional c++ proposal

2014-10-27 Thread Sturla Molden
On 27/10/14 13:14, Neal Becker wrote:
> The multi-dimensional c++ stuff is interesting (about time!)
>
> http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3851.pdf

OMG, that API is about as awful as it gets. Obviously it is written by 
two computer scientists who do not understand what scientific and 
technical computing actually needs. There is a reason why many 
scientists still prefer Fortran to C++, and I think this proposal shows 
us why. An API like that will never be suitable for implementing complex 
numerical algorithms. It will fail horribly because it is *not readable*. 
I have no doubt it will be accepted though, because the C++ standards 
committee tends to accept unusable things.

Sturla










Re: [Numpy-discussion] [EXTERNAL] Re: numpy.i and std::complex

2014-10-27 Thread Sturla Molden
Bill Spotz  wrote:
> Oops, I meant '"Cython" is its own language,' not "Python" (although
> Python qualifies, too, just not in context).
> 
> Also, Pyrex, listed in the c-info.python-as-glue.html page, was the 
> pre-cursor to Cython.

But when it comes to interfacing NumPy, they are really not comparable. :-)

Sturla



Re: [Numpy-discussion] numpy.i and std::complex

2014-10-27 Thread Sturla Molden
Glen Mabey  wrote:

> I chose swig after reviewing the options listed here, and I didn't see cython 
> on the list:
> 
> http://docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html

It's because that list is old and has not been updated. It has the
predecessor to Cython, Pyrex, but they are very different now.

Both SciPy and NumPy have Cython as a build dependency, and also projects
like scikit-learn, scikit-image, statsmodels. 

If you find C++ projects which use Swig (wxPython, PyWin32) or SIP (PyQt)
it is mainly because they are older than Cython. A more recent addition,
PyZMQ, use Cython to wrap C++.


> I guess that's because cython is different language, right?  So, if I
> want to interactively call C++ functions from say ipython, then is cython 
> really an option?

You can use Cython to call C++ functions in ipython and ipython notebook.

cythonmagic takes care of that :-)


Sturla



Re: [Numpy-discussion] numpy.i and std::complex

2014-10-27 Thread Sturla Molden
Glen Mabey  wrote:
 
> I'd really like for this to be included alongside numpy.i -- but maybe I
> overestimate the number of numpy users who use complex data (let your
> voice be heard!) and who also end up using std::complex in C++ land.

I don't think you do. But perhaps you overestimate the number of NumPy
users who use Swig?

Cython seems to be the preferred wrapping tool today, and it understands
complex numbers:

cdef double complex J = 0.0 + 1j

If you tell Cython to emit C++, this will result in code that uses
std::complex. 

Sturla



Re: [Numpy-discussion] parallel compilation with numpy.distutils in numpy 1.10

2014-10-10 Thread Sturla Molden
Julian Taylor  wrote:

> thanks for the explanation.
> Modules are only available with f90 right? f77 files do not have these
> generated interdependencies?
> being able to handle f77 would already be quite good, as it should at
> least cover current scipy.
> One can look at a smarter scheme for f90 later.

Yes, modules are only for f90 and later. f77 do not have modules.

Sturla



Re: [Numpy-discussion] parallel compilation with numpy.distutils in numpy 1.10

2014-10-10 Thread Sturla Molden
Sturla Molden  wrote:

> So the Fortran 90 files create a directed acyclic graph. To compute in
> parallel

Eh, *compile* in parallel.


> Sturla



Re: [Numpy-discussion] parallel compilation with numpy.distutils in numpy 1.10

2014-10-10 Thread Sturla Molden
Sturla Molden  wrote:

> When a Fortran module is compiled, the compiler emits an object file (.o)
> and a module file (.mod). The module file plays the role of a header file
> in C. So when another Fortran file imports the module with a use statement,
> the compiler looks for the module file. Because the .mod file is generated
> by the compiler, unlike the .h file in C, the ordering of compilation is
> more critical in Fortran 90 than in C. If B.f90 has a "use A" statement,
> then A.f90 must be compiled before B.f90. CMake has an intelligent system
> for working out the correct order of compilation of Fortran 90 files.

So the Fortran 90 files create a directed acyclic graph. To compute in
parallel one might use a set of coroutines, one for each f90 file, and then
yield from the downstream files in the graph. But on the other hand, it
might not be worth the effort. ;-)


Sturla



Re: [Numpy-discussion] parallel compilation with numpy.distutils in numpy 1.10

2014-10-10 Thread Sturla Molden
Julian Taylor  wrote:

> There is still one problem in regards to parallelizing fortran 90. The
> ccompiler.py contains following comment:
> # build any sources in same order as they were originally specified
> #   especially important for fortran .f90 files using modules
> 
> This indicates the f90 builds cannot be trivially parallelized. I do not
> know much fortran, can someone explain to me when ordering of single
> file compiles is an issue in f90?


Sure :)

When a Fortran module is compiled, the compiler emits an object file (.o)
and a module file (.mod). The module file plays the role of a header file
in C. So when another Fortran file imports the module with a use statement,
the compiler looks for the module file. Because the .mod file is generated
by the compiler, unlike the .h file in C, the ordering of compilation is
more critical in Fortran 90 than in C. If B.f90 has a "use A" statement,
then A.f90 must be compiled before B.f90. CMake has an intelligent system
for working out the correct order of compilation of Fortran 90 files.
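
Working out a correct compile order for such module files is a topological 
sort of the use-dependency graph. A toy sketch, with invented file names 
and dependencies:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# B.f90 has "use A"; C.f90 has "use A" and "use B"
deps = {
    "A.f90": set(),
    "B.f90": {"A.f90"},
    "C.f90": {"A.f90", "B.f90"},
}
order = list(TopologicalSorter(deps).static_order())
print(order)  # every module file compiles after the modules it uses
```

Files with no ordering constraint between them could then be compiled in 
parallel.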


Sturla



Re: [Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-09 Thread Sturla Molden
Travis Oliphant  wrote:

> A good mingw64 stack for Windows would be great and benefits many
> communities.

BTW: Carl Kleffners mingw toolchains are here:

Documentation:
https://github.com/numpy/numpy/wiki/Mingw-static-toolchain

Downloads:
https://bitbucket.org/carlkl/mingw-w64-for-python/downloads



Re: [Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-09 Thread Sturla Molden
Travis Oliphant  wrote:

> A good mingw64 stack for Windows would be great and benefits many
> communities.

Carl Kleffner has made 32- and 64-bit mingw stacks compatible with Python.
E.g. the stack alignment in the 32-bit version is different from the
vanilla mingw distribution. It also, for the first time, allows us to build
SciPy with gfortran instead of g77 on Windows, which means we don't have to
limit the Fortran code in SciPy to Fortran 77 legacy code.

Sturla



Re: [Numpy-discussion] Copyright status of NumPy binaries on Windows/OS X

2014-10-08 Thread Sturla Molden
Travis Oliphant  wrote:

> Microsoft has actually released their Visual Studio 2008 compiler stack so
> that OpenBLAS and ATLAS could be compiled on Windows for these platforms as
> well.   I would be very interested to see conda packages for these
> libraries which should be pretty straightforward to build.

OpenBLAS does not compile with Microsoft compilers because of AT&T assembly
syntax. You need to use a GNU compiler and you also need to have a GNU
environment. OpenBLAS is easy to build on Windows with MinGW (with
gfortran) and MSYS. Carl's toolchain ensures that the binaries are
compatible with the Python binaries from Python.org.

Sturla



Re: [Numpy-discussion] skip samples in random number generator

2014-10-02 Thread Sturla Molden
Robert Kern  wrote:

> Yes, but that would require rewriting much of numpy.random to allow
> replacing the core generator. This would work out-of-box because it's
> just manipulating the state of the current core generator.

Yes, then we just need to sacrifice a year's worth of CPU time, and a PR
will be ready by next fall ;-)

Sturla



Re: [Numpy-discussion] skip samples in random number generator

2014-10-02 Thread Sturla Molden
Robert Kern  wrote:

> No one needs small jumps of arbitrary size. The real use case for
> jumping is to make N parallel streams that won't overlap. You pick a
> number, let's call it `jump_steps`, much larger than any single run of
> your system could possibly consume (i.e. the number of core PRNG
> variates pulled is << `jump_steps`). Then you can initializing N
> parallel streams by initializing RandomState once with a seed, storing
> that RandomState, then jumping ahead by `jump_steps`, storing *that*
> RandomState, by `2*jump_steps`, etc. to get N RandomState streams that
> will not overlap. Give those to your separate processes and let them
> run.
> 
> So the alternative may actually be to just generate and distribute
> *one* set of these jump coefficients for a really big jump size but
> still leaves you enough space for a really large number of streams
> (fortunately, 2**19937-1 is a really big number).


DCMT might be preferred in this case. It works the same, except you have N
"random state" streams with characteristic polynomials that are distinct
and relatively prime to each other. Thus each of the N processes will get
an independent stream of random numbers, without any chance of overlap.

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html

Jump-ahead is easier to accomplish with MRG-32k3a than MT19937. Another
generator with an efficient jump-ahead is XORWOW.



Sturla



Re: [Numpy-discussion] @ operator

2014-09-10 Thread Sturla Molden
Charles R Harris  wrote:

> Note also that the dot cblas versions are not generally blocked, so the
> size of the arrays is limited (and not checked).

But it is possible to create a blocked dot function with the current cblas,
even though they use C int for array dimensions. It would just further
increase the complexity of dot (as if it's not bad enough already...) 
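
A blocked dot along these lines might look like the following sketch (not 
NumPy's actual implementation; block size and names are illustrative):

```python
import numpy as np


def blocked_dot(A, B, block=256):
    """Matrix product computed block by block, so each underlying
    BLAS call only ever sees dimensions bounded by `block`."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m), dtype=np.result_type(A, B))
    for i in range(0, n, block):
        for j in range(0, m, block):
            for p in range(0, k, block):
                C[i:i+block, j:j+block] += np.dot(
                    A[i:i+block, p:p+block], B[p:p+block, j:j+block])
    return C


A = np.random.rand(300, 500)
B = np.random.rand(500, 400)
print(np.allclose(blocked_dot(A, B), np.dot(A, B)))  # True
```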

Sturla



Re: [Numpy-discussion] SFMT (faster mersenne twister)

2014-09-10 Thread Sturla Molden
Julian Taylor  wrote:
 
> But as already mentioned by Robert, we know what we can do, what is
> missing is someone writting the code.

This is actually a part of NumPy I know in detail, so I will be able to
contribute. Robert Kern's last post about objects like np.random.SFMT()
working similar to RandomState should be doable and not break any backwards
compatibility.

Sturla



Re: [Numpy-discussion] SFMT (faster mersenne twister)

2014-09-10 Thread Sturla Molden
Pierre-Andre Noel  wrote:

> Why not do something like the C++11 ? In , a "generator" 
> is the engine producing randomness, and a "distribution" decides what is 
> the type of outputs that you want. 

This is what randomkit is doing internally, which is why it is so easy to
plug in a different generator.
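
The engine/distribution split can be sketched in Python. The class names 
here are illustrative only, not a real NumPy API:

```python
import numpy as np


class Engine:
    """Core generator: produces raw uniform deviates."""
    def __init__(self, seed):
        self._rs = np.random.RandomState(seed)

    def next_double(self):
        return self._rs.random_sample()


class Gaussian:
    """Distribution: shapes engine output via the Box-Muller transform."""
    def __call__(self, engine):
        u1 = 1.0 - engine.next_double()   # in (0, 1], avoids log(0)
        u2 = engine.next_double()
        return np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)


# Any engine can be combined with any distribution
eng = Engine(1234)
gauss = Gaussian()
samples = [gauss(eng) for _ in range(10000)]
print(np.mean(samples))   # close to 0
```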


Sturla



Re: [Numpy-discussion] SFMT (faster mersenne twister)

2014-09-09 Thread Sturla Molden
On 09/09/14 20:08, Nathaniel Smith wrote:

> There's also another reason why generator decisions should be part of
> the RandomState object itself: we may want to change the distribution
> methods themselves over time (e.g., people have been complaining for a
> while that we use a suboptimal method for generating gaussian deviates),
> but changing these creates similar backcompat problems.

Which one should we rather use? Ziggurat?

Sturla


