Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
> Sturla Molden wrote: > IMO there's a problem with using literal variable names here, because > Python syntax implies that the value is passed. One shouldn't make > syntax where private=(i,) is legal but private=(f(),) isn't. The latter would be illegal in OpenMP as well. OpenMP pragmas only tak
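
The objection to literal variable names can be seen in plain Python. What follows is a minimal sketch, not a proposed Cython API: `parallel_region` is a hypothetical stand-in that only returns what it was given, and `f` is an arbitrary helper. It shows that ordinary call syntax hands over values, so a construct that needs variable names would have to spell them differently (for example as strings).

```python
def parallel_region(private=()):
    """Hypothetical stand-in for an OpenMP-like construct.

    It only returns the arguments it was given, so we can inspect them.
    """
    return tuple(private)


def f():
    return 42


i = 7

# Ordinary call syntax evaluates each expression first, so the callee
# receives the values 7 and 42, never the name "i".
by_value = parallel_region(private=(i, f()))

# Passing names as strings keeps the variable identity, and an
# expression such as f() can no longer be mistaken for a variable.
by_name = parallel_region(private=("i",))

print(by_value)  # (7, 42)
print(by_name)   # ('i',)
```

With the value-passing form there is no way for the callee to tell `i` apart from any other expression that evaluates to 7, which is why `private=(i,)` and `private=(f(),)` look equally legal.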

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Brian Granger
> You know, I thought of the exact same thing when reading your post. No, > you need the GIL currently, but that's something I'd like to fix. > > Ideally, it would be something like this: > > cdef int i, s = 0, n = ... > cdef np.ndarray[int] arr = ... # will require the GIL > with nogil: >for i

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Dag Sverre Seljebotn
Brian Granger wrote: > And a question: > > With the new Numpy support in Cython, does Cython release the GIL if > it can when running through loops over numpy arrays? Does > Cython call into the C API during these sections? You know, I thought of the exact same thing when reading your pos

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Brian Granger
Wow, interesting thread. Thanks everyone for the ideas. A few more comments: GPUs/CUDA: * Even though there is a bottleneck between main memory and GPU memory, as Nathan mentioned, the much larger memory bandwidth on a GPU often makes GPUs great for memory bound computations...as long as you ca

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Brian Granger
> At any rate, I really like the OpenMP approach and prefer to have > support for it in Cython much better than threading, MPI or whatever. > But the thing is: is OpenMP stable and mature enough to allow using it on > most common platforms? I think that recent GCC compilers support > the latest i

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Brian Granger
> Recent Matlab versions use Intel's Math Kernel Library, which performs > automatic multi-threading - also for mathematical functions like sin > etc, but not for addition, multiplication etc. It seems to me Matlab > itself does not take care of multi-threading. On > http://www.intel.com/software/p

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Brian Granger
> If your problem is evaluating vector expressions just like the above > (i.e. without using transcendental functions like sin, exp, etc...), > usually the bottleneck is on memory access, so using several threads is > simply not going to help you achieve better performance, but rather > the contr

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn wrote: > Hmm... yes. Care would need to be taken though because Cython might in > the future very well generate a "while" loop instead for such a > statement under some circumstances, and that won't work with OpenMP. One > should be careful with assuming what the C result wi

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Dag Sverre Seljebotn
Sturla Molden wrote: > On 2/12/2009 12:34 PM, Dag Sverre Seljebotn wrote: > >> FYI, I am one of the core Cython developers and can make such >> modifications in Cython itself as long as there's consensus on how it >> should look on the Cython mailing list. My problem is that I don't >> really

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 5:24 PM, Gael Varoquaux wrote: > My two cents: go for cython objects/statements. Not only does code in > comments look weird and like a hack, but it also means you have to hack > the parser. I agree with this. Particularly because Cython uses indentation as syntax. With comments you

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Gael Varoquaux
On Thu, Feb 12, 2009 at 03:27:51PM +0100, Sturla Molden wrote: > The question is: Should OpenMP be comments in the Cython code (as they > are in C and Fortran), or should OpenMP be special objects? My two cents: go for cython objects/statements. Not only does code in comments look weird and like a ha

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Michael Abshoff
Nathan Bell wrote: > On Thu, Feb 12, 2009 at 8:19 AM, Michael Abshoff > wrote: Hi, >> Not even close. The current generation peaks at around 1.2 TFlops single >> precision, 280 GFlops double precision for ATI's hardware. The main >> problem with those numbers is that the memory on the graphics ca

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 4:03 PM, Matthieu Brucher wrote: > In C89, you will have absolutely no benefit (because there > is no way you can tell the compiler that there is no aliasing), in > Fortran, it will be optimized correctly. In ANSI C (aka C89) the effect is achieved using compiler pragmas. In ISO C

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Nathan Bell
On Thu, Feb 12, 2009 at 8:19 AM, Michael Abshoff wrote: > > Not even close. The current generation peaks at around 1.2 TFlops single > precision, 280 GFlops double precision for ATI's hardware. The main > problem with those numbers is that the memory on the graphics card > cannot feed the data fast

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
2009/2/12 David Cournapeau : > Matthieu Brucher wrote: >> >> Sorry, I was referring to my last mail, but I sent so many in 5 minutes ;) >> In C, if you have two arrays (two pointers), the compiler can't make >> aggressive optimizations because they may intersect. With Fortran, >> this is not possible.
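
To make the aliasing concern concrete, here is a small sketch in plain Python (standard library only). It is an illustration, not code from the thread: `shift_copy` is a hypothetical helper that mimics a naive element-by-element C loop. When the source and destination overlap, the result depends on the order of the individual loads and stores, which is why a compiler that cannot rule out overlap must not reorder or vectorize them freely.

```python
from array import array


def shift_copy(dst, src):
    """Element-by-element forward copy, as a naive C loop would do."""
    for k in range(len(src)):
        dst[k] = src[k]


# Case 1: the two views overlap (they alias the same buffer).
buf = array("d", [1.0, 2.0, 3.0, 4.0, 5.0])
view = memoryview(buf)
shift_copy(view[1:], view[:-1])
aliased = buf.tolist()

# Case 2: the source is an independent copy, so there is no aliasing.
buf2 = array("d", [1.0, 2.0, 3.0, 4.0, 5.0])
src_copy = array("d", buf2[:-1])
shift_copy(memoryview(buf2)[1:], src_copy)
independent = buf2.tolist()

print(aliased)      # [1.0, 1.0, 1.0, 1.0, 1.0]
print(independent)  # [1.0, 1.0, 2.0, 3.0, 4.0]
```

In the overlapping case each store feeds the next load, so the first element is smeared across the whole buffer. Only when the two operands are known to be disjoint is any evaluation order equivalent.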

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
2009/2/12 David Cournapeau : > Matthieu Brucher wrote: >>> No - I have never seen a deep explanation of the matlab model. The C api >>> is so small that it is hard to deduce anything from it (except that the >>> memory handling is not ref-counting-based, I don't know if it matters >>> for our discuss

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Michael Abshoff
David Cournapeau wrote: > Matthieu Brucher wrote: >> For BLAS level 3, the MKL is parallelized (so matrix multiplication is). >> Hi David, > Same for ATLAS: thread support is one focus in the 3.9 series, currently > in development. ATLAS has had thread support for a long, long time. The 3.9 s

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 1:44 PM, Sturla Molden wrote: Here is an example of SciPy's ckdtree.pyx modified to use OpenMP. It seems I managed to post an erroneous C file. :( S.M. /* * Parallel query for faster kd-tree searches on SMP computers. * This function will relea

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread David Cournapeau
Matthieu Brucher wrote: > > Sorry, I was referring to my last mail, but I sent so many in 5 minutes ;) > In C, if you have two arrays (two pointers), the compiler can't make > aggressive optimizations because they may intersect. With Fortran, > this is not possible. In this matter, Numpy behaves like

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 12:34 PM, Dag Sverre Seljebotn wrote: > FYI, I am one of the core Cython developers and can make such > modifications in Cython itself as long as there's consensus on how it > should look on the Cython mailing list. My problem is that I don't > really know OpenMP and have little e

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread David Cournapeau
Matthieu Brucher wrote: > > For BLAS level 3, the MKL is parallelized (so matrix multiplication is). > Same for ATLAS: thread support is one focus in the 3.9 series, currently in development. I have never used it, I don't know how it compares to the MKL, David

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread David Cournapeau
Matthieu Brucher wrote: >> No - I have never seen a deep explanation of the matlab model. The C api >> is so small that it is hard to deduce anything from it (except that the >> memory handling is not ref-counting-based, I don't know if it matters >> for our discussion of speeding up ufunc). I would

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
2009/2/12 Gregor Thalhammer : > Brian Granger wrote: >>> I am curious: would you know what would be different in numpy's case >>> compared to matlab array model concerning locks ? Matlab, up to >>> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3 >>> (or 7.4), it also uses

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
2009/2/12 Sturla Molden : > On 2/12/2009 1:50 PM, Francesc Alted wrote: > >> Hey! That's very nice to know. We already have OpenMP support in >> Cython for free (or apparently it seems so :-) > > No we don't, as variable names are different in C and Cython. But > adding support for OpenMP would

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
> No - I have never seen a deep explanation of the matlab model. The C api > is so small that it is hard to deduce anything from it (except that the > memory handling is not ref-counting-based, I don't know if it matters > for our discussion of speeding up ufunc). I would guess that since two > array

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
Yes, it is. You have to link against pthread (at least with Linux ;)) You have to write a single parallel region if you don't want this overhead (which is not possible with Python). Matthieu 2009/2/12 Gael Varoquaux : > On Wed, Feb 11, 2009 at 11:52:40PM -0600, Robert Kern wrote: >> > This seem

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Francesc Alted
On Thursday 12 February 2009, Sturla Molden wrote: > On 2/12/2009 1:50 PM, Francesc Alted wrote: > > Hey! That's very nice to know. We already have OpenMP support in > > Cython for free (or apparently it seems so :-) > > No we don't, as variable names are different in C and Cython. But > addin

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Matthieu Brucher
> I am curious: would you know what would be different in numpy's case > compared to matlab array model concerning locks ? Matlab, up to > recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3 > (or 7.4), it also uses multicore for mathematical functions (cos, > etc...). So at lea

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Dag Sverre Seljebotn
Sturla Molden wrote: > On 2/12/2009 1:50 PM, Francesc Alted wrote: > > >> Hey! That's very nice to know. We already have OpenMP support in >> Cython for free (or apparently it seems so :-) >> > > No we don't, as variable names are different in C and Cython. But > adding support for Ope

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Michael Abshoff
Sturla Molden wrote: > On 2/12/2009 12:20 PM, David Cournapeau wrote: Hi, >> It does if you have access to the parallel toolbox I mentioned earlier >> in this thread (again, no experience with it, but I think it is >> especially popular on clusters; in that case, though, it is not limited >> to th

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread David Cournapeau
Francesc Alted wrote: > I don't know OpenMP well enough either, but I'd say that on this list there > could be some people who could help. > > At any rate, I really like the OpenMP approach and prefer to have > support for it in Cython much better than threading, MPI or whatever. > But the thing i

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 1:50 PM, Francesc Alted wrote: > Hey! That's very nice to know. We already have OpenMP support in > Cython for free (or apparently it seems so :-) No we don't, as variable names are different in C and Cython. But adding support for OpenMP would not bloat the Cython language. Cy

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread David Cournapeau
Sturla Molden wrote: > On 2/12/2009 12:20 PM, David Cournapeau wrote: > > >> It does if you have access to the parallel toolbox I mentioned earlier >> in this thread (again, no experience with it, but I think it is >> especially popular on clusters; in that case, though, it is not limited >> to t

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 12:20 PM, David Cournapeau wrote: > It does if you have access to the parallel toolbox I mentioned earlier > in this thread (again, no experience with it, but I think it is > especially popular on clusters; in that case, though, it is not limited > to thread-based implementation). As

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Francesc Alted
On Thursday 12 February 2009, Sturla Molden wrote: > OpenMP does not need to be a part of the Cython language. It can be > special comments in the code as in Fortran. After all, "#pragma omp > parallel" is a comment in Cython. Hey! That's very nice to know. We already have OpenMP support in C

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Francesc Alted
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote: > FYI, I am one of the core Cython developers and can make such > modifications in Cython itself as long as there's consensus on how it > should look on the Cython mailing list. My problem is that I don't > really know OpenMP and have lit

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 11:30 AM, Dag Sverre Seljebotn wrote: It would be interesting to see how a spec would look for integrating OpenMP natively into Cython for these kinds of purposes. Cython is still flexible as a language after all. Avoiding language bloat is also important, but it is difficult to k

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Sturla Molden
On 2/12/2009 7:15 AM, David Cournapeau wrote: > Since openmp also exists on windows, I doubt that it is required that > openmp uses pthread :) On Windows, MSVC uses Win32 threads and GCC (Cygwin and MinGW) uses pthreads. If you use OpenMP with MinGW, the executable becomes dependent on pthreadG

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread David Cournapeau
Gregor Thalhammer wrote: > Recent Matlab versions use Intel's Math Kernel Library, which performs > automatic multi-threading - also for mathematical functions like sin > etc, but not for addition, multiplication etc. It does if you have access to the parallel toolbox I mentioned earlier in this

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Dag Sverre Seljebotn
Francesc Alted wrote: > On Thursday 12 February 2009, Dag Sverre Seljebotn wrote: > >> A quick digression: >> >> It would be interesting to see how a spec would look for integrating >> OpenMP natively into Cython for these kinds of purposes. Cython is >> still flexible as a language after all.

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Francesc Alted
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote: > A quick digression: > > It would be interesting to see how a spec would look for integrating > OpenMP natively into Cython for these kinds of purposes. Cython is > still flexible as a language after all. That would be really nice indeed

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Gregor Thalhammer
Brian Granger wrote: >> I am curious: would you know what would be different in numpy's case >> compared to matlab array model concerning locks ? Matlab, up to >> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3 >> (or 7.4), it also uses multicore for mathematical functions

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Dag Sverre Seljebotn
Brian Granger wrote: > Hi, > > This is relevant for anyone who would like to speed up array based > codes using threads. > > I have a simple loop that I have implemented using Cython: > > def backstep(np.ndarray opti, np.ndarray optf, > int istart, int iend, double p, double q): >

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Francesc Alted
Hi Brian, On Thursday 12 February 2009, Brian Granger wrote: > Hi, > > This is relevant for anyone who would like to speed up array based > codes using threads. > > I have a simple loop that I have implemented using Cython: > > def backstep(np.ndarray opti, np.ndarray optf, > int is

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-12 Thread Gael Varoquaux
On Thu, Feb 12, 2009 at 12:42:37AM -0600, Robert Kern wrote: > It is implemented using threads, with Windows native threads on > Windows. I think Gaël really just meant "threads" there. I guess so :). Once you reformulate my remark in proper terms, this is indeed what comes out. I guess all what

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread David Cournapeau
Brian Granger wrote: >> I am curious: would you know what would be different in numpy's case >> compared to matlab array model concerning locks ? Matlab, up to >> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3 >> (or 7.4), it also uses multicore for mathematical functions (

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Brian Granger
>> Good point. Is it possible to tell what array size it switches over >> to using multiple threads? > > Yes. > > http://svn.scipy.org/svn/numpy/branches/multicore/numpy/core/threadapi.py Sorry, I was curious about what Matlab does in this respect. But, this is very useful and I will look at it.

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Robert Kern
On Thu, Feb 12, 2009 at 00:52, Brian Granger wrote: >> I am curious: would you know what would be different in numpy's case >> compared to matlab array model concerning locks ? Matlab, up to >> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3 >> (or 7.4), it also uses multic

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Brian Granger
> I am curious: would you know what would be different in numpy's case > compared to matlab array model concerning locks ? Matlab, up to > recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3 > (or 7.4), it also uses multicore for mathematical functions (cos, > etc...). So at lea

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Robert Kern
On Thu, Feb 12, 2009 at 00:15, David Cournapeau wrote: > Gael Varoquaux wrote: >> From a programmer's perspective, because, IMHO, openmp is implemented >> using pthreads. > > Since openmp also exists on windows, I doubt that it is required that > openmp uses pthread :) It is implemented using th

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread David Cournapeau
Gael Varoquaux wrote: > From a programmer's perspective, because, IMHO, openmp is implemented > using pthreads. Since openmp also exists on windows, I doubt that it is required that openmp uses pthread :) On linux, with gcc, using -fopenmp implies -pthread, so I guess it uses pthread (can you b

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread David Cournapeau
Robert Kern wrote: > > Eric Jones tried to do this with pthreads in C some time ago. His work is > here: > > http://svn.scipy.org/svn/numpy/branches/multicore/ > > The lock overhead makes it usually not worthwhile. > I am curious: would you know what would be different in numpy's case compar

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Gael Varoquaux
On Wed, Feb 11, 2009 at 11:52:40PM -0600, Robert Kern wrote: > > These seem like pretty heavy solutions though. > From a programmer's perspective, it seems to me like OpenMP is a much > lighter weight solution than pthreads. From a programmer's perspective, because, IMHO, openmp is implemented

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Robert Kern
On Thu, Feb 12, 2009 at 00:03, Brian Granger wrote: >> Eric Jones tried to do this with pthreads in C some time ago. His work is >> here: >> >> http://svn.scipy.org/svn/numpy/branches/multicore/ >> >> The lock overhead makes it usually not worthwhile. > > I was under the impression that Eric's i

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Brian Granger
> Eric Jones tried to do this with pthreads in C some time ago. His work is > here: > > http://svn.scipy.org/svn/numpy/branches/multicore/ > > The lock overhead makes it usually not worthwhile. I was under the impression that Eric's implementation didn't use a thread pool. Thus I thought the bo
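
The thread-pool idea raised here can be sketched with Python's standard library. This is only an illustration under stated assumptions, not Eric Jones's implementation: `chunk_bounds`, `scale_chunk`, `threaded_scale`, and the `min_size` cutoff of 1000 are all hypothetical names and values chosen for the example. It shows two points from the discussion: the pool is created once and reused so thread start-up cost is not paid per call, and small inputs stay serial because dispatch overhead would dominate. In pure CPython the GIL prevents any real speedup for this loop; the structure only pays off when the inner loop releases the GIL (for example in C or Cython).

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, nchunks):
    """Split range(n) into at most nchunks contiguous (start, stop) pairs."""
    step = max(1, -(-n // nchunks))  # ceiling division, at least 1
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def scale_chunk(out, data, factor, lo, hi):
    """Stand-in for the inner loop; writes only to out[lo:hi]."""
    for j in range(lo, hi):
        out[j] = factor * data[j]


def threaded_scale(pool, data, factor, nthreads=4, min_size=1000):
    out = [0.0] * len(data)
    if len(data) < min_size:
        # Small input: dispatch overhead would dominate, stay serial.
        scale_chunk(out, data, factor, 0, len(data))
        return out
    futures = [pool.submit(scale_chunk, out, data, factor, lo, hi)
               for lo, hi in chunk_bounds(len(data), nthreads)]
    for fut in futures:
        fut.result()  # wait, and re-raise any worker exception
    return out


# The pool is created once and reused across calls.
with ThreadPoolExecutor(max_workers=4) as pool:
    small = threaded_scale(pool, [1.0, 2.0, 3.0], 2.0)
    large = threaded_scale(pool, [float(j) for j in range(5000)], 0.5)
```

Each worker writes to a disjoint slice of `out`, so no lock is needed around the stores; the only synchronization is waiting on the futures at the end.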

Re: [Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Robert Kern
On Wed, Feb 11, 2009 at 23:46, Brian Granger wrote: > Hi, > > This is relevant for anyone who would like to speed up array based > codes using threads. > > I have a simple loop that I have implemented using Cython: > > def backstep(np.ndarray opti, np.ndarray optf, > int istart, int ie

[Numpy-discussion] Fast threading solution thoughts

2009-02-11 Thread Brian Granger
Hi, This is relevant for anyone who would like to speed up array based codes using threads. I have a simple loop that I have implemented using Cython: def backstep(np.ndarray opti, np.ndarray optf, int istart, int iend, double p, double q): cdef int j cdef double *pi cde