> Sturla Molden wrote:
> IMO there's a problem with using literal variable names here, because
> Python syntax implies that the value is passed. One shouldn't make
> syntax where private=(i,) is legal but private=(f(),) isn't.
The latter would be illegal in OpenMP as well. OpenMP pragmas only tak
> You know, I thought of the exact same thing when reading your post. No,
> you need the GIL currently, but that's something I'd like to fix.
>
> Ideally, it would be something like this:
>
> cdef int i, s = 0, n = ...
> cdef np.ndarray[int] arr = ... # will require the GIL
> with nogil:
>for i
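The Cython fragment quoted above is cut off in this archive, so the loop body is unknown. Assuming it accumulates arr[i] into s, the kind of C loop that OpenMP can parallelize might look roughly like the sketch below. The function name sum_ints is hypothetical, and this shows a standard OpenMP reduction, not the code Cython generates.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): sums n ints.
 * The reduction clause gives each thread a private copy of s
 * and adds the copies together when the loop ends. */
long sum_ints(const int *arr, size_t n)
{
    long s = 0;
    long i;  /* a signed loop index works with every OpenMP version */

    #pragma omp parallel for reduction(+:s)
    for (i = 0; i < (long)n; i++) {
        s += arr[i];
    }
    return s;
}
```

Built with an OpenMP flag such as gcc's -fopenmp, the loop is split across threads. Without the flag the pragma is ignored and the loop runs serially with the same result.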
Brian Granger wrote:
> And a question:
>
> With the new Numpy support in Cython, does Cython release the GIL if
> it can when running through loops over numpy arrays? Does
> Cython call into the C API during these sections?
You know, I thought of the exact same thing when reading your pos
Wow, interesting thread. Thanks everyone for the ideas. A few more comments:
GPUs/CUDA:
* Even though there is a bottleneck between main memory and GPU
memory, as Nathan mentioned, the much larger memory bandwidth on a GPU
often makes GPUs great for memory bound computations...as long as you
ca
> At any rate, I really like the OpenMP approach and prefer to have
> support for it in Cython much better than threading, MPI or whatever.
> But the thing is: is OpenMP stable and mature enough to allow using it on
> most common platforms? I think that recent GCC compilers support
> the latest i
> Recent Matlab versions use Intel's Math Kernel Library, which performs
> automatic multi-threading - also for mathematical functions like sin
> etc, but not for addition, multiplication etc. It seems to me Matlab
> itself does not take care of multi-threading. On
> http://www.intel.com/software/p
> If your problem is evaluating vector expressions just like the above
> (i.e. without using transcendental functions like sin, exp, etc...),
> usually the bottleneck is on memory access, so using several threads is
> simply not going to help you achieve better performance, but rather
> the contr
Dag Sverre Seljebotn wrote:
> Hmm... yes. Care would need to be taken though because Cython might in
> the future very well generate a "while" loop instead for such a
> statement under some circumstances, and that won't work with OpenMP. One
> should be careful with assuming what the C result wi
Sturla Molden wrote:
> On 2/12/2009 12:34 PM, Dag Sverre Seljebotn wrote:
>
>> FYI, I am one of the core Cython developers and can make such
>> modifications in Cython itself as long as there's consensus on how it
>> should look on the Cython mailing list. My problem is that I don't
>> really
On 2/12/2009 5:24 PM, Gael Varoquaux wrote:
> My two cents: go for cython objects/statements. Not only does code in
> comments look weird and like a hack, but it also means you have to hack
> the parser.
I agree with this. Particularly because Cython uses indentation as
syntax. With comments you
On Thu, Feb 12, 2009 at 03:27:51PM +0100, Sturla Molden wrote:
> The question is: Should OpenMP be comments in the Cython code (as they
> are in C and Fortran), or should OpenMP be special objects?
My two cents: go for cython objects/statements. Not only does code in
comments look weird and a ha
Nathan Bell wrote:
> On Thu, Feb 12, 2009 at 8:19 AM, Michael Abshoff
> wrote:
Hi,
>> Not even close. The current generation peaks at around 1.2 TFlops single
>> precision, 280 GFlops double precision for ATI's hardware. The main
>> problem with those numbers is that the memory on the graphics ca
On 2/12/2009 4:03 PM, Matthieu Brucher wrote:
> In C89, you will have absolutely no benefit (because there
> is no way you can tell the compiler that there is no aliasing), in
> Fortran, it will be optimized correctly.
In ANSI C (aka C89) the effect is achieved using compiler pragmas.
In ISO C
On Thu, Feb 12, 2009 at 8:19 AM, Michael Abshoff
wrote:
>
> Not even close. The current generation peaks at around 1.2 TFlops single
> precision, 280 GFlops double precision for ATI's hardware. The main
> problem with those numbers is that the memory on the graphics card
> cannot feed the data fast
2009/2/12 David Cournapeau :
> Matthieu Brucher wrote:
>>
>> Sorry, I was referring to my last mail, but I sent so many in 5 minutes ;)
>> In C, if you have two arrays (two pointers), the compiler can't make
>> aggressive optimizations because they may intersect. With Fortran,
>> this is not possible.
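The aliasing point quoted above can be illustrated in C99, which added the restrict qualifier for exactly this purpose. The two function names below are hypothetical and the sketch only shows the declaration difference. Whether a given compiler actually vectorizes the second loop depends on the compiler and its flags.

```c
#include <stddef.h>

/* The compiler must assume that out may overlap a or b, so it has to
 * be conservative about reordering the loads and stores. */
void add_may_alias(double *out, const double *a, const double *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* restrict is a promise from the caller that the three arrays do not
 * overlap, which is what Fortran assumes for its array arguments.
 * Calling this with overlapping arrays is undefined behavior. */
void add_no_alias(double *restrict out, const double *restrict a,
                  const double *restrict b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Both functions give the same result for non-overlapping inputs. The difference is only in what the optimizer is allowed to assume.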
2009/2/12 David Cournapeau :
> Matthieu Brucher wrote:
>>> No - I have never seen a deep explanation of the matlab model. The C api
>>> is so small that it is hard to deduce anything from it (except that the
>>> memory handling is not ref-counting-based, I don't know if it matters
>>> for our discuss
David Cournapeau wrote:
> Matthieu Brucher wrote:
>> For BLAS level 3, the MKL is parallelized (so matrix multiplication is).
>>
Hi David,
> Same for ATLAS: thread support is one focus in the 3.9 series, currently
> in development.
ATLAS has had thread support for a long, long time. The 3.9 s
On 2/12/2009 1:44 PM, Sturla Molden wrote:
Here is an example of SciPy's ckdtree.pyx modified to use OpenMP.
It seems I managed to post an erroneous C file. :(
S.M.
/*
* Parallel query for faster kd-tree searches on SMP computers.
* This function will relea
Matthieu Brucher wrote:
>
> Sorry, I was referring to my last mail, but I sent so many in 5 minutes ;)
> In C, if you have two arrays (two pointers), the compiler can't make
> aggressive optimizations because they may intersect. With Fortran,
> this is not possible. In this matter, Numpy behaves like
On 2/12/2009 12:34 PM, Dag Sverre Seljebotn wrote:
> FYI, I am one of the core Cython developers and can make such
> modifications in Cython itself as long as there's consensus on how it
> should look on the Cython mailing list. My problem is that I don't
> really know OpenMP and have little e
Matthieu Brucher wrote:
>
> For BLAS level 3, the MKL is parallelized (so matrix multiplication is).
>
Same for ATLAS: thread support is one focus in the 3.9 series, currently
in development. I have never used it, I don't know how it compares to the
MKL,
David
Matthieu Brucher wrote:
>> No - I have never seen a deep explanation of the matlab model. The C api
>> is so small that it is hard to deduce anything from it (except that the
>> memory handling is not ref-counting-based, I don't know if it matters
>> for our discussion of speeding up ufunc). I would
2009/2/12 Gregor Thalhammer :
> Brian Granger wrote:
>>> I am curious: would you know what would be different in numpy's case
>>> compared to matlab array model concerning locks ? Matlab, up to
>>> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
>>> (or 7.4), it also uses
2009/2/12 Sturla Molden :
> On 2/12/2009 1:50 PM, Francesc Alted wrote:
>
>> Hey! That's very nice to know. We already have OpenMP support in
>> Cython for free (or apparently it seems so :-)
>
> No we don't, as variable names are different in C and Cython. But
> adding support for OpenMP would
> No - I have never seen a deep explanation of the matlab model. The C api
> is so small that it is hard to deduce anything from it (except that the
> memory handling is not ref-counting-based, I don't know if it matters
> for our discussion of speeding up ufunc). I would guess that since two
> array
Yes, it is. You have to link against pthread (at least with Linux ;))
You have to write a single parallel region if you don't want this
overhead (which is not possible with Python).
Matthieu
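The "single parallel region" advice above can be sketched in C as follows. The function name scale_then_shift and the two loops are hypothetical and chosen only for illustration. The idea is that the thread team is started once and both loops share it, instead of each loop opening its own parallel region.

```c
#include <stddef.h>

/* Hypothetical example: multiply every element by a, then add b. */
void scale_then_shift(double *x, long n, double a, double b)
{
    long i;

    /* One parallel region: threads are started (or woken) once. */
    #pragma omp parallel private(i)
    {
        /* First work-sharing loop; it ends with an implicit barrier. */
        #pragma omp for
        for (i = 0; i < n; i++)
            x[i] *= a;

        /* Second loop reuses the same team of threads. */
        #pragma omp for
        for (i = 0; i < n; i++)
            x[i] += b;
    }
}
```

Compiled without an OpenMP flag, the pragmas are ignored and the function simply runs both loops serially.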
2009/2/12 Gael Varoquaux :
> On Wed, Feb 11, 2009 at 11:52:40PM -0600, Robert Kern wrote:
>> > This seem
On Thursday 12 February 2009, Sturla Molden wrote:
> On 2/12/2009 1:50 PM, Francesc Alted wrote:
> > Hey! That's very nice to know. We already have OpenMP support in
> > Cython for free (or apparently it seems so :-)
>
> No we don't, as variable names are different in C and Cython. But
> addin
> I am curious: would you know what would be different in numpy's case
> compared to matlab array model concerning locks ? Matlab, up to
> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
> (or 7.4), it also uses multicore for mathematical functions (cos,
> etc...). So at lea
Sturla Molden wrote:
> On 2/12/2009 1:50 PM, Francesc Alted wrote:
>
>
>> Hey! That's very nice to know. We already have OpenMP support in
>> Cython for free (or apparently it seems so :-)
>>
>
> No we don't, as variable names are different in C and Cython. But
> adding support for Ope
Sturla Molden wrote:
> On 2/12/2009 12:20 PM, David Cournapeau wrote:
Hi,
>> It does if you have access to the parallel toolbox I mentioned earlier
>> in this thread (again, no experience with it, but I think it is
>> specially popular on clusters; in that case, though, it is not limited
>> to th
Francesc Alted wrote:
> I don't know OpenMP well enough either, but I'd say that on this list there
> could be some people that could help.
>
> At any rate, I really like the OpenMP approach and prefer to have
> support for it in Cython much better than threading, MPI or whatever.
> But the thing i
On 2/12/2009 1:50 PM, Francesc Alted wrote:
> Hey! That's very nice to know. We already have OpenMP support in
> Cython for free (or apparently it seems so :-)
No we don't, as variable names are different in C and Cython. But
adding support for OpenMP would not bloat the Cython language.
Cy
Sturla Molden wrote:
> On 2/12/2009 12:20 PM, David Cournapeau wrote:
>
>
>> It does if you have access to the parallel toolbox I mentioned earlier
>> in this thread (again, no experience with it, but I think it is
>> specially popular on clusters; in that case, though, it is not limited
>> to t
On 2/12/2009 12:20 PM, David Cournapeau wrote:
> It does if you have access to the parallel toolbox I mentioned earlier
> in this thread (again, no experience with it, but I think it is
> specially popular on clusters; in that case, though, it is not limited
> to thread-based implementation).
As
On Thursday 12 February 2009, Sturla Molden wrote:
> OpenMP does not need to be a part of the Cython language. It can be
> special comments in the code as in Fortran. After all, "#pragma omp
> parallel" is a comment in Cython.
Hey! That's very nice to know. We already have OpenMP support in
C
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote:
> FYI, I am one of the core Cython developers and can make such
> modifications in Cython itself as long as there's consensus on how it
> should look on the Cython mailing list. My problem is that I don't
> really know OpenMP and have lit
On 2/12/2009 11:30 AM, Dag Sverre Seljebotn wrote:
It would be interesting to see how a spec would look for integrating
OpenMP natively into Cython for these kinds of purposes. Cython is still
flexible as a language after all. Avoiding language bloat is also
important, but it is difficult to k
On 2/12/2009 7:15 AM, David Cournapeau wrote:
> Since openmp also exists on windows, I doubt that it is required that
> openmp uses pthread :)
On Windows, MSVC uses Win32 threads and GCC (Cygwin and MinGW) uses
pthreads. If you use OpenMP with MinGW, the executable becomes dependent
on pthreadG
Gregor Thalhammer wrote:
> Recent Matlab versions use Intel's Math Kernel Library, which performs
> automatic multi-threading - also for mathematical functions like sin
> etc, but not for addition, multiplication etc.
It does if you have access to the parallel toolbox I mentioned earlier
in this
Francesc Alted wrote:
> On Thursday 12 February 2009, Dag Sverre Seljebotn wrote:
>
>> A quick digression:
>>
>> It would be interesting to see how a spec would look for integrating
>> OpenMP natively into Cython for these kinds of purposes. Cython is
>> still flexible as a language after all.
On Thursday 12 February 2009, Dag Sverre Seljebotn wrote:
> A quick digression:
>
> It would be interesting to see how a spec would look for integrating
> OpenMP natively into Cython for these kinds of purposes. Cython is
> still flexible as a language after all.
That would be really nice indeed
Brian Granger wrote:
>> I am curious: would you know what would be different in numpy's case
>> compared to matlab array model concerning locks ? Matlab, up to
>> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
>> (or 7.4), it also uses multicore for mathematical functions
Brian Granger wrote:
> Hi,
>
> This is relevant for anyone who would like to speed up array based
> codes using threads.
>
> I have a simple loop that I have implemented using Cython:
>
> def backstep(np.ndarray opti, np.ndarray optf,
> int istart, int iend, double p, double q):
>
Hi Brian,
On Thursday 12 February 2009, Brian Granger wrote:
> Hi,
>
> This is relevant for anyone who would like to speed up array based
> codes using threads.
>
> I have a simple loop that I have implemented using Cython:
>
> def backstep(np.ndarray opti, np.ndarray optf,
> int is
On Thu, Feb 12, 2009 at 12:42:37AM -0600, Robert Kern wrote:
> It is implemented using threads, with Windows native threads on
> Windows. I think Gaël really just meant "threads" there.
I guess so :). Once you reformulate my remark in proper terms, this is
indeed what comes out.
I guess all what
Brian Granger wrote:
>> I am curious: would you know what would be different in numpy's case
>> compared to matlab array model concerning locks ? Matlab, up to
>> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
>> (or 7.4), it also uses multicore for mathematical functions (
>> Good point. Is it possible to tell what array size it switches over
>> to using multiple threads?
>
> Yes.
>
> http://svn.scipy.org/svn/numpy/branches/multicore/numpy/core/threadapi.py
Sorry, I was curious about what Matlab does in this respect. But,
this is very useful and I will look at it.
On Thu, Feb 12, 2009 at 00:52, Brian Granger wrote:
>> I am curious: would you know what would be different in numpy's case
>> compared to matlab array model concerning locks ? Matlab, up to
>> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
>> (or 7.4), it also uses multic
> I am curious: would you know what would be different in numpy's case
> compared to matlab array model concerning locks ? Matlab, up to
> recently, only spreads BLAS/LAPACK on multi-cores, but since matlab 7.3
> (or 7.4), it also uses multicore for mathematical functions (cos,
> etc...). So at lea
On Thu, Feb 12, 2009 at 00:15, David Cournapeau
wrote:
> Gael Varoquaux wrote:
>> From a programmer's perspective, because, IMHO, openmp is implemented
>> using pthreads.
>
> Since openmp also exists on windows, I doubt that it is required that
> openmp uses pthread :)
It is implemented using th
Gael Varoquaux wrote:
> From a programmer's perspective, because, IMHO, openmp is implemented
> using pthreads.
Since openmp also exists on windows, I doubt that it is required that
openmp uses pthread :)
On linux, with gcc, using -fopenmp implies -pthread, so I guess it uses
pthread (can you b
Robert Kern wrote:
>
> Eric Jones tried to do this with pthreads in C some time ago. His work is
> here:
>
> http://svn.scipy.org/svn/numpy/branches/multicore/
>
> The lock overhead makes it usually not worthwhile.
>
I am curious: would you know what would be different in numpy's case
compar
On Wed, Feb 11, 2009 at 11:52:40PM -0600, Robert Kern wrote:
> > This seem like pretty heavy solutions though.
> > From a programmer's perspective, it seems to me like OpenMP is a much
> > lighter weight solution than pthreads.
From a programmer's perspective, because, IMHO, openmp is implemented
On Thu, Feb 12, 2009 at 00:03, Brian Granger wrote:
>> Eric Jones tried to do this with pthreads in C some time ago. His work is
>> here:
>>
>> http://svn.scipy.org/svn/numpy/branches/multicore/
>>
>> The lock overhead makes it usually not worthwhile.
>
> I was under the impression that Eric's i
> Eric Jones tried to do this with pthreads in C some time ago. His work is
> here:
>
> http://svn.scipy.org/svn/numpy/branches/multicore/
>
> The lock overhead makes it usually not worthwhile.
I was under the impression that Eric's implementation didn't use a
thread pool. Thus I thought the bo
On Wed, Feb 11, 2009 at 23:46, Brian Granger wrote:
> Hi,
>
> This is relevant for anyone who would like to speed up array based
> codes using threads.
>
> I have a simple loop that I have implemented using Cython:
>
> def backstep(np.ndarray opti, np.ndarray optf,
> int istart, int ie
Hi,
This is relevant for anyone who would like to speed up array based
codes using threads.
I have a simple loop that I have implemented using Cython:
def backstep(np.ndarray opti, np.ndarray optf,
int istart, int iend, double p, double q):
cdef int j
cdef double *pi
cde