Hi,

rahul garg wrote:
> I was thinking of providing a "prange" which defaults to xrange when running
> on interpreter.
> The reason I like the prange construct, is that we can easily add lets say
> thread-local variables or reduction variables.
> for i in prange(i, threadlocal=[myvar1,myvar2],reduction=[red1,red2]):
>     # loop body

You could easily do that with a

    with thread_each(iterable, threadlocal=...):
        ...

syntax, too, and IMHO it looks much better (minus a better name for
"thread_each" ;)

And it might even be possible to support this in plain Python one day.


>>> From an implementation perspective, it will be a little challenging to
>>> generate code while avoiding the GIL but it can be done in simple cases.
>>
>> It may even be enough to require "nogil" for parallel execution in the
>> beginning. I expect the GIL overhead to be too high for simple loops and I
>> don't think you'd write bigger things as a loop. When I use threads in
>> Python,
>> I'm usually quite considerate about the minimum amount of work that gets
>> parallelised.
>>
> 
> Can you clarify this? Or give an example?

I was just thinking of a couple of places in lxml. When we do a whole parser
run in C code, it's reasonable to free the GIL. However, when we do loads of
callbacks into Python SAX code, the overhead of releasing and acquiring the
GIL is just so tremendously huge compared to the short GIL-free parser steps
that it becomes much slower regardless of the number of threads.

So if you map the whole thing to threads running in a single interpreter and
do a lot of Python interaction in each thread, you will likely lose a lot of
performance. It's only really worth it if you do not need the GIL at all and
can release it before starting up the parallel code.
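
A rough sketch of the granularity point, in plain Python. The helper name
process_in_chunks and its parameters are made up for illustration and are not
part of lxml or Cython. Each thread gets one large chunk of the input, so the
thread start-up and hand-over costs are paid once per chunk rather than once
per item. Note that if work() is pure Python, the threads still take turns
holding the GIL; this layout only pays off when work() spends its time in code
that releases the GIL.

```python
import threading


def process_in_chunks(items, work, n_threads=4):
    """Apply work() to every item, one thread per contiguous chunk.

    Hypothetical helper for illustration only.
    """
    results = [None] * len(items)
    # Ceiling division, so that at most n_threads chunks are created.
    chunk = max(1, (len(items) + n_threads - 1) // n_threads)

    def run(start):
        # One thread processes a whole chunk; each thread writes only
        # to its own slice of results, so no locking is needed here.
        for i in range(start, min(start + chunk, len(items))):
            results[i] = work(items[i])

    threads = [threading.Thread(target=run, args=(start,))
               for start in range(0, len(items), chunk)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


squares = process_in_chunks(list(range(10)), lambda x: x * x)
```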

Gary's comment on the new multiprocessing module in Py3 shows a nice
alternative, but starting a new process isn't really for free either (even if
a fork is pretty fast on most platforms except those from a well-known
commercial OS vendor).

I think there are basically two use cases for such a feature:

1) syntactic sugar for starting up threads or processes and letting them do
the same work (possibly calling a larger function in the parallel block)

2) somewhat short Python-free code snippets that can run in parallel without
the GIL.

Both should be handled differently by the compiler, possibly distinguished
automatically depending on whether the GIL is held. So I would say that 1)
comprises the single provider, multiple consumers case that I mentioned earlier.
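
Case 1) can be approximated in today's Python with an ordinary function. The
following is only a sketch under assumptions: the name thread_each is borrowed
from the placeholder above, the signature is invented, and the loop body is
passed as a callable because the body of a plain "with" block runs only once.
One shared cursor acts as the single provider, and the worker threads are the
consumers.

```python
import threading


def thread_each(iterable, body, n_threads=4):
    """Run body(item) for every item, spread over worker threads.

    Hypothetical helper; not an existing Cython or stdlib API.
    """
    items = list(iterable)
    lock = threading.Lock()
    position = [0]  # shared cursor into items, guarded by lock

    def worker():
        while True:
            # Take the next item under the lock (single provider).
            with lock:
                if position[0] >= len(items):
                    return
                item = items[position[0]]
                position[0] += 1
            # Run the loop body outside the lock (multiple consumers).
            body(item)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


collected = []
thread_each(range(5), collected.append)
```

The order in which items land in collected depends on thread scheduling, so
only the set of results is predictable, not their sequence.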

Stefan
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev