Hi,

In case you have not heard: I'm currently working on the PPC and S390X
ports for micronumpy. Thanks to IBM for funding this work.

I'm about 50% through implementing the PPC operations. The goal is to
turn this optimization on by default in the micronumpy module.

I recently had the idea of enhancing the JIT driver by giving it more
information about parallel execution. I'm *not* talking about the main
interpreter loop. A vectorized loop that executes in parallel across
threads would certainly improve micronumpy performance.

Has anybody already tried something similar? I think it is a challenge,
but it should be possible (with a reasonable amount of work) to get a
simple thread fork/join model like the one OpenMP provides.
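
To make the fork/join idea concrete, here is a rough application-level
sketch in plain Python. It is only an illustration of the model, not how
the JIT driver would do it: parallel_for and add_arrays are hypothetical
names, the chunking scheme is an assumption, and on an interpreter with a
global interpreter lock the threads will not actually run the loop body
simultaneously. The point is the shape: split the iteration space into
contiguous chunks (fork), run the same loop body on each chunk, and wait
for all of them (join).

```python
import threading


def parallel_for(body, n, num_threads=4):
    """Run body(start, stop) over range(n), split into contiguous chunks.

    Hypothetical helper: roughly the shape of an OpenMP-style
    "parallel for" with a static schedule.
    """
    # Ceiling division so that every index is covered by some chunk.
    chunk = (n + num_threads - 1) // num_threads
    threads = []
    # Fork: start one thread per non-empty chunk.
    for t in range(num_threads):
        start = t * chunk
        stop = min(start + chunk, n)
        if start >= stop:
            break
        th = threading.Thread(target=body, args=(start, stop))
        threads.append(th)
        th.start()
    # Join: wait until every chunk has finished.
    for th in threads:
        th.join()


def add_arrays(a, b, out, num_threads=4):
    """Element-wise out[i] = a[i] + b[i], one chunk per thread."""
    def body(start, stop):
        # Each thread writes a disjoint slice of out, so no locking is needed.
        for i in range(start, stop):
            out[i] = a[i] + b[i]
    parallel_for(body, len(out), num_threads)


a = list(range(10))
b = [10 * x for x in a]
out = [0] * len(a)
add_arrays(a, b, out, num_threads=3)
```

In the version I have in mind, the per-chunk body would be the vectorized
trace rather than a Python function, and the extra information given to
the JIT driver would tell it which loops are safe to split this way.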

Cheers,
Richard

