Henry Gomersall <[email protected]> writes:
> I've noticed that using e.g. clmath._atan2(out, in1, in2, queue) with a
> pre-allocated `out` array is nearly twice as fast as using
> clmath.atan2(in1, in2, queue), even when a memory pool is used to
> allocate the Array.

Oops. Thanks for reporting this. It turns out that the Python
reimplementation of the memory pool was pretty broken--it didn't
actually do much. With the fixed version now in git, things look
considerably better. In particular, when I try your test code (on the
AMD CPU implementation), the time difference between the explicit out
argument and the mempool version is now only about 10%.

> Is _atan2 part of the stable API?

Nope.

Andreas
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
