Henry Gomersall <[email protected]> writes:
> I've noticed that using e.g. clmath._atan2(out, in1, in2, queue) with a 
> pre-allocated `out` array is nearly twice as fast as using 
> clmath.atan2(in1, in2, queue), even when a memory pool is used to 
> allocate the Array.

Oops. Thanks for reporting this.

It turns out that the Python reimplementation of the memory pool was
pretty broken--it didn't actually do much. With the fixed version now in
git, things look considerably better. In particular, when I try your
test code (on the AMD CPU implementation), the time difference between
the explicit out argument and the mempool version is now only about 10%.
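
For anyone wondering why a pool that "didn't actually do much" costs so
much time, here is a rough, self-contained sketch of what a working pool
is supposed to do. It is not PyOpenCL's implementation: the ToyPool
class, its method names, and the power-of-two bin rounding are all
assumptions made up for illustration, and a bytearray stands in for a
device buffer. The point is only that freed blocks get handed back out
instead of hitting the underlying allocator every time.

```python
from collections import defaultdict


class ToyPool:
    """Hypothetical size-binned pool; NOT PyOpenCL's MemoryPool."""

    def __init__(self):
        self._free = defaultdict(list)  # bin size -> idle buffers
        self.fresh_allocations = 0      # calls that reached the "real" allocator

    @staticmethod
    def _bin_size(nbytes):
        # Assumption: round up to the next power of two so that
        # similarly sized requests land in the same bin.
        return 1 << max(nbytes - 1, 0).bit_length()

    def allocate(self, nbytes):
        size = self._bin_size(nbytes)
        if self._free[size]:
            # Reuse an idle buffer of the right bin: the cheap path.
            return self._free[size].pop()
        # Nothing to reuse: fall back to a fresh allocation.
        self.fresh_allocations += 1
        return bytearray(size)

    def free(self, buf):
        # Keep the buffer around for the next request of this bin size.
        self._free[len(buf)].append(buf)


pool = ToyPool()
for _ in range(100):
    tmp = pool.allocate(4000)  # stands in for the result array of atan2
    pool.free(tmp)             # result goes out of scope

print(pool.fresh_allocations)
```

With reuse working, the loop above reaches the underlying allocator once
rather than 100 times. A pool whose reuse path never triggers pays for a
fresh allocation on every call, which would be consistent with the gap
you measured against a pre-allocated `out` array.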

> Is _atan2 part of the stable API?

Nope.

Andreas


_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
