[PyOpenCL] clmath with pre-allocated arrays.

Henry Gomersall Tue, 11 Aug 2015 04:38:50 -0700

I've noticed that using e.g. clmath._atan2(out, in1, in2, queue) with a pre-allocated `out` array is nearly twice as fast as using clmath.atan2(in1, in2, queue), even when a memory pool is used to allocate the Array.


Consider the (simple) code here:
https://gist.github.com/hgomersall/d7a229df0f816388b63f

It defines the two test cases above inside functions which can be run inside IPython as follows:


In [1]: from clmath_test import *

In [2]: timeit cl_test()
1000 loops, best of 3: 639 µs per loop

In [3]: timeit cl_test_preallocated()
1000 loops, best of 3: 363 µs per loop
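
For anyone reproducing these numbers outside IPython, here is a minimal sketch of an equivalent "best of 3" harness built on the standard timeit module. The names best_per_loop and sync are made up for illustration and are not part of PyOpenCL. The sketch assumes the timed functions take no arguments, as cl_test() and cl_test_preallocated() do in the session above. The optional sync hook is there because OpenCL kernel launches are asynchronous; passing something like queue.finish (an assumption about how the gist exposes its queue) makes each timed call wait for the device.

```python
import timeit


def best_per_loop(fn, number=1000, repeat=3, sync=None):
    """Return the best per-call time in seconds.

    Runs `fn` `number` times per trial, for `repeat` trials, and
    reports the fastest trial divided by `number`.
    """
    def run():
        fn()
        if sync is not None:
            # Hypothetical hook, e.g. queue.finish, so that work still
            # pending on the device is included in the measurement.
            sync()

    totals = timeit.repeat(run, number=number, repeat=repeat)
    return min(totals) / number


if __name__ == "__main__":
    # Stand-in workload; replace with cl_test / cl_test_preallocated.
    t = best_per_loop(lambda: sum(range(100)), number=1000, repeat=3)
    print("best of 3: %.1f µs per loop" % (t * 1e6))
```

Comparing the two test functions with and without the sync hook may help show whether the gap comes from host-side allocation overhead or from the device work itself.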

Am I missing something here or is this expected behaviour?

Is _atan2 part of the stable API?

(This was on an NVIDIA machine. On my Intel laptop, I seem to run into this bug:

https://bugs.launchpad.net/ubuntu/+source/pyopencl/+bug/1354086)

Cheers,

Henry

