Hi,
I am trying to track down an issue where a local memory parameter appears
to vanish the first time I run a kernel but works correctly on every
subsequent invocation. This is on my NVIDIA GeForce GT 555M.
While investigating, I switched to my Intel CPU and came across even more
unusual behaviour.
With the following kernel:
__kernel void localmemskip(__global uint *a_g, __global uint *b_g,
                           __local uint *c_l) {
    int i = get_global_id(0);
    c_l[i] = a_g[i];
    b_g[i] = 0;
    if (i > 0)
        b_g[i] = c_l[i-1];
}
I invoke the following code:
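import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
# localmemskip is the kernel object from the built program (build not shown)

a = np.array([1,2,3,4,5,6,7,8,9,10], dtype=np.uint32)
b = np.zeros(10, dtype=np.uint32)
mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_g = cl.Buffer(ctx, mf.WRITE_ONLY, 4*len(b))
# Local size is None, so the runtime chooses the work-group size
localmemskip(queue, b.shape, None, a_g, b_g, cl.LocalMemory(4*len(b)))
cl.enqueue_copy(queue, b, b_g)
print(b)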
Then the output for each platform is:
[0 1 2 3 4 5 6 7 8 9] # <pyopencl.Platform 'NVIDIA CUDA' >
[0 1 2 0 4 0 6 242802248 0 361062976] # <pyopencl.Platform 'Intel(R) OpenCL' >
What's more, on each platform the first invocation can produce a different
array, and the output only settles down to the above on later invocations.
I suspect it has something to do with how the local memory is
initialised.
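To make that suspicion concrete, here is a minimal pure-Python sketch (not
OpenCL) of one thing that might be going on. It assumes that local memory
is private to each work-group and that, because I pass None as the local
size, the runtime is free to split the 10 work-items into several groups.
The function name simulate_localmemskip, the UNINIT sentinel and the group
sizes are illustrative choices of mine, and the model ignores any ordering
problems between work-items inside a group.

```python
UNINIT = -1  # sentinel standing in for uninitialised local memory (assumption)

def simulate_localmemskip(a, group_size):
    """Model the kernel with one private local array per work-group."""
    n = len(a)
    b = [0] * n
    for start in range(0, n, group_size):
        items = range(start, min(start + group_size, n))
        c_l = [UNINIT] * n          # fresh local memory for this group
        for i in items:             # every work-item stores its own element
            c_l[i] = a[i]
        # Best case: all stores above are visible before any load below.
        for i in items:
            b[i] = c_l[i - 1] if i > 0 else 0
    return b

a = list(range(1, 11))
print(simulate_localmemskip(a, 10))  # a single group of ten work-items
print(simulate_localmemskip(a, 2))   # hypothetical split into groups of two
```

In this model a single group of ten gives the result I expect, while
smaller groups leave the sentinel wherever c_l[i-1] was written by a
different group. I have not confirmed that this is what either platform
actually does.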
Can anyone see if I am doing something wrong?
Thanks
Blair