We're getting an invalid command queue error from the following code.
I've tried to reduce it as much as possible to the minimal test case,
and I think that this is as small as it gets.
We (my coworker and I) are both running on a MacBook Pro, OS X 10.7, with
PyOpenCL 2011.1 or 2011.2. The device and context are:
<pyopencl.Device 'GeForce GT 330M' at 0x1022600>
<pyopencl.Context at 0x103b00bf8 on <pyopencl.Device 'GeForce GT 330M' at 0x1022600>>
The thing that's stumping us is that if we make some very minor
changes (like "change all of the uints to ints") then it stops
crashing. The things that we've seen stop the error are:
- replace all uint with int.
- remove any of the three levels of for loop; replace with a line like
int v=get_global_id(1); outside the loops.
- remove the while loop.
- hard code max_index to 128.
I'm sure that some of the "change this and the error goes away" items
are the result of things being optimized away, but it's not clear
exactly which those are.
Sadly, none of the things we've been trying are applicable to our
actual code (or they don't resolve the issue).
Any suggestions for trying to figure out what's going on? We are
quite stumped. If there's a better list or forum to use, we'd be
happy to move the discussion there (the dev forums for the OSX Lion
beta seem pretty dead; not sure where else is a good place to go for
help).
Thanks,
Eli
--
# python unit test
def test_60_opencl_crash_minimal(self):
    import datetime
    import os
    import numpy
    import pyopencl as cl
    import mms  # our package; used only to locate the OpenCL include directory
    mf = cl.mem_flags
    gpu_device = cl.get_platforms()[0].get_devices(cl.device_type.GPU)[0]
    max_work_item_sizes = gpu_device.get_info(cl.device_info.MAX_WORK_ITEM_SIZES)
    ctx = cl.Context(devices=[gpu_device])
    queue = cl.CommandQueue(ctx)
    # Compile OpenCL code. The trailing timestamp comment makes the
    # source text unique on every run.
    code = '''
#include "mms/common/entity/plan.cl"
//''' + str(datetime.datetime.now())
    prg = cl.Program(ctx, code).build(options='-g -I{}'.format(
        os.path.abspath(os.path.dirname(os.path.dirname(mms.__file__)))))
    data_farray = numpy.zeros((128, 128, 128), dtype=numpy.float32)
    data_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR | mf.ALLOC_HOST_PTR,
                         hostbuf=data_farray)
    data_shape_ndary = numpy.array(data_farray.shape, dtype=numpy.uint32)
    data_shape_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR | mf.ALLOC_HOST_PTR,
                               hostbuf=data_shape_ndary)
    globalsize_tup = (1, 1)  # XXX DEBUG
    localsize_tup = None
    #evt =
    prg.crashing(queue, globalsize_tup, localsize_tup,
                 data_buf,
                 data_shape_buf,
                 ).wait()
    cl.enqueue_read_buffer(queue, data_buf, data_farray).wait()
// opencl code
__kernel void
crashing(
    __global float *data,
    __global const uint *data_shape
) {
    const uint max_index = data_shape[0];
    const uint max_row = data_shape[1];
    const uint max_col = data_shape[2];
    int4 myVector;
    int max_u, max_v, max_slab;
    max_slab = max_index;
    max_u = max_row;
    max_v = max_col;
    for (int slab=0; slab < max_slab; slab++) {
        for (int u=get_global_id(0); u < max_u; u += get_global_size(0)) {
            for (int v=get_global_id(1); v < max_v; v += get_global_size(1)) {
                myVector = (int4)(slab, 0, 0, 0);
                int counter = 0;
                while (counter < 1) {
                    if (myVector.s0 < max_index) {
                        counter = 999;
                    }
                    counter++;
                }
            }
        }
    }
}
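
For reference, here is a rough pure-Python sketch of the control flow of
the kernel above for a single work item. It is only an illustration: the
function name emulate_work_item is hypothetical and not part of our code,
it never touches the data buffer (neither does the kernel), and it uses
plain Python integers, so it does not model the int/uint comparison in
"myVector.s0 < max_index".

```python
def emulate_work_item(data_shape, global_id, global_size):
    """Count inner while-loop iterations for one work item of `crashing`.

    Hypothetical helper for illustration only; mirrors the kernel's loops
    with Python integers (no int/uint conversion semantics).
    """
    max_index, max_row, max_col = data_shape
    max_slab, max_u, max_v = max_index, max_row, max_col
    iterations = 0
    for slab in range(max_slab):
        # Strided loops: start at this work item's id, step by the global size.
        for u in range(global_id[0], max_u, global_size[0]):
            for v in range(global_id[1], max_v, global_size[1]):
                my_vector_s0 = slab  # stands in for myVector.s0
                counter = 0
                while counter < 1:
                    iterations += 1
                    if my_vector_s0 < max_index:
                        counter = 999
                    counter += 1
    return iterations


if __name__ == "__main__":
    # Same shape and global size as the unit test: one work item, (1, 1).
    print(emulate_work_item((128, 128, 128), (0, 0), (1, 1)))
```

In this sketch the while loop body runs exactly once per (slab, u, v)
cell, so with the (1, 1) global size used in the test the single work
item walks all 128*128*128 = 2097152 cells by itself.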