Bogdan,

Thank you for your answer. I tried your suggestion and, as you say,
it returns 16 bytes instead of 12.

Strange bug in Apple's implementation...

Thanks for your help.

David.
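
For readers following the thread, here is a small sketch of why a 16-byte stride overruns the buffer in the script quoted below. It uses only the numbers from the thread (1000 work-items, 3000 float32 values uploaded); the helper name `first_out_of_bounds_gid` is made up for illustration, and the 16-byte stride is the value measured above, not something this snippet queries from a device.

```python
FLOAT_SIZE = 4        # bytes in one float32
N_POINTS = 1000       # work-items launched (value_array.shape in the script)
HOST_FLOATS = 3000    # floats uploaded in points_array

buffer_bytes = HOST_FLOATS * FLOAT_SIZE   # 12000 bytes actually allocated
measured_stride = 16                      # bytes between points[gid] and points[gid + 1]


def first_out_of_bounds_gid(buffer_bytes, stride, read_bytes=FLOAT_SIZE):
    """Smallest gid whose read of points[gid].x ends past the buffer (hypothetical helper)."""
    gid = 0
    while gid * stride + read_bytes <= buffer_bytes:
        gid += 1
    return gid


# Bytes the kernel walks over with the measured stride, versus bytes allocated.
print(N_POINTS * measured_stride, buffer_bytes)
# First work-item that reads outside the 12000-byte buffer.
print(first_out_of_bounds_gid(buffer_bytes, measured_stride))
```

Under these assumptions the kernel needs 16000 bytes but only 12000 were allocated, so work-items from gid 750 upward read past the end of `points_buffer`. Whether that shows up as a segfault or as silent garbage appears to depend on the device.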

2010/12/13 Bogdan Opanchuk <[email protected]>:
> Hi David,
>
> On my macbook pro with SL 10.6.5, I get a segfault on the _GPU_, while the
> CPU works OK.
> Try to replace the last line in your kernel with
>
> value[gid] = (float)((int)(points + 1) - (int)(points));
>
> which will show you the "actual" size of float3 (the alignment is set
> somewhere in float3 definition, I guess, because sizeof(float3) gives
> 12, as expected). I get 16 bytes both on CPU and GPU, and GPU just
> seems to be more sensitive to buffer overflows, which results in
> segfault.
>
> Best regards,
> Bogdan
>
> On Fri, Dec 10, 2010 at 2:01 AM, David Libault <[email protected]> wrote:
>> Hi,
>>
>> Below is some code that behaves strangely: it works on the GPU
>> (device 0 on my macbook pro) and segfaults on the CPU (device 1). I
>> wanted to try geometric functions like "distance", which are only
>> implemented on the CPU.
>> Any remarks on this dummy code would be much appreciated...
>>
>> import numpy
>> import pyopencl as cl
>>
>> platform = cl.get_platforms()[0]
>> device = platform.get_devices()[1]
>> print device
>> ctx = cl.Context([device])
>> queue = cl.CommandQueue(ctx)
>>
>>
>> value_array = numpy.zeros(1000, dtype=numpy.float32)
>> #points_array = numpy.zeros((1000, 3), dtype=numpy.float32)
>> points_array = numpy.random.rand(3000).astype(numpy.float32)
>> print points_array[0]
>>
>> mf = cl.mem_flags
>> points_buffer = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR,
>>                           hostbuf=points_array)
>> value_buffer = cl.Buffer(ctx, mf.READ_WRITE, value_array.nbytes)
>>
>> prg = cl.Program(ctx,
>>    """
>>    __kernel void Test(__global float3 const *points, __global float *value)
>>    {
>>        int gid = get_global_id(0);
>>
>>        value[gid] = points[gid].x ;
>>    }
>>    """).build()
>>
>> print "points_array shape : ",points_array.shape
>> prg.Test(queue, value_array.shape, None, points_buffer, value_buffer)
>> cl.enqueue_read_buffer(queue, value_buffer, value_array).wait()
>> print "%f"%value_array[0]
>>
>> Regards,
>>
>> David.
>>
>> _______________________________________________
>> PyOpenCL mailing list
>> [email protected]
>> http://lists.tiker.net/listinfo/pyopencl
>>
>
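
One way to sidestep the mismatch discussed in this thread, sketched here under the assumption that the device really does step through `float3` in 16-byte strides as measured above, is to pad each point to four floats on the host before uploading. The helper name `pad_points_to_float4` is hypothetical; only NumPy calls are used, so the snippet runs without an OpenCL device.

```python
import numpy


def pad_points_to_float4(points_xyz):
    """Return an (N, 4) float32 array with rows x, y, z, 0 (16 bytes per point)."""
    # Accept either a flat array of 3*N floats or an (N, 3) array.
    points_xyz = numpy.asarray(points_xyz, dtype=numpy.float32).reshape(-1, 3)
    padded = numpy.zeros((points_xyz.shape[0], 4), dtype=numpy.float32)
    padded[:, :3] = points_xyz   # fourth component stays 0 as padding
    return padded


# Same host data as in the quoted script: 3000 floats, i.e. 1000 points.
points_array = numpy.random.rand(3000).astype(numpy.float32)
padded_points = pad_points_to_float4(points_array)
print(padded_points.shape, padded_points.nbytes)   # (1000, 4) 16000
```

In the quoted script, `padded_points` would then take the place of `points_array` as the `hostbuf` argument of `cl.Buffer`, so the buffer holds the 16000 bytes that 1000 work-items touch. I have not run this on the Apple implementation discussed here, so treat it as a starting point, not a confirmed fix.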

