Bogdan,

After further checking, it does not look like a design decision, but
rather a strict implementation of the standard! See the OpenCL 1.1
specification, page 163:

"6.1.5 Alignment of Types

For 3-component vector data types, the size of the data type is
4 * sizeof(component). This means that a 3-component vector data type
will be aligned to a 4 * sizeof(component) boundary. The vload3 and
vstore3 built-in functions can be used to read and write, respectively,
3-component vector data types from an array of packed scalar data type."

So Apple's implementation looks correct...

Andreas,

Which OpenCL implementation are you using on your 64-bit Linux machine?
It might have an alignment issue. Could you please try the code proposed
by Bogdan to check the size of float3?

Thanks.

Regards,

David.

2010/12/14 Bogdan Opanchuk <[email protected]>:
> Hello David,
>
> On Tue, Dec 14, 2010 at 2:54 AM, David Libault <[email protected]>
> wrote:
>> Thank you for your answer. I tried your proposition, and, as you say,
>> it returns 16 bytes instead of 12.
>>
>> Strange bug in Apple's implementation...
>
> I would not call that a bug; most probably they decided that improved
> float3 fetching/storing speed (16 bytes can be transferred using a
> single instruction and properly coalesced, as opposed to 12 bytes) is
> worth the increased storage space. If you were using "new float3[...]"
> to allocate the buffer, it would pass unnoticed, but when you allocated
> memory in Python, you bumped into this design decision.
>
> Best regards,
> Bogdan
>
_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl

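To make the 12 versus 16 byte difference from section 6.1.5 concrete, here is a minimal host-side sketch. It is not code from this thread and does not call PyOpenCL. It uses only Python's standard struct module and assumes little-endian 32-bit floats. The names PACKED, PADDED and pad_float3 are hypothetical, chosen for illustration. It shows one possible way to lay out host data so that each vector occupies 4 * sizeof(float) bytes, as a float3 does on the device.

```python
import struct

# Three packed 32-bit floats: the layout of a plain host-side array.
PACKED = struct.Struct("<3f")     # 3 * 4 = 12 bytes
# Three floats plus 4 pad bytes: the size the standard gives float3.
PADDED = struct.Struct("<3f4x")   # 4 * 4 = 16 bytes


def pad_float3(vectors):
    """Serialize (x, y, z) triples with a 16-byte stride per vector."""
    return b"".join(PADDED.pack(x, y, z) for x, y, z in vectors)


vectors = [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)]
buf = pad_float3(vectors)
print(PACKED.size, PADDED.size, len(buf))
```

A buffer built this way has the stride a kernel indexing a float3 array would expect. The alternative mentioned in the quoted section is to keep the host data packed, declare the kernel argument as a float pointer, and read it with vload3.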