Anastasia added a comment.

I don't think there is anything wrong with the generation of vec3->vec4 in
Clang. I believe the motivation for this was the OpenCL spec treating vec3 as
a vec4-aligned type (see section 6.1.5:
https://www.khronos.org/registry/OpenCL/specs/opencl-2.0-openclc.pdf#12). So in
terms of memory layout, vec3 wouldn't be any different from vec4. But in terms
of operations (including loads/stores), there is a potential gain from not
handling the 4th element, which some targets can exploit. I think generating
vec3 from the frontend would have been a better choice in the first place,
because the backend can then decide how to handle it. For example, an
architecture with no SIMD support could just generate 3 separate loads. Right
now it seems that it will be forced to generate 4 loads.
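
To make the layout point concrete, here is a rough sketch in plain C11 (not
OpenCL C, and not what Clang emits). The names vec3_model, vec4_model,
sum_as_vec3 and sum_as_vec4 are hypothetical stand-ins invented for this
illustration. It assumes a 4-byte float and only models the idea: both types
occupy the same 16-byte slot, but an access through the 3-element type needs
to touch only three elements.

```c
#include <assert.h>
#include <stdalign.h>
#include <string.h>

/* Hypothetical models of OpenCL float3/float4; illustrative names only. */
typedef struct { alignas(16) float s[3]; } vec3_model;
typedef struct { alignas(16) float s[4]; } vec4_model;

/* Same footprint: the 3-element type is padded out to the vec4 slot. */
static_assert(sizeof(vec3_model) == sizeof(vec4_model),
              "vec3 occupies a vec4-sized slot");
static_assert(alignof(vec3_model) == alignof(vec4_model),
              "vec3 is vec4-aligned");

/* Touches only the three meaningful elements (three scalar loads on a
   target without SIMD support). */
static float sum_as_vec3(const vec3_model *v) {
    return v->s[0] + v->s[1] + v->s[2];
}

/* Models the widened access: copies all four lanes, then ignores lane 3. */
static float sum_as_vec4(const vec4_model *v) {
    float lanes[4];
    memcpy(lanes, v->s, sizeof lanes);
    return lanes[0] + lanes[1] + lanes[2];
}
```

Both functions return the same value for the same first three elements; the
only difference is how much of the slot is read, which is the part a backend
could choose to optimize if it sees vec3.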

But considering that the transformation to vec4 has been the default
implementation in the frontend for quite a while, I think we would need a
stronger motivation for switching to the original vec3. So the current
approach, with a special flag for preserving vec3, should be good enough to
fit all needs.


https://reviews.llvm.org/D30810


