jaykang10 added a comment.

> I believe the argument lacks numbers (or at least you have them but didn't 
> mention). I didn't hear about performance results, or validation that this 
> was actually tested for correctness. Small test cases prove a point, but 
> can't be considered general.
> 
> OTOH, it seems like this is exactly why you want the flag, to hide any 
> potential issues and play with it. I'm not opposed to adding the flag if 
> there's a commitment to actually get the result and change the default to 
> whatever seems to be better, does that seems reasonable?

To be honest, I did not start this discussion to get better performance for 
GPU. But the goal has moved while we discussing it. I think I have showed you 
the gain from AMD GPU target on previous comments. On current version of 
clang/llvm, it generates one less load/store to preserve vec3 type instead of 
vec4 type on GPU target. But it could cause more load/store on another targets 
which have vector register because they usually expect aligned vector 
load/store. It means that if we want to change default  to vec3 on clang, we 
need to change llvm's codegen's default legalization or backend targets need to 
be modified in order to get aligned vector load/store with vec3 type. I think 
It is too big change at this moment. In the future, we could move this 
discussion to llvm-dev. Up to that time,  as you mentioned, I would like to 
hide the potential issues with vec3 type and play with it.


https://reviews.llvm.org/D30810



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to