sfertile added inline comments.

================
Comment at: lib/Headers/altivec.h:11908
+#define vec_extract4b(__a, __b)                                                
\
+   vec_reve((vector unsigned long long)                                        
\
+                __builtin_vsx_xxextractuw((__a), (12 - (__b & 0xF))))
----------------
nemanjai wrote:
> I find it difficult to follow and understand this logic when it's in the 
> header.
> What I'd prefer to see here is that the macro simply expands into 
> `__builtin_vsx_xxextractuw` and then handle all this logic in the code that 
> emits an intrinsic call.
> Namely if the target is little endian, we adjust the parameter, emit the 
> intrinsic call and finally emit a shufflevector.
I think this is a good idea, looking at the code its not obvious what is going 
on.


================
Comment at: lib/Headers/altivec.h:12014
+#define vec_insert4b(__a, __b, __c) \
+  ((vector unsigned char)__builtin_vsx_xxinsertw((__a), (__b), (__c) & 0xF))
+#endif
----------------
kbarton wrote:
> nemanjai wrote:
> > As far as I can tell by looking at this patch and the corresponding back 
> > end patch, the `__a` argument will have a word inserted into it and it will 
> > be returned.
> > 
> > Is that the semantics that the ABI specifies (I can't seem to make sense of 
> > the description).
> > 
> > ```
> > vector unsigned int a = { 0xAAAAAAAA, 0xBBBBBB, 0xCCCCCC, 0xDDDDDD };
> > vector unsigned char b = (vector unsigned char) 0xFF;
> > vector unsigned char c = vec_insert4b(a, b, 4);
> > // Do we expect vector c to be:
> > // { 0xAA, 0xAA, 0xAA, 0xAA, 0xFF, 0xFF, 0xFF, 0xFF, 0xCC, 0xCC, 0xCC, 
> > 0xCC, 0xDD, 0xDD, 0xDD, 0xDD }
> > ```
> I think the current version of the ABI document has an error in it. The 
> description of the vec_insert4b is identical to the vec_extract4b, so I 
> expect it was copy/pasted in error. I think we need to open up an (internal) 
> bug against the ABI and wait for clarification to complete this. 
You have it correct Nemanja, word 1 will be extracted from b, and it will get 
inserted into a. The word will be inserted at the byte position starting at the 
3rd argument. (so in this case byte offsets 4 to 7)

I talked to Bill Schmidt earlier today and he already has a bug open. It hasn't 
been updated yet, but it should roughly correspond to the description for the 
xxinsertw instruction:

The contents of word element 1 of VSR[XB] are placed
into byte elements UIM:UIM+3 of VSR[XT]. The contents
of the remaining byte elements of VSR[XT] are not
modified.


Repository:
  rL LLVM

https://reviews.llvm.org/D26546



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to