sfertile added inline comments.
================ Comment at: lib/Headers/altivec.h:11908 +#define vec_extract4b(__a, __b) \ + vec_reve((vector unsigned long long) \ + __builtin_vsx_xxextractuw((__a), (12 - (__b & 0xF)))) ---------------- nemanjai wrote: > I find it difficult to follow and understand this logic when it's in the > header. > What I'd prefer to see here is that the macro simply expands into > `__builtin_vsx_xxextractuw` and then handle all this logic in the code that > emits an intrinsic call. > Namely if the target is little endian, we adjust the parameter, emit the > intrinsic call and finally emit a shufflevector. I think this is a good idea, looking at the code its not obvious what is going on. ================ Comment at: lib/Headers/altivec.h:12014 +#define vec_insert4b(__a, __b, __c) \ + ((vector unsigned char)__builtin_vsx_xxinsertw((__a), (__b), (__c) & 0xF)) +#endif ---------------- kbarton wrote: > nemanjai wrote: > > As far as I can tell by looking at this patch and the corresponding back > > end patch, the `__a` argument will have a word inserted into it and it will > > be returned. > > > > Is that the semantics that the ABI specifies (I can't seem to make sense of > > the description). > > > > ``` > > vector unsigned int a = { 0xAAAAAAAA, 0xBBBBBB, 0xCCCCCC, 0xDDDDDD }; > > vector unsigned char b = (vector unsigned char) 0xFF; > > vector unsigned char c = vec_insert4b(a, b, 4); > > // Do we expect vector c to be: > > // { 0xAA, 0xAA, 0xAA, 0xAA, 0xFF, 0xFF, 0xFF, 0xFF, 0xCC, 0xCC, 0xCC, > > 0xCC, 0xDD, 0xDD, 0xDD, 0xDD } > > ``` > I think the current version of the ABI document has an error in it. The > description of the vec_insert4b is identical to the vec_extract4b, so I > expect it was copy/pasted in error. I think we need to open up an (internal) > bug against the ABI and wait for clarification to complete this. You have it correct Nemanja, word 1 will be extracted from b, and it will get inserted into a. The word will be inserted at the byte position starting at the 3rd argument. (so in this case byte offsets 4 to 7) I talked to Bill Schmidt earlier today and he already has a bug open. It hasn't been updated yet, but it should roughly correspond to the description for the xxinsertw instruction: The contents of word element 1 of VSR[XB] are placed into byte elements UIM:UIM+3 of VSR[XT]. The contents of the remaining byte elements of VSR[XT] are not modified. Repository: rL LLVM https://reviews.llvm.org/D26546 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits