https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #8 from luoxhu at gcc dot gnu.org ---
(In reply to Jens Seifert from comment #7)
> Regarding vec_revb for vector unsigned int. I agree that
> revb:
> .LFB0:
>         .cfi_startproc
>         vspltish %v1,8
>         vspltisw %v0,-16
>         vrlh %v2,%v2,%v1
>         vrlw %v2,%v2,%v0
>         blr
>
> works. But in this case, I would prefer the vperm approach assuming that the
> loaded constant for the permute vector can be re-used multiple times.
> But please get rid of the xxlnor 32,32,32. That does not make sense after
> loading a constant. Change the constant that need to be loaded.

The xxlnor is an LE-specific requirement (it is not generated when building
with -mbig). We need to turn the index {0,1,2,3} into {31,30,29,28} for vperm
to use; without it, vperm produces an incorrect result:

6|  0x0000000010000630 <+16>:    lvx     v0,0,r9
7+> 0x0000000010000634 <+20>:    xxlnor  vs32,vs32,vs32
8|  0x0000000010000638 <+24>:    vperm   v2,v2,v2,v0
9|  0x000000001000063c <+28>:    blr

(gdb)
0x0000000010000634 in revb ()
2: /x $vs34.uint128 = 0x42345678323456782234567812345678
5: /x $vs32.uint128 = 0xc0d0e0f08090a0b0405060700010203
(gdb) si
0x0000000010000638 in revb ()
2: /x $vs34.uint128 = 0x42345678323456782234567812345678
5: /x $vs32.uint128 = 0xf3f2f1f0f7f6f5f4fbfaf9f8fffefdfc
(gdb) si
0x000000001000063c in revb ()
2: /x $vs34.uint128 = 0x78563442785634327856342278563412
5: /x $vs32.uint128 = 0xf3f2f1f0f7f6f5f4fbfaf9f8fffefdfc

Quoted from the ISA:

vperm VRT,VRA,VRB,VRC

    vsrc.qword[0] ← VSR[VRA+32]
    vsrc.qword[1] ← VSR[VRB+32]
    do i = 0 to 15
       index ← VSR[VRC+32].byte[i].bit[3:7]
       VSR[VRT+32].byte[i] ← src.byte[index]
    end

Let the source vector be the concatenation of the contents of VSR[VRA+32]
followed by the contents of VSR[VRB+32].

For each integer value i from 0 to 15, do the following.
    Let index be the value specified by bits 3:7 of byte element i of
    VSR[VRC+32].
    The contents of byte element index of src are placed into byte element i
    of VSR[VRT+32].
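
The gdb trace above can be replayed in plain C to see why the complement matters. This is only a rough sketch, not what GCC emits: vperm_model, xxlnor_self and revb_model are hypothetical names, the arrays are the register values from the trace written in ISA byte order (byte 0 most significant), and the model follows the quoted ISA pseudo-code. Since vperm only looks at bits 3:7 of each control byte, complementing a byte x yields the index 31 - (x & 31), which is how {0,1,2,3} becomes {31,30,29,28}.

```c
#include <stdint.h>
#include <string.h>

/* Software model of vperm per the quoted ISA pseudo-code: src is VRA
   followed by VRB (32 bytes), byte 0 being the most significant byte.  */
static void vperm_model(uint8_t vrt[16], const uint8_t vra[16],
                        const uint8_t vrb[16], const uint8_t vrc[16])
{
    uint8_t src[32], out[16];
    memcpy(src, vra, 16);
    memcpy(src + 16, vrb, 16);
    for (int i = 0; i < 16; i++)
        out[i] = src[vrc[i] & 0x1f];  /* bits 3:7 = low five bits */
    memcpy(vrt, out, 16);             /* safe if vrt aliases an input */
}

/* Model of xxlnor vs,vs,vs: NOR of a value with itself is its complement.  */
static void xxlnor_self(uint8_t v[16])
{
    for (int i = 0; i < 16; i++)
        v[i] = (uint8_t)~v[i];
}

/* $vs34 from the trace: 0x42345678323456782234567812345678.  */
static const uint8_t VS34[16] = {
    0x42, 0x34, 0x56, 0x78, 0x32, 0x34, 0x56, 0x78,
    0x22, 0x34, 0x56, 0x78, 0x12, 0x34, 0x56, 0x78
};

/* $vs32 right after the lvx (gdb drops the leading zero nibble).  */
static const uint8_t VS32_LOADED[16] = {
    0x0c, 0x0d, 0x0e, 0x0f, 0x08, 0x09, 0x0a, 0x0b,
    0x04, 0x05, 0x06, 0x07, 0x00, 0x01, 0x02, 0x03
};

/* Hypothetical helper: models vperm v2,v2,v2,v0, optionally preceded by
   the xxlnor on the control vector.  */
static void revb_model(uint8_t out[16], int apply_xxlnor)
{
    uint8_t ctl[16];
    memcpy(ctl, VS32_LOADED, 16);
    if (apply_xxlnor)
        xxlnor_self(ctl);
    vperm_model(out, VS34, VS34, ctl);
}
```

With apply_xxlnor set, the model gives 0x78563442785634327856342278563412, the same as $vs34 after the vperm in the trace. With it cleared, the loaded constant is used as-is and the model gives 0x12345678223456783234567842345678: the words change places but the bytes inside each word are not reversed.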