Hi, In altivec_expand_vec_perm_const, we look for special masks that match the behavior of specific instructions, so we can use those instructions rather than load a constant control vector and perform a permute. Some of the masks must be treated differently for little endian mode.
The masks that represent merge-high and merge-low operations have reversed meanings in little-endian, because of the reversed ordering of the vector elements. The masks that represent vector-pack operations remain correct when the mode of the input operands matches the natural mode of the instruction, but not otherwise. This is because the pack instructions always select the rightmost, low-order bits of the vector element. There are cases where we use this, for example, with a V8SI vector matching a vpkuwum mask in order to select the odd-numbered elements of the vector. In little endian mode, this instruction will get us the even-numbered elements instead. There is no alternative instruction with the desired behavior, so I've just disabled use of those masks for little endian when the mode isn't natural. These changes fix 32 failures in the test suite for little endian mode. Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new failures. Is this ok for trunk? Thanks, Bill 2013-10-21 Bill Schmidt <wschm...@vnet.ibm.com> * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse meaning of merge-high and merge-low masks for little endian; avoid use of vector-pack masks for little endian for mismatched modes. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 203792) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4]) { 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } }, { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum, { 2, 3, 6, 7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } }, - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb, + { OPTION_MASK_ALTIVEC, + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb, { 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23 } }, - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh, + { OPTION_MASK_ALTIVEC, + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh, { 0, 1, 16, 17, 2, 3, 18, 19, 4, 5, 20, 21, 6, 7, 22, 23 } }, - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw, + { OPTION_MASK_ALTIVEC, + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw, { 0, 1, 2, 3, 16, 17, 18, 19, 4, 5, 6, 7, 20, 21, 22, 23 } }, - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb, + { OPTION_MASK_ALTIVEC, + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb, { 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } }, - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh, + { OPTION_MASK_ALTIVEC, + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh, { 8, 9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } }, - { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw, + { OPTION_MASK_ALTIVEC, + BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw, { 8, 9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } }, { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew, { 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27 } }, @@ -28980,6 +28987,22 @@ altivec_expand_vec_perm_const (rtx operands[4]) enum machine_mode omode = insn_data[icode].operand[0].mode; enum machine_mode imode = insn_data[icode].operand[1].mode; + /* For little-endian, don't use vpkuwum and vpkuhum if the + underlying vector type is not V4SI and V8HI, respectively. + For example, using vpkuwum with a V8HI picks up the even + halfwords (BE numbering) when the even halfwords (LE + numbering) are what we need. */ + if (!BYTES_BIG_ENDIAN + && icode == CODE_FOR_altivec_vpkuwum + && GET_CODE (op0) == SUBREG + && GET_MODE (XEXP (op0, 0)) != V4SImode) + continue; + if (!BYTES_BIG_ENDIAN + && icode == CODE_FOR_altivec_vpkuhum + && GET_CODE (op0) == SUBREG + && GET_MODE (XEXP (op0, 0)) != V8HImode) + continue; + /* For little-endian, the two input operands must be swapped (or swapped back) to ensure proper right-to-left numbering from 0 to 2N-1. */