[PATCH, ARM] Constant vector permute for the Neon vext insn
Hi, The patch below enables GCC for ARM to implement relevant constant vector permutations using the Neon vext instruction, by extending the support currently in place for vrev, vzip, vunzip and vtrn. For the cases where vext and vrev would lead to the same result, I have chosen to keep using vrev to avoid updating the testsuite when both are equivalent (1 cycle) or when vrev is faster (1 cycle when operating on Qn vs 2 cycles for vext). Tested with qemu-arm on arm-none-linux-gnueabi. Christophe. 2012-08-23 Christophe Lyon gcc/ * config/arm/arm.c (arm_evpc_neon_vext): New function. (arm_expand_vec_perm_const_1): Add call to arm_evpc_neon_vext. gcc/testsuite/ * gcc.target/arm/neon-vext.c: New tests. gcc-vec-permute-vext.patch Description: Binary data
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 24/08/12 08:45, Christophe Lyon wrote: > Hi, > > The patch below enables GCC for ARM to implement relevant constant > vector permutations using the Neon vext instruction, by extending the > support currently in place for vrev, vzip, vunzip and vtrn. > > For the cases where vext and vrev would lead to the same result, I > have chosen to keep using vrev to avoid updating the testsuite when > both are equivalent (1 cycle) or when vrev is faster (1 cycle when > operating on Qn vs 2 cycles for vext). > > Tested with qemu-arm on arm-none-linux-gnueabi. > > Christophe. > > 2012-08-23 Christophe Lyon > > gcc/ > * config/arm/arm.c (arm_evpc_neon_vext): New > function. > (arm_expand_vec_perm_const_1): Add call to > arm_evpc_neon_vext. > > gcc/testsuite/ > * gcc.target/arm/neon-vext.c: New tests.= > Has this been tested for big-endian? R.
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 24 August 2012 10:40, Richard Earnshaw wrote: > > Has this been tested for big-endian? > > R. No. I'll give a look at it and let you know. Christophe.
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
[ Richard, sorry for the duplicate message where I omitted the mailing-list] On 24 August 2012 10:40, Richard Earnshaw wrote: > > Has this been tested for big-endian? > Hi, While improving my tests and trying to turn them into execution tests, I realized that vector initialization such as: uint16x4_t vec1 = {0x1234, 0x5678, 0x9abc, 0xdef0}; is endianness dependent. However, I have noticed that other tests (such as neon-vrev.c, neon-vset_lanes8.c, pr48252) do use such constructs and the last two ones fail at execution in big-endian mode (the 1st one is only compiled). I guess that the 'right' (portable) was of initializing a vector is to load it from an array, right? Thanks, Christophe.
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 08/27/2012 08:02 AM, Christophe Lyon wrote: > [ Richard, sorry for the duplicate message where I omitted the mailing-list] > > On 24 August 2012 10:40, Richard Earnshaw wrote: >> >> Has this been tested for big-endian? >> > > Hi, > While improving my tests and trying to turn them into execution tests, > I realized that vector initialization such as: > uint16x4_t vec1 = {0x1234, 0x5678, 0x9abc, 0xdef0}; > is endianness dependent. > > However, I have noticed that other tests (such as neon-vrev.c, > neon-vset_lanes8.c, pr48252) do use such constructs and the last > two ones fail at execution in big-endian mode (the 1st one is only > compiled). > > I guess that the 'right' (portable) was of initializing a vector is to > load it from an array, right? See http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01114.html for Richard Earnshaw's suggestion on how to fix neon-vset_lanes8.c, and an alternate suggestion for changing the compiler. Janis
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 27 August 2012 21:29, Janis Johnson wrote: > On 08/27/2012 08:02 AM, Christophe Lyon wrote: >> I guess that the 'right' (portable) was of initializing a vector is to >> load it from an array, right? > > See http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01114.html for Richard > Earnshaw's suggestion on how to fix neon-vset_lanes8.c, and an alternate > suggestion for changing the compiler. > Thanks for the pointer, which confirms it is much more complex than I anticipated. However, I still need to initialize the mask vector with GCC-vector notation, not using vld1 otherwise the compiler cannot detect if the mask vector is contant (and therefore suitable for an optimization). This makes writing exhaustive, portable (big and little endian), executable tests a painful task. Thanks
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 28 August 2012 16:20, Christophe Lyon wrote: > This makes writing exhaustive, portable (big and little endian), > executable tests a painful task. > For instance, considering the attached sample code, I obtain a different result in big-endian vs little-endian, while the input values are the same. Indeed, in both cases, the program prints: __a[0] = 0 . __a[7] = 7 __b[0] = 8 __b[7] = 15 __mask1[0] = 2 __mask1[7] = 9 but in Little-endian, the result of builtin_shuffle(__a, __b, __mask1) is mem[0] = 2, mem[1] = 3 mem[7] = 9 while in big-endian it is: mem[0] =5, mem[1] = 4, mem[5] = 0, mem[6] = 15, mem[7] = 14 What am I missing? Thanks, Christophe. /* { dg-do run } */ /* { dg-require-effective-target arm_neon_ok } */ /* { dg-options "-O2" } */ /* { dg-add-options arm_neon } */ #include #include #include uint8x8_t tst_vext_u8 (uint8x8_t __a, uint8x8_t __b) { #ifdef __ARMEL__ uint8x8_t __mask1 = {2, 3, 4, 5, 6, 7, 8, 9}; #else uint8x8_t __mask1 = {9, 8, 7, 6, 5, 4, 3, 2}; #endif union {uint8x8_t v; uint8_t buf[8];} mem_u8x8; int i; vst1_u8(mem_u8x8.buf, __a); for(i=0; i<8; i++) { fprintf(stderr, "__a[%d]=%d\n", i, mem_u8x8.buf[i]); } vst1_u8(mem_u8x8.buf, __b); for(i=0; i<8; i++) { fprintf(stderr, "__b[%d]=%d\n", i, mem_u8x8.buf[i]); } vst1_u8(mem_u8x8.buf, __mask1); for(i=0; i<8; i++) { fprintf(stderr, "__mask1[%d]=%d\n", i, mem_u8x8.buf[i]); } return __builtin_shuffle ( __a, __b, __mask1) ; } int main(void) { uint8_t arr_u8x8[] = {0, 1, 2, 3, 4, 5, 6, 7}; uint8_t arr2_u8x8[] = {8, 9, 10, 11, 12, 13, 14, 15}; uint8x8_t vec_u8x8 = vld1_u8(arr_u8x8); uint8x8_t vec2_u8x8 = vld1_u8(arr2_u8x8); uint8x8_t result_u8x8; union {uint8x8_t v; uint8_t buf[8];} mem_u8x8; int i; result_u8x8 = tst_vext_u8 (vec_u8x8, vec2_u8x8); vst1_u8(mem_u8x8.buf, result_u8x8); for (i=0; i<8; i++) { printf("mem_u8x8[%d]=%d\n", i, mem_u8x8.buf[i]); } return 0; }
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 24 August 2012 10:54, Christophe Lyon wrote: > On 24 August 2012 10:40, Richard Earnshaw wrote: >> >> Has this been tested for big-endian? >> >> R. > > No. I'll give a look at it and let you know. > > Christophe. Here is an updated patch, which now does no optimization in the big-endian case. Given the current status of big-endian + neon support, I guess it is not a serious problem. I have also added runtime tests. I will later post an additional patch (on top of this one), which enhances the tests so that they can be run in big-endian mode too. Christophe. 2012-08-31 Christophe Lyon gcc/ * config/arm/arm.c (arm_evpc_neon_vext): New function. (arm_expand_vec_perm_const_1): Add call to arm_evpc_neon_vext. gcc/testsuite/ * gcc.target/arm/neon-vext.c gcc.target/arm/neon-vext-execute.c: New tests. gcc-vec-permute-vext.changelog Description: Binary data
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
This time with the actual patch attached. On 31 August 2012 15:23, Christophe Lyon wrote: > On 24 August 2012 10:54, Christophe Lyon wrote: >> On 24 August 2012 10:40, Richard Earnshaw wrote: >>> >>> Has this been tested for big-endian? >>> >>> R. >> >> No. I'll give a look at it and let you know. >> >> Christophe. > > Here is an updated patch, which now does no optimization in the > big-endian case. Given the current status of big-endian + neon > support, I guess it is not a serious problem. > > I have also added runtime tests. > > I will later post an additional patch (on top of this one), which > enhances the tests so that they can be run in big-endian mode too. > > Christophe. > > 2012-08-31 Christophe Lyon > > gcc/ > * config/arm/arm.c (arm_evpc_neon_vext): New > function. > (arm_expand_vec_perm_const_1): Add call to > arm_evpc_neon_vext. > > > gcc/testsuite/ > * gcc.target/arm/neon-vext.c > gcc.target/arm/neon-vext-execute.c: > New tests. gcc-vec-permute-vext.patch Description: Binary data
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 2012-08-31 07:25, Christophe Lyon wrote: > + offset = gen_rtx_CONST_INT (VOIDmode, location); Never call gen_rtx_CONST_INT directly. Use GEN_INT. r~
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 31 August 2012 17:59, Richard Henderson wrote: > On 2012-08-31 07:25, Christophe Lyon wrote: >> + offset = gen_rtx_CONST_INT (VOIDmode, location); > > Never call gen_rtx_CONST_INT directly. Use GEN_INT. > > Here is an updated patch with that small change. For the record, there are quite a few existing calls to gen_rtx_CONST_INT, maybe a cleanup pass is needed? Thanks, Christophe. 2012-09-03 Christophe Lyon gcc/ * config/arm/arm.c (arm_evpc_neon_vext): New function. (arm_expand_vec_perm_const_1): Add call to arm_evpc_neon_vext. gcc/testsuite/ * gcc.target/arm/neon-vext.c gcc.target/arm/neon-vext-execute.c: New tests. gcc-vec-permute-vext.patch Description: Binary data
Re: [PATCH, ARM] Constant vector permute for the Neon vext insn
On 09/03/12 09:59, Christophe Lyon wrote: On 31 August 2012 17:59, Richard Henderson wrote: On 2012-08-31 07:25, Christophe Lyon wrote: + offset = gen_rtx_CONST_INT (VOIDmode, location); Never call gen_rtx_CONST_INT directly. Use GEN_INT. Here is an updated patch with that small change. For the record, there are quite a few existing calls to gen_rtx_CONST_INT, maybe a cleanup pass is needed? A set of cleanup patches are welcome. This looks OK - thanks. Ramana