Re: [ARM] Fix vget_lane for big-endian targets
On 21 July 2015 at 16:01, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
> On 16/07/15 08:56, Christophe Lyon wrote:
>> AdvSIMD vget_lane tests currently fail on armeb targets when dealing
>> with vectors of 2 64-bit elements.
>> This patch fixes it, by adding a code fragment similar to what is
>> done in other cases.
>> I could have simplified it a bit given that the vector width is
>> known, but I chose to hardcode 'reg_nelts = 2' to keep the code
>> closer to what is done elsewhere.
>>
>> OK for trunk?
>>
>> Christophe
>>
>> 2015-07-16  Christophe Lyon  christophe.l...@linaro.org
>>
>>         * config/arm/neon.md (neon_vget_lanev2di): Handle big-endian
>>         targets.
>
> I see we do this for other lanewise patterns as well.
> Has this been tested on an arm big-endian target? If so, ok for trunk.

I forgot to mention that yes, I actually tested it on arm big-endian,
using QEMU.

Christophe.

> Thanks,
> Kyrill
>
>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
>> index 654d9d5..59ddc5b 100644
>> --- a/gcc/config/arm/neon.md
>> +++ b/gcc/config/arm/neon.md
>> @@ -2736,6 +2736,19 @@
>>     (match_operand:SI 2 "immediate_operand" "")]
>>    "TARGET_NEON"
>>  {
>> +  if (BYTES_BIG_ENDIAN)
>> +    {
>> +      /* The intrinsics are defined in terms of a model where the
>> +         element ordering in memory is vldm order, whereas the generic
>> +         RTL is defined in terms of a model where the element ordering
>> +         in memory is array order.  Convert the lane number to conform
>> +         to this model.  */
>> +      unsigned int elt = INTVAL (operands[2]);
>> +      unsigned int reg_nelts = 2;
>> +      elt ^= reg_nelts - 1;
>> +      operands[2] = GEN_INT (elt);
>> +    }
>> +
>>    switch (INTVAL (operands[2]))
>>      {
>>      case 0:
Re: [ARM] Fix vget_lane for big-endian targets
On 4 August 2015 at 14:09, Christophe Lyon christophe.l...@linaro.org wrote:
> On 21 July 2015 at 16:01, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
>> On 16/07/15 08:56, Christophe Lyon wrote:
>>> AdvSIMD vget_lane tests currently fail on armeb targets when dealing
>>> with vectors of 2 64-bit elements.
>>> This patch fixes it, by adding a code fragment similar to what is
>>> done in other cases.
>>> I could have simplified it a bit given that the vector width is
>>> known, but I chose to hardcode 'reg_nelts = 2' to keep the code
>>> closer to what is done elsewhere.
>>>
>>> OK for trunk?
>>>
>>> Christophe
>>>
>>> 2015-07-16  Christophe Lyon  christophe.l...@linaro.org
>>>
>>>         * config/arm/neon.md (neon_vget_lanev2di): Handle big-endian
>>>         targets.
>>
>> I see we do this for other lanewise patterns as well.
>> Has this been tested on an arm big-endian target? If so, ok for trunk.
>
> I forgot to mention that yes, I actually tested it on arm big-endian,
> using QEMU.

Since Alan committed his patch, there was a conflict with mine.
Here is what I committed, the change being obvious enough IMO.
(I did re-run make check on armeb using qemu.)

Christophe

> Christophe.
>
>> Thanks,
>> Kyrill
>>
>>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
>>> index 654d9d5..59ddc5b 100644
>>> --- a/gcc/config/arm/neon.md
>>> +++ b/gcc/config/arm/neon.md
>>> @@ -2736,6 +2736,19 @@
>>>     (match_operand:SI 2 "immediate_operand" "")]
>>>    "TARGET_NEON"
>>>  {
>>> +  if (BYTES_BIG_ENDIAN)
>>> +    {
>>> +      /* The intrinsics are defined in terms of a model where the
>>> +         element ordering in memory is vldm order, whereas the generic
>>> +         RTL is defined in terms of a model where the element ordering
>>> +         in memory is array order.  Convert the lane number to conform
>>> +         to this model.  */
>>> +      unsigned int elt = INTVAL (operands[2]);
>>> +      unsigned int reg_nelts = 2;
>>> +      elt ^= reg_nelts - 1;
>>> +      operands[2] = GEN_INT (elt);
>>> +    }
>>> +
>>>    switch (INTVAL (operands[2]))
>>>      {
>>>      case 0:

2015-08-04  Christophe Lyon  christophe.l...@linaro.org

        * config/arm/neon.md (neon_vget_lanev2di): Handle big-endian
        targets.
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 4af74ce..b1bf26a 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2731,7 +2731,22 @@
    (match_operand:SI 2 "immediate_operand" "")]
   "TARGET_NEON"
 {
-  int lane = INTVAL (operands[2]);
+  int lane;
+
+  if (BYTES_BIG_ENDIAN)
+    {
+      /* The intrinsics are defined in terms of a model where the
+         element ordering in memory is vldm order, whereas the generic
+         RTL is defined in terms of a model where the element ordering
+         in memory is array order.  Convert the lane number to conform
+         to this model.  */
+      unsigned int elt = INTVAL (operands[2]);
+      unsigned int reg_nelts = 2;
+      elt ^= reg_nelts - 1;
+      operands[2] = GEN_INT (elt);
+    }
+
+  lane = INTVAL (operands[2]);
   gcc_assert ((lane ==0) || (lane == 1));
   emit_move_insn (operands[0],
                  lane == 0 ? gen_lowpart (DImode, operands[1])
Re: [ARM] Fix vget_lane for big-endian targets
On 16/07/15 08:56, Christophe Lyon wrote:
> AdvSIMD vget_lane tests currently fail on armeb targets when dealing
> with vectors of 2 64-bit elements.
> This patch fixes it, by adding a code fragment similar to what is
> done in other cases.
> I could have simplified it a bit given that the vector width is
> known, but I chose to hardcode 'reg_nelts = 2' to keep the code
> closer to what is done elsewhere.
>
> OK for trunk?
>
> Christophe
>
> 2015-07-16  Christophe Lyon  christophe.l...@linaro.org
>
>         * config/arm/neon.md (neon_vget_lanev2di): Handle big-endian
>         targets.

I see we do this for other lanewise patterns as well.
Has this been tested on an arm big-endian target? If so, ok for trunk.

Thanks,
Kyrill

> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
> index 654d9d5..59ddc5b 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -2736,6 +2736,19 @@
>     (match_operand:SI 2 "immediate_operand" "")]
>    "TARGET_NEON"
>  {
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      /* The intrinsics are defined in terms of a model where the
> +         element ordering in memory is vldm order, whereas the generic
> +         RTL is defined in terms of a model where the element ordering
> +         in memory is array order.  Convert the lane number to conform
> +         to this model.  */
> +      unsigned int elt = INTVAL (operands[2]);
> +      unsigned int reg_nelts = 2;
> +      elt ^= reg_nelts - 1;
> +      operands[2] = GEN_INT (elt);
> +    }
> +
>    switch (INTVAL (operands[2]))
>      {
>      case 0: