subject:"\[PATCH\]\[ARM\] Add movv4hf\/v8hf expanders \& later insns; disable VnHF immediates."

Re: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

2016-01-18 Thread Christophe Lyon

On 18 January 2016 at 14:12, Kyrill Tkachov  wrote:
> Hi Alan,
>
>
> On 18/01/16 12:14, Alan Lawrence wrote:
>>
>> This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
>> check_effective_target_arm_neon_fp_16_ok.
>>
>> At present, without the expander, moving v4hf/v8hf values around is done
>> via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On
>> arm-*,
>> moving via two subregs is less efficient than one native insn!)
>>
>> However, adding the expanders, reveals a latent bug in the V4HF variant of
>> *neon_mov, that vector constants are not handled properly in the
>> neon_valid_immediate code. Hence, for now I've used a separate expander
>> that
>> disallows immediates, and disabled VnHF vectors as immediates in
>> neon_valid_immediate_for_move; I'll file a PR for this.
>>
>> Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector
>> modes to
>> the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
>> identical HF elements is now done by loading the scalar + vdup rather than
>> forcing the vector out to the constant pool.
>>
>> On armeb, one of the ICEs this fixes, is in the test program for
>> check_effective_target_arm_neon_fp_16_ok. This means the
>> advsimd-intrinsics
>> vcvt_f16 test now runs (and passes), and also that the other tests now run
>> with neon-fp16, rather than only neon as previously (on armeb).
>> This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
>> failing. Since those tests would previously have failed *if fp16 had been
>> passed in*, I think this is still a step forward; one can still run the
>> tests
>> with an explicit non-fp16 multilib if the old behaviour is desired.
>>
>> Note the previous patch removes other uses of VQXMOV (not strictly a
>> dependency,
>> generating V4HF/V8HF reinterpret patterns is harmless, they just aren't
>> used).
>>
>> Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
>> cross-tested armeb-none-eabi.
>
>
> Seems that you and Christophe have some duplicated work here?
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01031.html
>

Indeed, that's unfortunate :-(
I had filed PR 68620 to avoid duplication.

Both patches look similar, fortunately.

Christophe.


> Kyrill
>
>
>> gcc/ChangeLog:
>>
>> * config/arm/arm.c (neon_valid_immediate): Disallow vectors of
>> HFmode.
>> * config/arm/iterators.md (V_HF): New.
>> (VQXMOV): Add V8HF.
>> (VX): Add V4HF, V8HF.
>> (V_reg): Add cases for V4HF, V8HF.
>> * config/arm/vec-common.md (mov V_HF): New.
>> ---
>>   gcc/config/arm/arm.c |  2 ++
>>   gcc/config/arm/iterators.md  |  8 ++--
>>   gcc/config/arm/vec-common.md | 20 
>>   3 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 3276b03..4fdba38 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode,
>> int inverse,
>> /* Vectors of float constants.  */
>> if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
>>   {
>> +  if (GET_MODE_INNER (mode) == HFmode)
>> +   return -1;
>> rtx el0 = CONST_VECTOR_ELT (op, 0);
>> const REAL_VALUE_TYPE *r0;
>>   diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
>> index 974cf51..c5db868 100644
>> --- a/gcc/config/arm/iterators.md
>> +++ b/gcc/config/arm/iterators.md
>> @@ -59,6 +59,9 @@
>>   ;; Integer and float modes supported by Neon and IWMMXT.
>>   (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI
>> V4SF])
>>   +;; Vectors of half-precision floats.
>> +(define_mode_iterator V_HF [V4HF V8HF])
>> +
>>   ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
>>   (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
>>   @@ -99,7 +102,7 @@
>>   (define_mode_iterator VQI [V16QI V8HI V4SI])
>> ;; Quad-width vector modes, with TImode added, for moves.
>> -(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
>> +(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
>> ;; Opaque structure types wider than TImode.
>>   (define_mode_iterator VSTRUCT [EI OI CI XI])
>> @@ -160,7 +163,7 @@
>>   (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
>> ;; Modes with 8-bit and 16-bit elements.
>> -(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
>> +(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
>> ;; Modes with 8-bit elements.
>>   (define_mode_iterator VE [V8QI V16QI])
>> @@ -428,6 +431,7 @@
>>   ;; Register width from element mode
>>   (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
>>(V4HI "P") (V8HI  "q")
>> +(V4HF "P") (V8HF  "q")
>>(V2SI "P") (V4SI  "q")
>>(V2SF "P") (V4SF  "q")
>>(DI   "P") (V2DI  "q")
>> diff --git a/gcc/config/arm/v

Re: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

2016-01-18 Thread Kyrill Tkachov


Hi Alan,

On 18/01/16 12:14, Alan Lawrence wrote:

This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
check_effective_target_arm_neon_fp_16_ok.

At present, without the expander, moving v4hf/v8hf values around is done
via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On arm-*,
moving via two subregs is less efficient than one native insn!)

However, adding the expanders, reveals a latent bug in the V4HF variant of
*neon_mov, that vector constants are not handled properly in the
neon_valid_immediate code. Hence, for now I've used a separate expander that
disallows immediates, and disabled VnHF vectors as immediates in
neon_valid_immediate_for_move; I'll file a PR for this.

Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector modes to
the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
identical HF elements is now done by loading the scalar + vdup rather than
forcing the vector out to the constant pool.

On armeb, one of the ICEs this fixes, is in the test program for
check_effective_target_arm_neon_fp_16_ok. This means the advsimd-intrinsics
vcvt_f16 test now runs (and passes), and also that the other tests now run
with neon-fp16, rather than only neon as previously (on armeb).
This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
failing. Since those tests would previously have failed *if fp16 had been
passed in*, I think this is still a step forward; one can still run the tests
with an explicit non-fp16 multilib if the old behaviour is desired.

Note the previous patch removes other uses of VQXMOV (not strictly a dependency,
generating V4HF/V8HF reinterpret patterns is harmless, they just aren't used).

Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
cross-tested armeb-none-eabi.


Seems that you and Christophe have some duplicated work here?
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01031.html

Kyrill


gcc/ChangeLog:

* config/arm/arm.c (neon_valid_immediate): Disallow vectors of HFmode.
* config/arm/iterators.md (V_HF): New.
(VQXMOV): Add V8HF.
(VX): Add V4HF, V8HF.
(V_reg): Add cases for V4HF, V8HF.
* config/arm/vec-common.md (mov V_HF): New.
---
  gcc/config/arm/arm.c |  2 ++
  gcc/config/arm/iterators.md  |  8 ++--
  gcc/config/arm/vec-common.md | 20 
  3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3276b03..4fdba38 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode, int 
inverse,
/* Vectors of float constants.  */
if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
  {
+  if (GET_MODE_INNER (mode) == HFmode)
+   return -1;
rtx el0 = CONST_VECTOR_ELT (op, 0);
const REAL_VALUE_TYPE *r0;
  
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md

index 974cf51..c5db868 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -59,6 +59,9 @@
  ;; Integer and float modes supported by Neon and IWMMXT.
  (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
  
+;; Vectors of half-precision floats.

+(define_mode_iterator V_HF [V4HF V8HF])
+
  ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
  (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
  
@@ -99,7 +102,7 @@

  (define_mode_iterator VQI [V16QI V8HI V4SI])
  
  ;; Quad-width vector modes, with TImode added, for moves.

-(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
+(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
  
  ;; Opaque structure types wider than TImode.

  (define_mode_iterator VSTRUCT [EI OI CI XI])
@@ -160,7 +163,7 @@
  (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
  
  ;; Modes with 8-bit and 16-bit elements.

-(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
+(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
  
  ;; Modes with 8-bit elements.

  (define_mode_iterator VE [V8QI V16QI])
@@ -428,6 +431,7 @@
  ;; Register width from element mode
  (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
   (V4HI "P") (V8HI  "q")
+(V4HF "P") (V8HF  "q")
   (V2SI "P") (V4SI  "q")
   (V2SF "P") (V4SF  "q")
   (DI   "P") (V2DI  "q")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index ce98f71..c27578a 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -38,6 +38,26 @@
  }
  })
  
+;; This exists separately from the above pattern to exclude an immediate RHS.

+
+(define_expand "mov"
+  [(set (match_operand:V_HF 0 "nonimmediate_operand" "")
+   (match_operand:V_HF 1 "nonimmediate_operand" ""))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (mode))"
+{
+  if

[PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

2016-01-18 Thread Alan Lawrence

This fixes ICEs on armeb for float16x[48]_t vectors, e.g. in
check_effective_target_arm_neon_fp_16_ok.

At present, without the expander, moving v4hf/v8hf values around is done
via subregs. On armeb, this ICEs because REG_CANNOT_CHANGE_MODE_P. (On arm-*,
moving via two subregs is less efficient than one native insn!)

However, adding the expanders, reveals a latent bug in the V4HF variant of
*neon_mov, that vector constants are not handled properly in the
neon_valid_immediate code. Hence, for now I've used a separate expander that
disallows immediates, and disabled VnHF vectors as immediates in
neon_valid_immediate_for_move; I'll file a PR for this.

Also to fix the advsimd-intrinsics/vcombine test I had to add HF vector modes to
the VX iterator and V_reg attribute, for vdup_n, as loading a vector of
identical HF elements is now done by loading the scalar + vdup rather than
forcing the vector out to the constant pool.

On armeb, one of the ICEs this fixes, is in the test program for
check_effective_target_arm_neon_fp_16_ok. This means the advsimd-intrinsics
vcvt_f16 test now runs (and passes), and also that the other tests now run
with neon-fp16, rather than only neon as previously (on armeb).
This reveals that the fp16 cases of vld1_lane and vset_lane are (and were)
failing. Since those tests would previously have failed *if fp16 had been
passed in*, I think this is still a step forward; one can still run the tests
with an explicit non-fp16 multilib if the old behaviour is desired.

Note the previous patch removes other uses of VQXMOV (not strictly a dependency,
generating V4HF/V8HF reinterpret patterns is harmless, they just aren't used).

Bootstrapped + check-gcc on arm-none-linux-gnueabihf;
cross-tested armeb-none-eabi.

gcc/ChangeLog:

* config/arm/arm.c (neon_valid_immediate): Disallow vectors of HFmode.
* config/arm/iterators.md (V_HF): New.
(VQXMOV): Add V8HF.
(VX): Add V4HF, V8HF.
(V_reg): Add cases for V4HF, V8HF.
* config/arm/vec-common.md (mov V_HF): New.
---
 gcc/config/arm/arm.c |  2 ++
 gcc/config/arm/iterators.md  |  8 ++--
 gcc/config/arm/vec-common.md | 20 
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3276b03..4fdba38 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12371,6 +12371,8 @@ neon_valid_immediate (rtx op, machine_mode mode, int 
inverse,
   /* Vectors of float constants.  */
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
 {
+  if (GET_MODE_INNER (mode) == HFmode)
+   return -1;
   rtx el0 = CONST_VECTOR_ELT (op, 0);
   const REAL_VALUE_TYPE *r0;
 
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 974cf51..c5db868 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -59,6 +59,9 @@
 ;; Integer and float modes supported by Neon and IWMMXT.
 (define_mode_iterator VALL [V2DI V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
 
+;; Vectors of half-precision floats.
+(define_mode_iterator V_HF [V4HF V8HF])
+
 ;; Integer and float modes supported by Neon and IWMMXT, except V2DI.
 (define_mode_iterator VALLW [V2SI V4HI V8QI V2SF V4SI V8HI V16QI V4SF])
 
@@ -99,7 +102,7 @@
 (define_mode_iterator VQI [V16QI V8HI V4SI])
 
 ;; Quad-width vector modes, with TImode added, for moves.
-(define_mode_iterator VQXMOV [V16QI V8HI V4SI V4SF V2DI TI])
+(define_mode_iterator VQXMOV [V16QI V8HI V8HF V4SI V4SF V2DI TI])
 
 ;; Opaque structure types wider than TImode.
 (define_mode_iterator VSTRUCT [EI OI CI XI])
@@ -160,7 +163,7 @@
 (define_mode_iterator VMDQI [V4HI V2SI V8HI V4SI])
 
 ;; Modes with 8-bit and 16-bit elements.
-(define_mode_iterator VX [V8QI V4HI V16QI V8HI])
+(define_mode_iterator VX [V8QI V4HI V4HF V16QI V8HI V8HF])
 
 ;; Modes with 8-bit elements.
 (define_mode_iterator VE [V8QI V16QI])
@@ -428,6 +431,7 @@
 ;; Register width from element mode
 (define_mode_attr V_reg [(V8QI "P") (V16QI "q")
  (V4HI "P") (V8HI  "q")
+(V4HF "P") (V8HF  "q")
  (V2SI "P") (V4SI  "q")
  (V2SF "P") (V4SF  "q")
  (DI   "P") (V2DI  "q")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index ce98f71..c27578a 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -38,6 +38,26 @@
 }
 })
 
+;; This exists separately from the above pattern to exclude an immediate RHS.
+
+(define_expand "mov"
+  [(set (match_operand:V_HF 0 "nonimmediate_operand" "")
+   (match_operand:V_HF 1 "nonimmediate_operand" ""))]
+  "TARGET_NEON
+   || (TARGET_REALLY_IWMMXT && VALID_IWMMXT_REG_MODE (mode))"
+{
+  if (can_create_pseudo_p ())
+{
+  if (!REG_P (operands[0]))
+   operands[1] = force_reg (mode, operands[1]);
+  else if (TARGET_NEON && CONSTANT_P (operands[1]))
+   {
+ operands[1] = neon_make_constant

Re: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

Re: [PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

[PATCH][ARM] Add movv4hf/v8hf expanders & later insns; disable VnHF immediates.

3 matches

Site Navigation

Mail list logo

Footer information