Re: [PATCH] internal-fn: Do not force vcond operand to reg.
> OK if that pre-commit CI works out. The CI didn't pick it up, guess it needs to be a bit more explicit. In the meanwhile, however, I managed to catch a short window with > 10G free on gcc185 => Bootstrap and regtest successful on aarch64. Going to push the patch later today. Regards Robin
Re: [PATCH] internal-fn: Do not force vcond operand to reg.
On Mon, May 13, 2024 at 4:14 PM Robin Dapp wrote: > > > What happens if we simply remove all of the force_reg here? > > On x86 I bootstrapped and tested the attached without fallout > (gcc188, so it's no avx512-native machine and therefore limited > coverage). riscv regtest is unchanged. > For aarch64 I would to rely on the pre-commit CI to pick it > up (does that work on sub-threads?). OK if that pre-commit CI works out. Richard. > Regards > Robin > > > gcc/ChangeLog: > > PR middle-end/113474 > > * internal-fn.cc (expand_vec_cond_mask_optab_fn): Remove > force_regs. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/autovec/pr113474.c: New test. > --- > gcc/internal-fn.cc | 3 --- > .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 + > 2 files changed, 13 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 2c764441cde..4d226c478b4 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall > *stmt, convert_optab optab) >rtx_op1 = expand_normal (op1); >rtx_op2 = expand_normal (op2); > > - mask = force_reg (mask_mode, mask); > - rtx_op1 = force_reg (mode, rtx_op1); > - >rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); >create_output_operand ([0], target, mode); >create_input_operand ([1], rtx_op1, mode); > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > new file mode 100644 > index 000..0364bf9f5e3 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile { target riscv_v } } */ > +/* { dg-additional-options "-std=c99" } */ > + > +void > +foo (int n, int **a) > +{ > + int b; > + for (b = 0; b < n; b++) > +for (long e = 8; e > 0; e--) > + a[b][e] = a[b][e] == 15; > +} > + > +/* { dg-final { scan-assembler "vmerge.vim" } } */ > -- > 2.45.0 >
Re: [PATCH] internal-fn: Do not force vcond operand to reg.
> What happens if we simply remove all of the force_reg here? On x86 I bootstrapped and tested the attached without fallout (gcc188, so it's no avx512-native machine and therefore limited coverage). riscv regtest is unchanged. For aarch64 I would to rely on the pre-commit CI to pick it up (does that work on sub-threads?). Regards Robin gcc/ChangeLog: PR middle-end/113474 * internal-fn.cc (expand_vec_cond_mask_optab_fn): Remove force_regs. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113474.c: New test. --- gcc/internal-fn.cc | 3 --- .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 + 2 files changed, 13 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 2c764441cde..4d226c478b4 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab) rtx_op1 = expand_normal (op1); rtx_op2 = expand_normal (op2); - mask = force_reg (mask_mode, mask); - rtx_op1 = force_reg (mode, rtx_op1); - rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); create_output_operand ([0], target, mode); create_input_operand ([1], rtx_op1, mode); diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c new file mode 100644 index 000..0364bf9f5e3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target riscv_v } } */ +/* { dg-additional-options "-std=c99" } */ + +void +foo (int n, int **a) +{ + int b; + for (b = 0; b < n; b++) +for (long e = 8; e > 0; e--) + a[b][e] = a[b][e] == 15; +} + +/* { dg-final { scan-assembler "vmerge.vim" } } */ -- 2.45.0
Re: [PATCH] internal-fn: Do not force vcond operand to reg.
On Mon, May 13, 2024 at 8:18 AM Robin Dapp wrote: > > > How does this make a difference in the end? I'd expect say forwprop to > > fix things? > > In general we try to only add the masking "boilerplate" of our > instructions at split time so fwprop, combine et al. can do their > work uninhibited of it (and we don't need numerous > (if_then_else ... (if_then_else) ...) combinations in our patterns). > A vec constant we expand directly to a masked representation, though > which makes further simplification difficult. I can experiment with > changing that if preferred. > > My thinking was, however, that for other operations like binops we > directly emit the right variant via expand_operands without > forcing to a reg and don't even need to fwprop so I wanted to > imitate that. Ah, so yeah, it probably makes sense for constants. Btw, there's prepare_operand which I think might be better for its CONST_INT handling? I can also see we usually do not bother with force_reg, the force_reg was added with the initial r6-4696-ga414c77f2a30bb already. What happens if we simply remove all of the force_reg here? Thanks, Richard. > Regards > Robin >
Re: [PATCH] internal-fn: Do not force vcond operand to reg.
> How does this make a difference in the end? I'd expect say forwprop to > fix things? In general we try to only add the masking "boilerplate" of our instructions at split time so fwprop, combine et al. can do their work uninhibited of it (and we don't need numerous (if_then_else ... (if_then_else) ...) combinations in our patterns). A vec constant we expand directly to a masked representation, though which makes further simplification difficult. I can experiment with changing that if preferred. My thinking was, however, that for other operations like binops we directly emit the right variant via expand_operands without forcing to a reg and don't even need to fwprop so I wanted to imitate that. Regards Robin
Re: [PATCH] internal-fn: Do not force vcond operand to reg.
On Fri, May 10, 2024 at 3:18 PM Robin Dapp wrote: > > Hi, > > this only forces the first comparison operator into a register if it is > not already suitable. > > Bootstrap and regtest is running on x86 and aarch64, successful on p10. > Regtested on riscv. How does this make a difference in the end? I'd expect say forwprop to fix things? > gcc/ChangeLog: > > PR middle-end/113474 > > * internal-fn.cc (expand_vec_cond_mask_optab_fn): Only force > op1 to reg if necessary. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/autovec/pr113474.c: New test. > > Regards > Robin > > --- > gcc/internal-fn.cc | 3 ++- > .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 + > 2 files changed, 15 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index 2c764441cde..72cc6b7a1f7 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -3164,7 +3164,8 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall > *stmt, convert_optab optab) >rtx_op2 = expand_normal (op2); > >mask = force_reg (mask_mode, mask); > - rtx_op1 = force_reg (mode, rtx_op1); > + if (!insn_operand_matches (icode, 1, rtx_op1)) > +rtx_op1 = force_reg (mode, rtx_op1); > >rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); >create_output_operand ([0], target, mode); > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > new file mode 100644 > index 000..0364bf9f5e3 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile { target riscv_v } } */ > +/* { dg-additional-options "-std=c99" } */ > + > +void > +foo (int n, int **a) > +{ > + int b; > + for (b = 0; b < n; b++) > +for (long e = 8; e > 0; e--) > + a[b][e] = a[b][e] == 15; > +} > + > +/* { dg-final { scan-assembler "vmerge.vim" } } */ > -- > 2.45.0
[PATCH] internal-fn: Do not force vcond operand to reg.
Hi, this only forces the first comparison operator into a register if it is not already suitable. Bootstrap and regtest is running on x86 and aarch64, successful on p10. Regtested on riscv. gcc/ChangeLog: PR middle-end/113474 * internal-fn.cc (expand_vec_cond_mask_optab_fn): Only force op1 to reg if necessary. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr113474.c: New test. Regards Robin --- gcc/internal-fn.cc | 3 ++- .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 + 2 files changed, 15 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index 2c764441cde..72cc6b7a1f7 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -3164,7 +3164,8 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab) rtx_op2 = expand_normal (op2); mask = force_reg (mask_mode, mask); - rtx_op1 = force_reg (mode, rtx_op1); + if (!insn_operand_matches (icode, 1, rtx_op1)) +rtx_op1 = force_reg (mode, rtx_op1); rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); create_output_operand ([0], target, mode); diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c new file mode 100644 index 000..0364bf9f5e3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target riscv_v } } */ +/* { dg-additional-options "-std=c99" } */ + +void +foo (int n, int **a) +{ + int b; + for (b = 0; b < n; b++) +for (long e = 8; e > 0; e--) + a[b][e] = a[b][e] == 15; +} + +/* { dg-final { scan-assembler "vmerge.vim" } } */ -- 2.45.0