Hi Andre,

Thanks for fixing this.

On 19/02/2021 10:53, Andre Vieira (lists) via Gcc-patches wrote:
> Hi,
> 
> This patch makes sure that allocno copies are not created for unordered
> modes. The testcases in the PR highlighted a case where an allocno copy was
> being created for:
> (insn 121 120 123 11 (parallel [
>             (set (reg:VNx2QI 217)
>                 (vec_duplicate:VNx2QI (subreg/s/v:QI (reg:SI 93 [ _2 ]) 0)))
>             (clobber (scratch:VNx16BI))
>         ]) 4750 {*vec_duplicatevnx2qi_reg}
>      (expr_list:REG_DEAD (reg:SI 93 [ _2 ])
>         (nil)))
> 
> As the compiler detected that the vec_duplicate<mode>_reg pattern allowed
> the input and output operand to be of the same register class, it tried to
> create an allocno copy for these two operands, stripping subregs in the
> process. However, this meant that the copy was between VNx2QI and SI, which
> have unordered mode precisions.
> 
> So at compile time we do not know which of the two modes is smaller, which is
> a requirement when updating allocno copy costs.
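
A rough illustration of what "unordered" means here, as a toy C++ model
rather than GCC's real poly_int code. It assumes the precision of VNx2QI
is 16 + 16x bits and that of SI is a constant 32 bits, where x >= 0 is
only known at run time. ToyPoly, toy_known_le and toy_ordered_p are
made-up names for this sketch:

```cpp
// Toy model only: not GCC's poly_int implementation.
// A value is c0 + c1 * x, where x >= 0 is unknown until run time.
struct ToyPoly {
  long c0;  // constant part
  long c1;  // coefficient of the runtime-only term x
};

// True only if a <= b holds for every x >= 0.
constexpr bool toy_known_le(ToyPoly a, ToyPoly b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Two values are "ordered" if one is known to be <= the other for all x.
constexpr bool toy_ordered_p(ToyPoly a, ToyPoly b) {
  return toy_known_le(a, b) || toy_known_le(b, a);
}

// Assumed precisions in bits (see the note above the code).
constexpr ToyPoly vnx2qi_bits{16, 16};  // 16 + 16x
constexpr ToyPoly si_bits{32, 0};       // 32

// x = 0 gives 16 < 32, but x = 2 gives 48 > 32, so neither mode is
// known to be the smaller one at compile time.
static_assert(!toy_ordered_p(vnx2qi_bits, si_bits),
              "VNx2QI and SI precisions are unordered in this model");
```

With a fixed-size pair such as {8, 0} and {32, 0} the same check
succeeds, which is the situation the cost update relies on.
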
> 
> Regression tested on aarch64-linux-gnu.
> 
> Is this OK for trunk (and after a week backport to gcc-10) ?
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c b/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..ee0c7b51602cacd45f9e33acecb1eaa9f9edebf2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
> @@ -0,0 +1,12 @@
> +/* PR rtl-optimization/98791  */
> +/* { dg-do compile } */
> +/* { dg-options "-O -ftree-vectorize --param=aarch64-autovec-preference=3" } */
> +extern char a[], b[];
> +short c, d;
> +long *e;
> +void f() {
> +  for (int g; g < c; g += 1) {
> +    a[g] = d;
> +    b[g] = e[g];
> +  }
> +}

For the testcase, you might want to use the one I posted most recently:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98791#c3
It avoids the dependency on the aarch64-autovec-preference param
(which is in GCC 11 only), which will simplify backporting.

But if it's preferable to have a testcase without SVE intrinsics for GCC
11, then we should stick with the one in your patch.

Cheers,
Alex
