Hi Andre,

Thanks for fixing this.

On 19/02/2021 10:53, Andre Vieira (lists) via Gcc-patches wrote:
> Hi,
>
> This patch makes sure that allocno copies are not created for unordered
> modes. The testcases in the PR highlighted a case where an allocno copy was
> being created for:
> (insn 121 120 123 11 (parallel [
>             (set (reg:VNx2QI 217)
>                 (vec_duplicate:VNx2QI (subreg/s/v:QI (reg:SI 93 [ _2 ]) 0)))
>             (clobber (scratch:VNx16BI))
>         ]) 4750 {*vec_duplicatevnx2qi_reg}
>      (expr_list:REG_DEAD (reg:SI 93 [ _2 ])
>         (nil)))
>
> As the compiler detected that the vec_duplicate<mode>_reg pattern allowed
> the input and output operand to be of the same register class, it tried to
> create an allocno copy for these two operands, stripping subregs in the
> process. However, this meant that the copy was between VNx2QI and SI, which
> have unordered mode precisions.
>
> So at compile time we do not know which of the two modes is smaller which is
> a requirement when updating allocno copy costs.
>
> Regression tested on aarch64-linux-gnu.
>
> Is this OK for trunk (and after a week backport to gcc-10) ?
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
> b/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..ee0c7b51602cacd45f9e33acecb1eaa9f9edebf2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr98791.c
> @@ -0,0 +1,12 @@
> +/* PR rtl-optimization/98791 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -ftree-vectorize --param=aarch64-autovec-preference=3" }
> */
> +extern char a[], b[];
> +short c, d;
> +long *e;
> +void f() {
> +  for (int g; g < c; g += 1) {
> +    a[g] = d;
> +    b[g] = e[g];
> +  }
> +}

For the testcase, you might want to use the one I posted most recently:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98791#c3

which avoids the dependency on the aarch64-autovec-preference param
(which is in GCC 11 only) as this will simplify backporting.
But if it's preferable to have a testcase without SVE intrinsics for GCC 11,
then we should stick with that.

Cheers,
Alex
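
For anyone following along, here is a minimal sketch of why VNx2QI and SI count as "unordered". It is not GCC code: struct poly2, known_le_sketch and ordered_sketch are hypothetical names that only mimic the idea behind poly_int comparisons, and the byte sizes used (2 + 2x for VNx2QI, 4 for SI, with x >= 0 fixed only at run time) are my assumption about how the SVE modes are sized.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a two-coefficient polynomial size:
   the value is c0 + c1 * x, where x >= 0 is only known at run time.  */
struct poly2
{
  long c0;
  long c1;
};

/* True if a <= b holds for every x >= 0.  With x >= 0 this needs both
   the constant term and the coefficient of x to compare that way.  */
static bool
known_le_sketch (struct poly2 a, struct poly2 b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

/* Two sizes are "ordered" if one is known to be <= the other for all x.  */
static bool
ordered_sketch (struct poly2 a, struct poly2 b)
{
  return known_le_sketch (a, b) || known_le_sketch (b, a);
}

/* Assumed sizes in bytes (illustrative only).  */
static const struct poly2 size_vnx2qi = { 2, 2 }; /* 2 + 2x */
static const struct poly2 size_si = { 4, 0 };     /* 4 */
static const struct poly2 size_qi = { 1, 0 };     /* 1 */
```

Under these assumptions ordered_sketch (size_vnx2qi, size_si) is false: at x = 0 the vector mode is smaller than SI, at x = 2 it is larger. By contrast, size_qi compares as known-smaller against both, which is why stripping the subreg down to the SI register is what introduces the problem.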