https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 31 Jul 2019, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579
> 
> --- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
> Transform the second loop as follows:
> 
> diff --git a/loop.c b/loop.c
> index feea9ea..81a3ea6 100644
> --- a/loop.c
> +++ b/loop.c
> @@ -9,6 +9,6 @@ loop (int k, double x)
>    for (i=0;i<6;i++)
>      r[i] = x * a[i + k];
>    for (i=0;i<6;i++)
> -    t+=r[5-i];
> +    t+=r[i];  /* ascending order, matching the first loop */
>    return t;
>  }
> }
> 
> This avoids store-forwarding stalls.
> 
> Before loop transform:
> 
> loop_avx256: 3710992
> loop       : 671995
> loop_avx128: 650882
> 
> After loop transform:
> 
> loop_avx256: 661386
> loop       : 652932
> loop_avx128: 568710

Since the loop is probably unrolled, this would be a task for
reassociation, which should try to arrange the data dependences so that
the scheduler can then order memory accesses in ascending address order
without increasing register pressure (this would also help with using
pre/post-increment addressing modes on some targets).  Currently, the
operand rank for memory accesses is determined only by looking at the
rank of SSA uses (of which there may be none).
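
For reference, here is a minimal sketch of the two summation orders being
discussed. Only the two loops come from the diff quoted above; the contents
and size of `a` and the names `loop_descending` and `loop_ascending` are
assumptions made for illustration. Note that the two orders are only
guaranteed to give identical results when the additions are exact, which is
why a compiler would presumably need something like -fassociative-math
before reordering them itself.

```c
/* Hypothetical input array; the report does not show its definition. */
static double a[16] = { 1,  2,  3,  4,  5,  6,  7,  8,
                        9, 10, 11, 12, 13, 14, 15, 16};

/* Original shape: r[] is stored in ascending order, then read back
   in descending order (r[5], r[4], ..., r[0]). */
double loop_descending(int k, double x)
{
    double r[6], t = 0.0;
    int i;

    for (i = 0; i < 6; i++)
        r[i] = x * a[i + k];
    for (i = 0; i < 6; i++)
        t += r[5 - i];
    return t;
}

/* Transformed shape: the reads follow the same ascending order as
   the stores in the first loop. */
double loop_ascending(int k, double x)
{
    double r[6], t = 0.0;
    int i;

    for (i = 0; i < 6; i++)
        r[i] = x * a[i + k];
    for (i = 0; i < 6; i++)
        t += r[i];
    return t;
}
```

With these small integer inputs every product and partial sum is exactly
representable, so both functions return the same value; with arbitrary
doubles the results may differ in the last bits.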
