Issue 110611
Summary [Flang][LAA] TSVC s2101, s233: not vectorized because the extents of arrays are not constant
Labels loopoptim, vectorization, flang
Assignees
Reporter yus3710-fj
Flang can't vectorize the loops in `s2101` and `s233` of [TSVC](https://www.netlib.org/benchmark/vectors), while Clang considers the equivalent loops written in C legal to vectorize.
(Clang doesn't actually vectorize them, because its cost model finds vectorizing the strided accesses not beneficial.)

* Fortran
```fortran
! Fortran version
      subroutine s2101(ntimes,ld,n,ctime,dtime,a,b,c,d,e,aa,bb,cc)

      integer ntimes, ld, n, i, nl
      real a(n), b(n), c(n), d(n), e(n), aa(ld,n), bb(ld,n), cc(ld,n)

      call init(ld,n,a,b,c,d,e,aa,bb,cc,'s2101')
      do 10 i = 1,n
         aa(i,i) = aa(i,i) + bb(i,i) * cc(i,i)
   10 continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
      end
```
```console
$ flang-new -v -O3 -flang-experimental-integer-overflow s2101.f -S -Rpass=vector -Rpass-analysis=vector -Rpass-missed=vector
flang-new version 20.0.0git (https://github.com/llvm/llvm-project.git 2c770675ce36402b51a320ae26f369690c138dc1)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/build/bin
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Selected GCC installation: /usr/lib/gcc/aarch64-redhat-linux/11
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/build/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu generic -target-feature +outline-atomics -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -fversion-loops-for-stride -flang-experimental-integer-overflow -Rpass=vector -Rpass-analysis=vector -Rpass-missed=vector -resource-dir /path/to/build/lib/clang/20 -mframe-pointer=non-leaf -O3 -o /dev/null -x f95-cpp-input s2101.f
path/to/s2101.f:9:10: remark: loop not vectorized: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
Unsafe indirect dependence. Memory location is the same as accessed at s2101.f:9:10 [-Rpass-analysis=loop-vectorize]
path/to/s2101.f:8:7: remark: loop not vectorized [-Rpass-missed=loop-vectorize]
```

* C
```c
// C version
#define LEN 32000
#define LEN2 256
float a[LEN], b[LEN], c[LEN], d[LEN], e[LEN];
float aa[LEN2][LEN2], bb[LEN2][LEN2], cc[LEN2][LEN2];

int s2101() {
  init( "s2101");
  for (int i = 0; i < LEN2; i++) {
    aa[i][i] += bb[i][i] * cc[i][i];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
  return 0;
}
```
```console
$ clang -O3 s2101.c -S -Rpass=vector -Rpass-analysis=vector -Rpass-missed=vector
s2101.c:9:3: remark: the cost-model indicates that vectorization is not beneficial [-Rpass-analysis=loop-vectorize]
    9 |                 for (int i = 0; i < LEN2; i++) {
      |                 ^
s2101.c:9:3: remark: interleaved loop (interleaved count: 2) [-Rpass=loop-vectorize]
```

In Fortran, array extents are often not known at compile time. LAA, on the other hand, requires the pointer stride to be a compile-time constant. In `s2101`, for example, `aa(i,i)` advances by `ld + 1` elements per iteration, and `ld` is a dummy argument.
I suspect this constraint is too restrictive. IIUC, it is sufficient for vectorization that the pointer stride is loop-invariant and never zero. SCEV can tell that, but LAA doesn't check it at the moment.
(This might be resolved by MLIR or the polyhedral model.)
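
To make the stride concrete, here is a minimal C sketch of the access pattern the Fortran loop produces when the leading dimension is only known at run time. The names `diag_fma` and `s2101_like`, the flat-pointer layout, and the use of `size_t` are assumptions made for illustration; they are not part of TSVC, and this is not claimed to reproduce the exact LAA remark under Clang.

```c
#include <stddef.h>

/* Hypothetical helper (not from TSVC): walks three arrays with the same
   run-time stride. The stride is loop-invariant but unknown at compile time. */
static void diag_fma(size_t stride, size_t n,
                     float *aa, const float *bb, const float *cc) {
  for (size_t i = 0; i < n; i++) {
    size_t k = i * stride;
    aa[k] += bb[k] * cc[k];
  }
}

/* Hypothetical analogue of the Fortran s2101 loop: with column-major ld-by-n
   storage, aa(i,i) sits at offset (i-1)*(ld+1), so the stride is ld + 1. */
void s2101_like(size_t ld, size_t n,
                float *aa, const float *bb, const float *cc) {
  diag_fma(ld + 1, n, aa, bb, cc);
}
```

If `stride` were zero, every iteration would update `aa[0]`, which is a genuine loop-carried dependence. For any non-zero loop-invariant stride, distinct iterations touch distinct elements of `aa`, which is the property the paragraph above argues should be enough.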
