[Bug fortran/85531] Implement some loop fusion in the Fortran front end
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85531

--- Comment #5 from rguenther at suse dot de ---
On April 26, 2018 6:09:40 PM GMT+02:00, "tkoenig at gcc dot gnu.org" wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85531
>
>--- Comment #4 from Thomas Koenig ---
>What is the best strategy on this?
>
>I assume the Fortran front end could do a dependency analysis,
>the existing code could be extended for this.
>
>We could then either do the scalarization in the front end, or
>annotate the generated loops in some way to indicate that it
>is OK to merge them.
>
>What would be preferred?

Well. I think we need sth in the middle end. In the end the question
will be whether that's good enough or whether the frontend can do
better in some cases. We _do_ have issues with the frontend lowering
everything to 1-dimensional accesses.
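To make the last remark concrete: a rank-2 reference such as a(i,j) can equally be written as one linear offset into the array's storage, and once only the linear form is visible, the per-dimension subscripts are harder to recover for dependence analysis. The sketch below only demonstrates that equivalence at the source level for column-major storage. The names `a` and `flat` and the sizes are made up for illustration, and it does not reproduce what gfortran actually emits.

```fortran
program linearized_access
  implicit none
  integer, parameter :: n = 3, m = 4
  double precision, target  :: a(n, m)
  double precision, pointer :: flat(:)
  integer :: i, j
  logical :: same

  call random_number(a)

  ! Rank-1 view of the same storage (Fortran 2008 pointer bounds remapping).
  flat(1:n*m) => a

  same = .true.
  do j = 1, m
     do i = 1, n
        ! Column-major order: element (i,j) sits at linear position i + (j-1)*n.
        if (a(i, j) /= flat(i + (j-1)*n)) same = .false.
     end do
  end do

  print *, 'a(i,j) matches flat(i + (j-1)*n): ', same
end program linearized_access
```

Both forms address the same element, but only the first one states directly that `i` and `j` vary independently.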
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85531

--- Comment #4 from Thomas Koenig ---
What is the best strategy on this?

I assume the Fortran front end could do a dependency analysis,
the existing code could be extended for this.

We could then either do the scalarization in the front end, or
annotate the generated loops in some way to indicate that it
is OK to merge them.

What would be preferred?
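The dependency analysis matters because merging two array statements into one loop is only valid if no iteration reads an element that a later iteration still has to write. A small sketch of a case where naive merging goes wrong (the arrays and the shifted section are invented for illustration and are not taken from the bug report):

```fortran
program fusion_dependence
  implicit none
  integer, parameter :: n = 5
  double precision :: a(n), b(n), c(n), b2(n), c2(n)
  integer :: i

  a = [(dble(i), i = 1, n)]

  ! Reference: two array statements, i.e. two separate loops.
  b = 0.0d0
  c = 0.0d0
  b = a + 1.0d0
  c(1:n-1) = b(2:n)

  ! Naive fusion of the two statements into a single loop.
  b2 = 0.0d0
  c2 = 0.0d0
  do i = 1, n
     b2(i) = a(i) + 1.0d0
     ! Reads b2(i+1), which the fused loop has not written yet.
     if (i < n) c2(i) = b2(i+1)
  end do

  print *, 'naive fusion preserves c: ', all(c == c2)
end program fusion_dependence
```

Here the second statement reads `b` one element ahead, so the fused loop sees stale values and `c2` differs from `c`. With identical subscripts in both statements no such dependence exists, and merging is safe.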
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85531

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-04-26
                 CC|                            |rguenth at gcc dot gnu.org
            Version|unknown                     |9.0
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener ---
Thanks. So -floop-nest-optimize (graphite) doesn't do anything here: it
detects the two loops just fine but simply doesn't do any transform.
Probably, similar to the interchange failure, we fail to provide it with
spatial constraints to minimize or so.

The loop distribution pass is presented with a CFG and IL that should
indeed be trivially analyzable (if we solve the dependence analysis issue).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85531

--- Comment #1 from Richard Biener ---
Can you provide a testcase that can be compiled?

--- Comment #2 from Thomas Koenig ---
Here it is. The internal writes are there just to confuse the optimizer.

module x
  implicit none
contains
  subroutine foo(a,b,c, n)
    integer, intent(in) :: n
    double precision, dimension(n), intent(in) :: a
    double precision, dimension(n), intent(out) :: b,c
    b = a
    c = a
  end subroutine foo
  subroutine bar(a,b,c,n)
    integer, intent(in) :: n
    double precision, dimension(n), intent(in) :: a
    double precision, dimension(n), intent(out) :: b,c
    integer :: i
    do concurrent (i=1:n)
      b(i) = a(i)
      c(i) = a(i)
    end do
  end subroutine bar
end module x

program main
  use x
  implicit none
  double precision, dimension(:), allocatable :: a, b, c
  integer, parameter :: n = 10**7
  double precision :: t1, t2
  character(len=80) :: line, line2
  integer :: i
  allocate (a(n), b(n), c(n))
  call random_number(a)
  line = '20'
  call cpu_time(t1)
  call foo(a,b,c,n)
  call cpu_time(t2)
  print *,t2-t1
  read (unit=line,fmt=*) i
  write (unit=line2, fmt=*) b(i),c(i)
  line = '20'
  call cpu_time(t1)
  call bar(a,b,c,n)
  call cpu_time(t2)
  print *,t2-t1
  read (unit=line,fmt=*) i
  write (unit=line2, fmt=*) b(i),c(i)
end program main