Hi Chung-Lin!
Thanks for your work here, which I'm beginning to look into (starting with
the prerequisite "[PATCH, OpenACC 2.7] Implement reductions for arrays and
structs", of course); it'll take me some time.
In testing without offloading, I noticed the following for
x86_64-pc-linux-gnu with '-m32':
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O0 (test for excess errors)
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O0 execution test
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O1 (test for excess errors)
+FAIL: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O1 execution test
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors)
+FAIL: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O2 execution test
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions (test for excess errors)
+FAIL: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions execution test
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O3 -g (test for excess errors)
+FAIL: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -O3 -g execution test
+PASS: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -Os (test for excess errors)
+FAIL: libgomp.oacc-fortran/reduction-13.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable -Os execution test
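With optimizations enabled (every tested level other than '-O0'), the
execution test runs into 'STOP 4', that is, the 'v2%r .ne. a2%r' check.
Compiling with '-Wextra' reports: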
[...]/libgomp.oacc-fortran/reduction-13.f90:40:6: Warning: Inequality
comparison for REAL(4) at (1) [-Wcompare-reals]
[...]/libgomp.oacc-fortran/reduction-13.f90:63:6: Warning: Inequality
comparison for REAL(4) at (1) [-Wcompare-reals]
[...]/libgomp.oacc-fortran/reduction-13.f90:64:6: Warning: Inequality
comparison for REAL(8) at (1) [-Wcompare-reals]
Do we need to allow for some epsilon (generally in such test cases), or
is there another problem?
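To illustrate what I mean by "allow for some epsilon", here is a minimal
sketch of a relative-tolerance comparison. Everything in it is my own
assumption and not part of the patch: the module name 'approx_mod', the
generic 'approx_eq', and the factor of 16 on 'epsilon' are hypothetical,
chosen only for illustration. It also doesn't settle whether rounding
differences are really the cause here; that's just one possible
explanation for the '-m32' failures at '-O1' and above.

```fortran
! Hypothetical helper, not part of the patch: compare two reals up to a
! small relative tolerance instead of using '.ne.' directly.
module approx_mod
  implicit none
  private
  public :: approx_eq

  interface approx_eq
    module procedure approx_eq_r4, approx_eq_r8
  end interface approx_eq

contains

  logical function approx_eq_r4 (x, y)
    real, intent(in) :: x, y
    ! Assumed tolerance: a few multiples of the machine epsilon.
    real, parameter :: tol = 16.0 * epsilon (1.0)
    approx_eq_r4 = abs (x - y) <= tol * max (abs (x), abs (y))
  end function approx_eq_r4

  logical function approx_eq_r8 (x, y)
    double precision, intent(in) :: x, y
    double precision, parameter :: tol = 16.0d0 * epsilon (1.0d0)
    approx_eq_r8 = abs (x - y) <= tol * max (abs (x), abs (y))
  end function approx_eq_r8

end module approx_mod

program approx_compare
  use approx_mod
  implicit none
  integer, parameter :: n = 10
  integer :: i
  real :: r_loop, r_ref
  double precision :: d_loop, d_ref

  ! Same multiplications as in the test case's '*' reduction.
  r_loop = 1
  d_loop = 1
  do i = 1, n
    r_loop = r_loop * 1.1
    d_loop = d_loop * 1.3
  end do

  ! Reference values computed a different way; the last bits may differ.
  r_ref = 1.1 ** n
  d_ref = dble (1.3) ** n

  if (.not. approx_eq (r_loop, r_ref)) stop 1
  if (.not. approx_eq (d_loop, d_ref)) stop 2
  print *, "ok"   ! reached only if both tolerance checks pass
end program approx_compare
```

In the test case, the 'STOP 2', 'STOP 4', and 'STOP 5' checks would then
read along the lines of 'if (.not. approx_eq (v2%r, a2%r)) STOP 4', which
would also avoid the '-Wcompare-reals' warnings.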
For reference:
On 2024-02-08T22:47:13+0800, Chung-Lin Tang wrote:
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-13.f90
> @@ -0,0 +1,66 @@
> +! { dg-do run }
> +
> +! record type reductions
> +
> +program reduction_13
> + implicit none
> +
> + type t1
> + integer :: i
> + real :: r
> + end type t1
> +
> + type t2
> + real :: r
> + integer :: i
> + double precision :: d
> + end type t2
> +
> + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
> + integer :: i
> + type(t1) :: v1, a1
> + type (t2) :: v2, a2
> +
> + v1%i = 0
> + v1%r = 0
> + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(v1)
> + !$acc loop reduction (+:v1)
> + do i = 1, n
> + v1%i = v1%i + 1
> + v1%r = v1%r + 2
> + end do
> + !$acc end parallel
> + a1%i = 0
> + a1%r = 0
> + do i = 1, n
> + a1%i = a1%i + 1
> + a1%r = a1%r + 2
> + end do
> + if (v1%i .ne. a1%i) STOP 1
> + if (v1%r .ne. a1%r) STOP 2
> +
> + v2%i = 1
> + v2%r = 1
> + v2%d = 1
> + !$acc parallel num_gangs(ng) num_workers(nw) vector_length(vl) copy(v2)
> + !$acc loop reduction (*:v2)
> + do i = 1, n
> + v2%i = v2%i * 2
> + v2%r = v2%r * 1.1
> + v2%d = v2%d * 1.3
> + end do
> + !$acc end parallel
> + a2%i = 1
> + a2%r = 1
> + a2%d = 1
> + do i = 1, n
> + a2%i = a2%i * 2
> + a2%r = a2%r * 1.1
> + a2%d = a2%d * 1.3
> + end do
> +
> + if (v2%i .ne. a2%i) STOP 3
> + if (v2%r .ne. a2%r) STOP 4
> + if (v2%d .ne. a2%d) STOP 5
> +
> +end program reduction_13
Regards
Thomas